Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons


Social and Behavioral Sciences

Classification


Articles 1 - 24 of 24

Full-Text Articles in Computer Sciences

Afnd: Arabic Fake News Dataset For The Detection And Classification Of Articles Credibility, Ashwaq Khalil, Moath Jarrah, Monther Aldwairi, Manar Jaradat Apr 2022


All Works

The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset enables the research community to use supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606912 public news articles that were scraped from 134 public news websites of 19 different Arab countries over a 6-month period using Python scripts. The Arabic fact-check platform, Misbar, is …


Per-Pixel Cloud Cover Classification Of Multispectral Landsat-8 Data, Salome E. Carrasco [*], Torrey J. Wagner, Brent T. Langhals Jun 2021


Faculty Publications

Random forest and neural network algorithms are applied to identify cloud cover using 10 of the wavelength bands available in Landsat 8 imagery. The methods classify each pixel into 4 different classes: clear, cloud shadow, light cloud, or cloud. The first method is based on a fully connected neural network with ten input neurons, two hidden layers of 8 and 10 neurons respectively, and a single-neuron output for each class. This type of model is considered with and without L2 regularization applied to the kernel weighting. The final model type is a random forest classifier created from an ensemble of …
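The network layout described in this abstract can be sketched as a forward pass. This is only an illustration of the stated layer sizes (10 inputs, hidden layers of 8 and 10 neurons, one output per class); the ReLU and softmax activations, the random untrained weights, the penalty strength, and all function names are assumptions, not details taken from the paper.

```python
import math
import random

# Layer sizes from the abstract: 10 band inputs, hidden layers of 8 and 10, 4 classes.
LAYER_SIZES = [10, 8, 10, 4]
CLASSES = ["clear", "cloud shadow", "light cloud", "cloud"]


def init_layers(sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def dense(inputs, weights, biases):
    """Affine transform: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    # Assumed hidden activation; the abstract does not name one.
    return [max(0.0, v) for v in values]


def softmax(values):
    # Assumed output activation, turning the 4 outputs into class probabilities.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(pixel, layers):
    """Run one pixel (10 band values) through the network."""
    activations = pixel
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


def l2_penalty(layers, strength=0.01):
    """L2 term on the kernel weights; `strength` is a hypothetical value."""
    return strength * sum(w * w for weights, _ in layers for row in weights for w in row)


layers = init_layers(LAYER_SIZES)
pixel = [0.1 * i for i in range(10)]  # made-up band values for one pixel
probs = predict_proba(pixel, layers)
label = CLASSES[probs.index(max(probs))]
```

With training, the L2 term would be added to the classification loss so that large weights are discouraged; here it is only computed to show what the regularized variant penalizes.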


How Does Land Cover Classification In Google Earth Engine Compare With Traditional Methods Of Land Cover Classification? What Are The Tradeoffs?, Carlos Sebastian Reyes May 2021


Open Access Theses & Dissertations

The project compares land cover classification using traditional tools such as ArcGIS with newer ones such as Google Earth Engine (GEE), and discusses any potential tradeoffs. Two studies were performed on both platforms. The first involved analyzing land cover change in the Middle Rio Grande (MRG) region of southern New Mexico, far west Texas, and northern Chihuahua, Mexico. The MRG study focused on urban and agricultural change in the region using two different classification methods. The second study focused on creating a post-hurricane damage assessment (PDA) with the goal of developing an automated method of estimating …


A Description Of A Humans Knowledge Using Artificial Intelligence, Dj Price Jan 2020


Mahurin Honors College Capstone Experience/Thesis Projects

There currently does not exist a way to easily view the relationships between a collection of written items (e.g. sports articles, diary entries, research papers). In recent years, novel machine learning methods have been developed which are very good at extracting semantic relationships from large numbers of documents. One of them is the (unsupervised) machine learning model Doc2Vec which constructs vectors for documents. The research project detailed in this paper uses this and other already existing algorithms to analyze the relationship between pieces of text. We set forth a broader ambition for this project before discussing the use and need …


An Analysis Of The Success Of Farmers Markets In Kentucky Using Logistic Regression And Support Vector Machines, Jeron Russell Jan 2020


Mahurin Honors College Capstone Experience/Thesis Projects

The purpose of this research is to look at the relationship that market-specific, economic, and demographic variables have with the success of farmers markets in Kentucky. It additionally seeks to build a tool for predicting farmers market success that could be used by policy makers to aid in decision-making processes concerning farmers markets. Logistic regression and Support Vector Machines (SVMs) are used on data acquired from the Kentucky Department of Agriculture and the American Community Survey in order to analyze the data in a traditional statistical approach as well as a machine learning approach. The results included an SVM model …


Disaster Damage Categorization Applying Satellite Images And Machine Learning Algorithm, Farinaz Sabz Ali Pour, Adrian Gheorghe Jan 2020


Engineering Management & Systems Engineering Faculty Publications

Spatial information has a significant role in disaster management. Land cover mapping can detect short- and long-term changes and monitor vulnerable habitats. It is an effective evaluation to include in a disaster management system to protect conservation areas. The critical visual and statistical information presented to decision-makers can help in mitigation or adaptation before a threshold is crossed. This paper aims to contribute to both academia and practice by offering a potential solution to enhance the effectiveness of disaster data sources. The key research question that the authors try to answer in this paper is how …


A Data Science Approach To Defining A Data Scientist, Andy Ho, An Nguyen, Jodi L. Pafford, Robert Slater Dec 2019


SMU Data Science Review

In this paper, we present a common definition and list of skills for a Data Scientist using online job postings. The overlap and ambiguity of various roles such as data scientist, data engineer, data analyst, software engineer, database administrator, and statistician motivate the problem. To arrive at a single Data Scientist definition, we collect over 8,000 job postings from Indeed.com for the six job titles. Each corpus contains text on job qualifications, skills, responsibilities, educational preferences, and requirements. Our data science methodology and analysis rendered the single definition of a data scientist: A data scientist codes, collaborates, and communicates – …


Virtual Wrap-Up Presentation: Digital Libraries, Intelligent Data Analytics, And Augmented Description, Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, Chulwoo Pack Nov 2019


CSE Conference and Workshop Papers

Includes framing, overview, and discussion of the explorations pursued as part of the Digital Libraries, Intelligent Data Analytics, and Augmented Description demonstration project, pursued by members of the Aida digital libraries research team at the University of Nebraska-Lincoln through a research services contract with the Library of Congress. This presentation covered: Aida research team and background for the demonstration project; broad outlines of “Digital Libraries, Intelligent Data Analytics, and Augmented Description”; what changed for us as a research team over the collaboration and why; deliverables of our work; thoughts toward “What next”; and deep-dives into the explorations. The machine learning …


Discovery Of Topological Constraints On Spatial Object Classes Using A Refined Topological Model, Ivan Majic, Elham Naghizade, Stephan Winter, Martin Tomko Jun 2019


Journal of Spatial Information Science

In a typical data collection process, a surveyed spatial object is annotated upon creation, and is classified based on its attributes. This annotation can also be guided by textual definitions of objects. However, interpretations of such definitions may differ among people, and thus result in subjective and inconsistent classification of objects. This problem becomes even more pronounced if the cultural and linguistic differences are considered. As a solution, this paper investigates the role of topology as the defining characteristic of a class of spatial objects. We propose a data mining approach based on frequent itemset mining to learn patterns in …


Classifying Challenging Behaviors In Autism Spectrum Disorder With Neural Document Embeddings, Abigail Atchison May 2019


Computational and Data Sciences (MS) Theses

The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder is paramount to enabling the success of behavioral therapy; an essential step in this process being the labeling of challenging behaviors demonstrated in therapy sessions. These manifestations differ across individuals and within individuals over time and thus, the appropriate classification of a challenging behavior when considering purely qualitative factors can be unclear. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so through the application of natural language processing techniques to behavioral descriptions extracted from the …


A Survey Of Attention Deficit Hyperactivity Disorder Identification Using Psychophysiological Data, S. De Silva, S. Dayarathna, G. Ariyarathne, D. Meedeniya, Sampath Jayarathna Jan 2019


Computer Science Faculty Publications

Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common neurological disorders among children. It affects different areas of the brain that enable certain functions. This may lead to a variety of impairments such as difficulties in paying attention or focusing, controlling impulsive behaviours and overreacting. The continuous symptoms may have a severe impact in the long term. This paper explores ADHD identification studies using eye movement data and functional Magnetic Resonance Imaging (fMRI). This study discusses different machine learning techniques and existing models, and analyses the existing literature. We have identified the current challenges and possible future directions …


Work-In-Progress Reports Submitted To The Library Of Congress As Part Of Digital Libraries, Intelligent Data Analytics, And Augmented Description, Chulwoo Pack, Yi Liu, Leen-Kiat Soh, Elizabeth Lorang Jan 2019


CSE Technical Reports

This document includes work-in-progress reports submitted to the Library of Congress as part of the Aida digital libraries research team's work on Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project. These work-in-progress reports provide a snapshot glimpse, as well as underlying rationale and decision-making, at various points in the development of the project and its machine learning explorations. Reports cover explorations on historic newspapers, minimally-processed manuscript collections, materials digitized from physical originals and those digitized from microform surrogates, and investigate challenges related to image segmentation and document zoning, classification, document image quality analysis, metadata generation, and more.


Detecting Fake News In Social Media Networks, Monther Aldwairi, Ali Alwahedi Jan 2018


All Works

© 2018 The Authors. Published by Elsevier Ltd. Fake news and hoaxes have existed since before the advent of the Internet. The widely accepted definition of Internet fake news is: fictitious articles deliberately fabricated to deceive readers. Social media and news outlets publish fake news to increase readership or as part of psychological warfare. In general, the goal is profiting through clickbait. Clickbait lures users and entices curiosity with flashy headlines or designs so that they click links, increasing advertising revenue. This exposition analyzes the prevalence of fake news in light of the advances in communication made possible by the emergence …


Methods For Real-Time Prediction Of The Mode Of Travel Using Smartphone-Based Gps And Accelerometer Data, Bryan D. Martin, Vittorio Addona, Julian Wolfson, Gediminas Adomavicius, Yingling Fan Sep 2017


Faculty Publications

We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) …


A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth Jan 2017


Kno.e.sis Publications

Understanding the role of differential gene expression in cancer etiology and cellular processes is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has recently been applied for clustering and exploring genomic data but not for classification and prediction. Here, we propose to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …


On Profiling Bots In Social Media, Richard J. Oentaryo, Arinto Murdopo, Philips K. Prasetyo, Ee Peng Lim Nov 2016


Research Collection School Of Computing and Information Systems

The popularity of social media platforms such as Twitter has led to the proliferation of automated bots, creating both opportunities and challenges in information dissemination, user engagement, and quality of service. Past work on profiling bots has focused largely on malicious bots, with the assumption that these bots should be removed. In this work, however, we find many bots that are benign, and propose a new, broader categorization of bots based on their behaviors. This includes broadcast, consumption, and spam bots. To facilitate comprehensive analyses of bots and how they compare to human accounts, we develop a systematic profiling …


A Summary Of Classification And Regression Tree With Application, Adem Meta Oct 2016


UBT International Conference

Classification and regression tree (CART) is a non-parametric methodology that was introduced first by Breiman and colleagues in 1984. CART is a technique which divides populations into meaningful subgroups that allows the identification of groups of interest. CART as a classification method constructs decision trees. Depending on information that is available about the dataset, a classification tree or a regression tree can be constructed. The first part of this paper describes the fundamental principles of tree construction, pruning procedure and different splitting algorithms. The second part of the paper answers the questions why or why not the CART method should …
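As a rough illustration of the tree-construction step mentioned in this abstract, the sketch below finds the best binary split on one numeric feature using Gini impurity, the usual CART criterion for classification trees. The function names and the toy data are hypothetical; a full CART implementation would apply this search recursively over all features and then prune.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    The threshold is None when no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (made up): one numeric feature and a binary class label.
ages = [22, 25, 31, 38, 45, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
```

Candidate thresholds are taken as midpoints between consecutive distinct values, which is a common convention rather than something the abstract specifies.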


Towards An Infodemiological Algorithm For Classification Of Filipino Health Tweets, Ma. Regina Justina E. Estuar, Kennedy E. Espina, Delfin Jay Sabido Ix, Raymond Josef Edward Lara, Vikki Car De Los Reyes Jan 2016


Department of Information Systems & Computer Science Faculty Publications

Finding innovative ICT solutions to enhance the Philippines’ health sector is part and parcel of the Philippine eHealth Strategic Framework and Plan 2020 program. This study sees the opportunity of using collected Twitter data to create a model that processes tweets to produce a dataset that may be relevant in the field of epidemiology and infodemiology. Through the collection of relevant tweets, future studies may make use of the output of this research for various purposes, such as the improvement of epidemiological systems of the Department of Health in support of the eHealth strategy. In this study, we …


Capstone Projects Mining System For Insights And Recommendations, Melvrivk Aik Chun Goh, Swapna Gottipati, Venky Shankararaman Dec 2015


Research Collection School Of Computing and Information Systems

In this paper, we present a classification-based system to discover knowledge and trends in higher education students’ projects. Essentially, educational capstone projects provide an opportunity for students to apply what they have learned and prepare themselves for industry needs. Therefore, mining such projects gives insights into students’ experiences as well as industry project requirements and trends. In particular, we mine capstone projects executed by Information Systems students to discover patterns and insights related to people, organization, domain, industry needs and time. We build a capstone projects mining system (CPMS) based on classification models that leverage text mining, natural …


Immunology Inspired Detection Of Data Theft From Autonomous Network Activity, Theodore O. Cochran Apr 2015


CCE Theses and Dissertations

The threat of data theft posed by self-propagating, remotely controlled bot malware is increasing. Cyber criminals are motivated to steal sensitive data, such as user names, passwords, account numbers, and credit card numbers, because these items can be parlayed into cash. For anonymity and economy of scale, bot networks have become the cyber criminal’s weapon of choice. In 2010 a single botnet included over one million compromised host computers, and one of the largest botnets in 2011 was specifically designed to harvest financial data from its victims. Unfortunately, current intrusion detection methods are unable to effectively detect data extraction techniques …


On Predicting User Affiliations Using Social Features In Online Social Networks, Minh Thap Nguyen Mar 2014


Dissertations and Theses Collection (Open Access)

User profiling, such as user affiliation prediction in online social networks, is a challenging task with many important applications in targeted marketing and personalized recommendation. The research task here is to predict user affiliation attributes that suggest user participation in different social groups.


Demographic Prediction Of Mobile User From Phone Usage, Shahram Mohrehkesh, Shuiwang Ji, Tamer Nadeem, Michele C. Weigle Jan 2012


Computer Science Faculty Publications

In this paper, we describe how we use the mobile phone usage of users to predict their demographic attributes. Using call logs, visited GSM cell information, visited Bluetooth devices, visited Wireless LAN devices, accelerometer data, and so on, we predict the gender, age, marital status, job, and number of people in the household of users. The accuracy of the developed classifiers ranges from 45% to 87%, depending upon the particular classification problem.


Vegetation Identification Based On Satellite Imagery, Vamsi K.R. Mantena, Ramu Pedada, Srinivas Jakkula, Yuzhong Shen, Jiang Li, Hamid R. Arabnia (Ed.) Jan 2008


Electrical & Computer Engineering Faculty Publications

Automatic vegetation identification plays an important role in many applications including remote sensing and high performance flight simulations. This paper presents a method to automatically identify vegetation based upon satellite imagery. First, we utilize the ISODATA algorithm to cluster pixels in the images where the number of clusters is determined by the algorithm. We then apply morphological operations to the clustered images to smooth the boundaries between clusters and to fill holes inside clusters. After that, we compute six features for each cluster. These six features then go through a feature selection algorithm and three of them are determined to …


Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao May 2001


Kno.e.sis Publications

Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Then, aggregating all JEPs of large supports can produce a more potent classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …
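To make the idea of a jumping emerging pattern concrete: a JEP is an itemset that occurs in one dataset but never in the other. The sketch below is a brute-force illustration on made-up data, not the paper's algorithm. It assumes that "most expressive" means the minimal JEPs and that a test instance is scored by summing the supports of the JEPs it contains; all names (`minimal_jeps`, `score`, `max_size`) are hypothetical.

```python
from itertools import combinations


def support(itemset, dataset):
    """Fraction of rows in `dataset` that contain every item of `itemset`."""
    return sum(1 for row in dataset if itemset <= row) / len(dataset)


def minimal_jeps(target, other, max_size=3):
    """Itemsets present in `target` and absent from `other`, minimal ones only.

    Brute-force enumeration up to `max_size` items; only suitable for toy data.
    """
    items = sorted(set().union(*target))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(jep <= candidate for jep in found):
                continue  # a smaller JEP already covers this candidate
            if support(candidate, target) > 0 and support(candidate, other) == 0:
                found.append(candidate)
    return found


def score(instance, jeps, target):
    """Assumed aggregation: sum the supports of the JEPs contained in the instance."""
    return sum(support(jep, target) for jep in jeps if jep <= instance)


# Toy two-class data (made up for illustration).
edible = [frozenset(r) for r in (
    {"white", "smooth", "odorless"},
    {"brown", "smooth", "odorless"},
    {"white", "scaly", "odorless"},
)]
poisonous = [frozenset(r) for r in (
    {"white", "smooth", "foul"},
    {"brown", "scaly", "foul"},
    {"brown", "scaly", "odorless"},
)]

edible_jeps = minimal_jeps(edible, poisonous)
poisonous_jeps = minimal_jeps(poisonous, edible)

instance = frozenset({"white", "scaly", "odorless"})
edible_score = score(instance, edible_jeps, edible)
poisonous_score = score(instance, poisonous_jeps, poisonous)
predicted = "edible" if edible_score > poisonous_score else "poisonous"
```

Exhaustive enumeration grows exponentially with the number of items, so this form is only useful for seeing what a JEP is; practical mining needs a more efficient search.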