Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 24 of 24
Full-Text Articles in Computer Sciences
Afnd: Arabic Fake News Dataset For The Detection And Classification Of Articles Credibility, Ashwaq Khalil, Moath Jarrah, Monther Aldwairi, Manar Jaradat
All Works
The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset enables the research community to use supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606912 public news articles that were scraped from 134 public news websites of 19 different Arab countries over a 6-month period using Python scripts. The Arabic fact-check platform, Misbar, is …
Per-Pixel Cloud Cover Classification Of Multispectral Landsat-8 Data, Salome E. Carrasco [*], Torrey J. Wagner, Brent T. Langhals
Faculty Publications
Random forest and neural network algorithms are applied to identify cloud cover using 10 of the wavelength bands available in Landsat 8 imagery. The methods classify each pixel into 4 different classes: clear, cloud shadow, light cloud, or cloud. The first method is based on a fully connected neural network with ten input neurons, two hidden layers of 8 and 10 neurons respectively, and a single-neuron output for each class. This type of model is considered with and without L2 regularization applied to the kernel weighting. The final model type is a random forest classifier created from an ensemble of …
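The network shape described in this abstract (ten inputs, hidden layers of 8 and 10 neurons, one output neuron per class) can be sketched in plain Python. This is only an illustration, not the authors' implementation: the ReLU and softmax activations, the random initial weights, the penalty strength `lam`, and the example pixel values are all assumptions, since the abstract does not state them.

```python
import math
import random

# Layer sizes taken from the abstract: 10 band inputs, hidden layers of 8 and 10, 4 class outputs.
LAYER_SIZES = [10, 8, 10, 4]
CLASSES = ["clear", "cloud shadow", "light cloud", "cloud"]


def init_weights(sizes, seed=0):
    """Create one (weights, biases) pair per layer with small random weights (assumed init)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers


def forward(layers, pixel):
    """Fully connected forward pass: ReLU on hidden layers, softmax on the output (both assumed)."""
    a = pixel
    for i, (w, b) in enumerate(layers):
        z = [sum(wi * ai for wi, ai in zip(row, a)) + bi for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            a = [max(0.0, v) for v in z]
        else:
            m = max(z)  # subtract the max for numerical stability
            e = [math.exp(v - m) for v in z]
            total = sum(e)
            a = [v / total for v in e]
    return a


def l2_penalty(layers, lam=0.01):
    """L2 regularization term on the weights only; lam is a hypothetical strength."""
    return lam * sum(wi * wi for w, _ in layers for row in w for wi in row)


layers = init_weights(LAYER_SIZES)
pixel = [0.1 * k for k in range(10)]  # made-up values standing in for 10 band readings
probs = forward(layers, pixel)
predicted = CLASSES[probs.index(max(probs))]
```

With untrained weights the predicted class is meaningless; the sketch only shows how a 10-value pixel maps to four class probabilities and where an L2 term would be added to the training loss.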
How Does Land Cover Classification In Google Earth Engine Compare With Traditional Methods Of Land Cover Classification? What Are The Tradeoffs?, Carlos Sebastian Reyes
Open Access Theses & Dissertations
The project focuses on comparing land cover classification of traditional methods such as ArcGIS with newer ones such as Google Earth Engine (GEE) as well as discussing any potential tradeoffs. Two studies were performed in both platforms, the first involved analyzing land cover change in the Middle Rio Grande (MRG) region of southern New Mexico, far west Texas, and northern Chihuahua, Mexico. The MRG study focused on urban and agricultural change in the region using two different classification methods. The second study focused on creating a post-hurricane damage assessment (PDA) with the goal of developing an automated method of estimating …
A Description Of A Humans Knowledge Using Artificial Intelligence, Dj Price
Mahurin Honors College Capstone Experience/Thesis Projects
There currently does not exist a way to easily view the relationships between a collection of written items (e.g. sports articles, diary entries, research papers). In recent years, novel machine learning methods have been developed which are very good at extracting semantic relationships from large numbers of documents. One of them is the (unsupervised) machine learning model Doc2Vec which constructs vectors for documents. The research project detailed in this paper uses this and other already existing algorithms to analyze the relationship between pieces of text. We set forth a broader ambition for this project before discussing the use and need …
An Analysis Of The Success Of Farmers Markets In Kentucky Using Logistic Regression And Support Vector Machines, Jeron Russell
Mahurin Honors College Capstone Experience/Thesis Projects
The purpose of this research is to look at the relationship that market-specific, economic, and demographic variables have with the success of farmers markets in Kentucky. It additionally seeks to build a tool for predicting farmers market success that could be used by policy makers to aid in decision-making processes concerning farmers markets. Logistic regression and Support Vector Machines (SVMs) are used on data acquired from the Kentucky Department of Agriculture and the American Community Survey in order to analyze the data in a traditional statistical approach as well as a machine learning approach. The results included an SVM model …
Disaster Damage Categorization Applying Satellite Images And Machine Learning Algorithm, Farinaz Sabz Ali Pour, Adrian Gheorghe
Engineering Management & Systems Engineering Faculty Publications
Special information has a significant role in disaster management. Land cover mapping can detect short- and long-term changes and monitor the vulnerable habitats. It is an effective evaluation to be included in the disaster management system to protect the conservation areas. The critical visual and statistical information presented to the decision-makers can help in mitigation or adaption before crossing a threshold. This paper aims to contribute in the academic and the practice aspects by offering a potential solution to enhance the disaster data source effectiveness. The key research question that the authors try to answer in this paper is how …
A Data Science Approach To Defining A Data Scientist, Andy Ho, An Nguyen, Jodi L. Pafford, Robert Slater
SMU Data Science Review
In this paper, we present a common definition and list of skills for a Data Scientist using online job postings. The overlap and ambiguity of various roles such as data scientist, data engineer, data analyst, software engineer, database administrator, and statistician motivate the problem. To arrive at a single Data Scientist definition, we collect over 8,000 job postings from Indeed.com for the six job titles. Each corpus contains text on job qualifications, skills, responsibilities, educational preferences, and requirements. Our data science methodology and analysis rendered the single definition of a data scientist: A data scientist codes, collaborates, and communicates – …
Virtual Wrap-Up Presentation: Digital Libraries, Intelligent Data Analytics, And Augmented Description, Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, Chulwoo Pack
CSE Conference and Workshop Papers
Includes framing, overview, and discussion of the explorations pursued as part of the Digital Libraries, Intelligent Data Analytics, and Augmented Description demonstration project, pursued by members of the Aida digital libraries research team at the University of Nebraska-Lincoln through a research services contract with the Library of Congress. This presentation covered: Aida research team and background for the demonstration project; broad outlines of “Digital Libraries, Intelligent Data Analytics, and Augmented Description”; what changed for us as a research team over the collaboration and why; deliverables of our work; thoughts toward “What next”; and deep-dives into the explorations. The machine learning …
Discovery Of Topological Constraints On Spatial Object Classes Using A Refined Topological Model, Ivan Majic, Elham Naghizade, Stephan Winter, Martin Tomko
Journal of Spatial Information Science
In a typical data collection process, a surveyed spatial object is annotated upon creation, and is classified based on its attributes. This annotation can also be guided by textual definitions of objects. However, interpretations of such definitions may differ among people, and thus result in subjective and inconsistent classification of objects. This problem becomes even more pronounced if the cultural and linguistic differences are considered. As a solution, this paper investigates the role of topology as the defining characteristic of a class of spatial objects. We propose a data mining approach based on frequent itemset mining to learn patterns in …
Classifying Challenging Behaviors In Autism Spectrum Disorder With Neural Document Embeddings, Abigail Atchison
Computational and Data Sciences (MS) Theses
The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder is paramount to enabling the success of behavioral therapy; an essential step in this process being the labeling of challenging behaviors demonstrated in therapy sessions. These manifestations differ across individuals and within individuals over time and thus, the appropriate classification of a challenging behavior when considering purely qualitative factors can be unclear. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so through the application of natural language processing techniques to behavioral descriptions extracted from the …
A Survey Of Attention Deficit Hyperactivity Disorder Identification Using Psychophysiological Data, S. De Silva, S. Dayarathna, G. Ariyarathne, D. Meedeniya, Sampath Jayarathna
Computer Science Faculty Publications
Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common neurological disorders among children; it affects areas of the brain that allow certain functions to be executed. This may lead to a variety of impairments such as difficulties in paying attention or focusing, controlling impulsive behaviours and overreacting. The continuous symptoms may have a severe impact in the long-term. This paper explores the ADHD identification studies using eye movement data and functional Magnetic Resonance Imaging (fMRI). This study discusses different machine learning techniques, existing models and analyses the existing literature. We have identified the current challenges and possible future directions …
Work-In-Progress Reports Submitted To The Library Of Congress As Part Of Digital Libraries, Intelligent Data Analytics, And Augmented Description, Chulwoo Pack, Yi Liu, Leen-Kiat Soh, Elizabeth Lorang
CSE Technical Reports
This document includes work-in-progress reports submitted to the Library of Congress as part of the Aida digital libraries research team's work on Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project. These work-in-progress reports provide a snapshot glimpse, as well as underlying rationale and decision-making, at various points in the development of the project and its machine learning explorations. Reports cover explorations on historic newspapers, minimally-processed manuscript collections, materials digitized from physical originals and those digitized from microform surrogates, and investigate challenges related to image segmentation and document zoning, classification, document image quality analysis, metadata generation, and more.
Detecting Fake News In Social Media Networks, Monther Aldwairi, Ali Alwahedi
All Works
© 2018 The Authors. Published by Elsevier Ltd. Fake news and hoaxes have been there since before the advent of the Internet. The widely accepted definition of Internet fake news is: 'fictitious articles deliberately fabricated to deceive readers'. Social media and news outlets publish fake news to increase readership or as part of psychological warfare. In general, the goal is profiting through clickbaits. Clickbaits lure users and entice curiosity with flashy headlines or designs to click links to increase advertisements revenues. This exposition analyzes the prevalence of fake news in light of the advances in communication made possible by the emergence …
Methods For Real-Time Prediction Of The Mode Of Travel Using Smartphone-Based Gps And Accelerometer Data, Bryan D. Martin, Vittorio Addona, Julian Wolfson, Gediminas Adomavicius, Yingling Fan
Faculty Publications
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) …
A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth
Kno.e.sis Publications
Understanding the role of differential gene expression in cancer etiology and cellular process is a complex problem that continues to pose a challenge due to sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA) to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has been recently applied for clustering and exploring genomic data but not for classification and prediction. Here, we proposed to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …
On Profiling Bots In Social Media, Richard J. Oentaryo, Arinto Murdopo, Philips K. Prasetyo, Ee Peng Lim
Research Collection School Of Computing and Information Systems
The popularity of social media platforms such as Twitter has led to the proliferation of automated bots, creating both opportunities and challenges in information dissemination, user engagements, and quality of services. Past works on profiling bots had been focused largely on malicious bots, with the assumption that these bots should be removed. In this work, however, we find many bots that are benign, and propose a new, broader categorization of bots based on their behaviors. This includes broadcast, consumption, and spam bots. To facilitate comprehensive analyses of bots and how they compare to human accounts, we develop a systematic profiling …
A Summary Of Classification And Regression Tree With Application, Adem Meta
UBT International Conference
Classification and regression tree (CART) is a non-parametric methodology that was introduced first by Breiman and colleagues in 1984. CART is a technique which divides populations into meaningful subgroups that allows the identification of groups of interest. CART as a classification method constructs decision trees. Depending on information that is available about the dataset, a classification tree or a regression tree can be constructed. The first part of this paper describes the fundamental principles of tree construction, pruning procedure and different splitting algorithms. The second part of the paper answers the questions why or why not the CART method should …
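As a rough illustration of the tree-construction step mentioned in this abstract, the sketch below searches for the single binary split that minimizes weighted Gini impurity, the criterion commonly associated with CART classification trees. It is not taken from the paper: the function names `gini` and `best_split` and the toy data are hypothetical, and pruning and regression trees are left out.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Try midpoints between consecutive distinct values of every feature and
    return (weighted_gini, feature_index, threshold) for the best binary split."""
    n = len(rows)
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [[2.0, 7.0], [3.0, 1.0], [6.0, 8.0], [7.0, 2.0]]
labels = ["a", "a", "b", "b"]
split = best_split(rows, labels)  # (0.0, 0, 4.5): split feature 0 at 4.5
```

A full tree would apply `best_split` recursively to the left and right subsets until a stopping rule is met, then prune.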
Towards An Infodemiological Algorithm For Classification Of Filipino Health Tweets, Ma. Regina Justina E. Estuar, Kennedy E. Espina, Delfin Jay Sabido Ix, Raymond Josef Edward Lara, Vikki Car De Los Reyes
Department of Information Systems & Computer Science Faculty Publications
Finding innovative ICT solutions to enhance the Philippines’ health sector is part and parcel of the Philippine eHealth Strategic Framework and Plan 2020 program. This study sees the opportunity of using collected Twitter data to create a model that processes tweets to produce a dataset that may be relevant in the field of epidemiology and infodemiology. Through the collection of relevant tweets, future studies may make use of the output of this research for various purposes, such as the improvement of epidemiological systems of the Department of Health in support of the eHealth strategy. In this study, we …
Capstone Projects Mining System For Insights And Recommendations, Melvrivk Aik Chun Goh, Swapna Gottipati, Venky Shankararaman
Research Collection School Of Computing and Information Systems
In this paper, we present a classification based system to discover knowledge and trends in higher education students’ projects. Essentially, the educational capstone projects provide an opportunity for students to apply what they have learned and prepare themselves for industry needs. Therefore mining such projects gives insights of students’ experiences as well as industry project requirements and trends. In particular, we mine capstone projects executed by Information Systems students to discover patterns and insights related to people, organization, domain, industry needs and time. We build a capstone projects mining system (CPMS) based on classification models that leverage text mining, natural …
Immunology Inspired Detection Of Data Theft From Autonomous Network Activity, Theodore O. Cochran
CCE Theses and Dissertations
The threat of data theft posed by self-propagating, remotely controlled bot malware is increasing. Cyber criminals are motivated to steal sensitive data, such as user names, passwords, account numbers, and credit card numbers, because these items can be parlayed into cash. For anonymity and economy of scale, bot networks have become the cyber criminal’s weapon of choice. In 2010 a single botnet included over one million compromised host computers, and one of the largest botnets in 2011 was specifically designed to harvest financial data from its victims. Unfortunately, current intrusion detection methods are unable to effectively detect data extraction techniques …
On Predicting User Affiliations Using Social Features In Online Social Networks, Minh Thap Nguyen
Dissertations and Theses Collection (Open Access)
User profiling, such as user affiliation prediction in online social networks, is a challenging task with many important applications in targeted marketing and personalized recommendation. The research task here is to predict some user affiliation attributes that suggest user participation in different social groups.
Demographic Prediction Of Mobile User From Phone Usage, Shahram Mohrehkesh, Shuiwang Ji, Tamer Nadeem, Michele C. Weigle
Computer Science Faculty Publications
In this paper, we describe how we use the mobile phone usage of users to predict their demographic attributes. Using call log, visited GSM cells information, visited Bluetooth devices, visited Wireless LAN devices, accelerometer data, and so on, we predict the gender, age, marital status, job and number of people in household of users. The accuracy of developed classifiers for these classification problems ranges from 45-87% depending upon the particular classification problem.
Vegetation Identification Based On Satellite Imagery, Vamsi K.R. Mantena, Ramu Pedada, Srinivas Jakkula, Yuzhong Shen, Jiang Li, Hamid R. Arabnia (Ed.)
Electrical & Computer Engineering Faculty Publications
Automatic vegetation identification plays an important role in many applications including remote sensing and high performance flight simulations. This paper presents a method to automatically identify vegetation based upon satellite imagery. First, we utilize the ISODATA algorithm to cluster pixels in the images where the number of clusters is determined by the algorithm. We then apply morphological operations to the clustered images to smooth the boundaries between clusters and to fill holes inside clusters. After that, we compute six features for each cluster. These six features then go through a feature selection algorithm and three of them are determined to …
Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao
Kno.e.sis Publications
Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Then, aggregating all JEPs of large supports can produce a more potent classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …
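The idea in this abstract can be sketched with a brute-force toy version. A jumping emerging pattern (JEP) for a class is taken here to be an itemset that occurs in that class's training data but never in the other data, and the "most expressive" JEPs are taken to be the minimal ones. This is not the authors' algorithm: the exhaustive enumeration, the `max_size` cap, the pooling of all other classes into one background set, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def minimal_jeps(target, other, max_size=3):
    """Itemsets with non-zero support in `target` and zero support in `other`,
    keeping only minimal ones (no proper subset is itself a JEP)."""
    items = sorted(set().union(*target))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(jep <= candidate for jep in found):
                continue  # a smaller JEP is contained in it, so it is not minimal
            if support(candidate, target) > 0 and support(candidate, other) == 0:
                found.append(candidate)
    return found


def classify(instance, data_by_class, max_size=3):
    """Score each class by summing the supports of its minimal JEPs that the
    instance contains, then return the highest-scoring class and all scores."""
    scores = {}
    for label, target in data_by_class.items():
        other = [t for other_label, ts in data_by_class.items()
                 if other_label != label for t in ts]
        jeps = minimal_jeps(target, other, max_size)
        scores[label] = sum(support(j, target) for j in jeps if j <= instance)
    return max(scores, key=scores.get), scores


# Toy transactions (made up): item "a" appears only in the first class.
data = {
    "edible": [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}],
    "poisonous": [{"b", "c"}, {"c", "d"}, {"b", "d"}],
}
label, scores = classify({"a", "d"}, data)  # label == "edible"
```

Exhaustive enumeration is exponential in the number of items, so this is only practical for tiny examples.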