Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 7 of 7

Full-Text Articles in Computer Sciences

Comparative Adjudication Of Noisy And Subjective Data Annotation Disagreements For Deep Learning, Scott David Williams Jan 2023

Browse all Theses and Dissertations

Obtaining accurate inferences from deep neural networks is difficult when models are trained on instances with conflicting labels. Algorithmic recognition of online hate speech illustrates this. No human annotator is perfectly reliable, so multiple annotators evaluate and label online posts in a corpus. Labeling scheme limitations, differences in annotators' beliefs, and limits to annotators' honesty and carefulness cause some labels to disagree. Consequently, decisive and accurate inferences become less likely. Some practical applications such as social research can tolerate some indecisiveness. However, an online platform using an indecisive classifier for automated content moderation could create more problems than it solves. …


Encryption And Compression Classification Of Internet Of Things Traffic, Mariam Najdat M Saleh Jan 2023

Browse all Theses and Dissertations

The Internet of Things (IoT) is used in many fields that generate sensitive data, such as healthcare and surveillance. Increased reliance on IoT has raised serious information security concerns. This dissertation presents three systems for analyzing and classifying IoT traffic using Deep Learning (DL) models, and a large dataset is built for system training and evaluation. The first system studies the effect of combining raw data and engineered features to optimize the classification of encrypted and compressed IoT traffic using Engineered Features Classification (EFC), Raw Data Classification (RDC), and combined Raw Data and Engineered Features Classification (RDEFC) approaches. Our results demonstrate …


A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth Jan 2017

Kno.e.sis Publications

Understanding the role of differential gene expression in cancer etiology and cellular processes is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has recently been applied for clustering and exploring genomic data but not for classification and prediction. Here, we propose to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …


Contrast Pattern Aided Regression And Classification, Vahid Taslimitehrani Jan 2015

Browse all Theses and Dissertations

Regression and classification techniques play an essential role in many data mining tasks and have broad applications. However, most state-of-the-art regression and classification techniques are unable to adequately model the interactions among predictor variables in highly heterogeneous datasets. New techniques that can effectively model such complex and heterogeneous structures are needed to significantly improve prediction accuracy. In this dissertation, we propose a novel type of accurate and interpretable regression and classification models, named Pattern Aided Regression (PXR) and Pattern Aided Classification (PXC), respectively. Both PXR and PXC rely on identifying regions in the data space where …


Distributed OWL EL Reasoning: The Story So Far, Raghava Mutharaju, Pascal Hitzler, Prabhaker Mateti Oct 2014

Computer Science and Engineering Faculty Publications

Automated generation of axioms from streaming data, such as traffic and text, can result in very large ontologies that single-machine reasoners cannot handle. Reasoning with large ontologies requires distributed solutions. Scalable reasoning techniques for RDFS, OWL Horst and OWL 2 RL now exist. For OWL 2 EL, several distributed reasoning approaches have been tried, but all are perceived to be inefficient. We analyze this perception. We analyze completion-rule-based distributed approaches using different characteristics, such as dependency among the rules, implementation optimizations, and how axioms and rules are distributed. We also present a distributed queue approach for the classification …


Data Mining And Analysis On Multiple Time Series Object Data, Chunyu Jiang Jan 2007

Browse all Theses and Dissertations

A huge amount of data is available in our society, and the need to turn such data into useful information and knowledge is urgent. Data mining is an important field addressing that need, and significant progress has been achieved in the last decade. In several important application areas, data arises in the format of Multiple Time Series Object (MTSO) data, where each data object is an array of time series over a large set of features and each has an associated class or state. Very little research has been conducted on this kind of data. Examples include computational toxicology, where each …


Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao May 2001

Kno.e.sis Publications

Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Aggregating all JEPs with large support can then yield greater classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …
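
The idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a JEP is an itemset with non-zero support in one class and zero support in the other, that the "most expressive" JEPs are the minimal ones, and that the collective impact for a class is the sum of the supports of its JEPs contained in the test instance. The brute-force enumeration, the toy data, and the names `support`, `minimal_jeps`, and `classify` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def support(itemset, dataset):
    """Fraction of instances in `dataset` that contain every item of `itemset`."""
    return sum(itemset <= inst for inst in dataset) / len(dataset)


def minimal_jeps(home, other, max_len=3):
    """Brute-force search for minimal itemsets that occur in `home` but never in `other`.

    Returns a dict mapping each minimal JEP to its support in `home`.
    This enumeration is an illustrative assumption, not an efficient miner.
    """
    found = []
    for size in range(1, max_len + 1):
        candidates = {
            frozenset(c) for inst in home for c in combinations(sorted(inst), size)
        }
        for cand in candidates:
            if any(j <= cand for j in found):
                continue  # contains a smaller JEP, so it is not minimal
            if support(cand, other) == 0:
                found.append(cand)
    return {j: support(j, home) for j in found}


def classify(instance, jeps_by_class):
    """Score each class by summing the supports of its JEPs contained in `instance`."""
    scores = {
        label: sum(s for j, s in jeps.items() if j <= instance)
        for label, jeps in jeps_by_class.items()
    }
    return max(scores, key=scores.get), scores


# Toy two-class data (hypothetical).
pos = [frozenset("ab"), frozenset("ac"), frozenset("abc")]
neg = [frozenset("bc"), frozenset("cd"), frozenset("bd")]

jeps = {"pos": minimal_jeps(pos, neg), "neg": minimal_jeps(neg, pos)}
label, scores = classify(frozenset("ab"), jeps)
print(label, scores)
```

On this toy data the only minimal JEP for `pos` is `{a}` and the only one for `neg` is `{d}`, so an instance containing `a` but not `d` is assigned to `pos`.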