Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

2017

Classification

Articles 1 - 23 of 23

Full-Text Articles in Computer Sciences

Automated Species Classification Methods For Passive Acoustic Monitoring Of Beaked Whales, John Lebien Dec 2017

University of New Orleans Theses and Dissertations

The Littoral Acoustic Demonstration Center has collected passive acoustic monitoring data in the northern Gulf of Mexico since 2001. Recordings were made in 2007 near the Deepwater Horizon oil spill that provide a baseline for an extensive study of regional marine mammal populations in response to the disaster. Animal density estimates can be derived from detections of echolocation signals in the acoustic data. Beaked whales are of particular interest as they remain one of the least understood groups of marine mammals, and relatively few abundance estimates exist. Efficient methods for classifying detected echolocation transients are essential for mining long-term passive …


Process Models Discovery And Traces Classification: A Fuzzy-BPMN Mining Approach, Kingsley Okoye Dr, Usman Naeem Dr, Syed Islam Dr, Abdel-Rahman H. Tawil Dr, Elyes Lamine Dr Dec 2017

Journal of International Technology and Information Management

The discovery of useful or worthwhile process models must be performed with due regard to the transformation that needs to be achieved. The blend of data representation (i.e., data mining) and process modelling methods, often allied to the field of Process Mining (PM), has proven effective in the process analysis of the event logs readily available in many organisations' information systems. Moreover, Process Discovery has lately been seen as the most important and most visible intellectual challenge related to process mining. The method involves automatic construction of process models from event logs about any domain …


Automatic Loop-Invariant Generation And Refinement Through Selective Sampling, Jiaying Li, Jun Sun, Li Li, Quang Loc Le, Shang-Wei Lin Nov 2017

Research Collection School Of Computing and Information Systems

Automatic loop-invariant generation is important in program analysis and verification. In this paper, we propose to generate loop-invariants automatically through learning and verification. Given a Hoare triple of a program containing a loop, we start with randomly testing the program, collect program states at run-time and categorize them based on whether they satisfy the invariant to be discovered. Next, classification techniques are employed to generate a candidate loop-invariant automatically. Afterwards, we refine the candidate through selective sampling so as to overcome the lack of sufficient test cases. Only after a candidate invariant cannot be improved further through selective sampling, we …


Scalable Online Kernel Learning, Jing Lu Nov 2017

Dissertations and Theses Collection (Open Access)

One critical deficiency of traditional online kernel learning methods is their increasing and unbounded number of support vectors (SV’s), making them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning have attempted to overcome this shortcoming by bounding the number of SV’s. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.
First of all, although existing algorithms attempt to bound the number of SV’s at each iteration, most of them fail to bound the number of SV’s for the final averaged classifier, which is commonly used for online-to-batch conversion. To solve this problem, we propose …


Methods For Real-Time Prediction Of The Mode Of Travel Using Smartphone-Based GPS And Accelerometer Data, Bryan D. Martin, Vittorio Addona, Julian Wolfson, Gediminas Adomavicius, Yingling Fan Sep 2017

Faculty Publications

We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) …


Effect Of Label Noise On The Machine-Learned Classification Of Earthquake Damage, Jared Frank, Umaa Rebbapragada, James Bialas, Thomas Oommen, Timothy C. Havens Aug 2017

Michigan Tech Publications

Automated classification of earthquake damage in remotely-sensed imagery using machine learning techniques depends on training data, or data examples that are labeled correctly by a human expert as containing damage or not. Mislabeled training data are a major source of classifier error due to the use of imprecise digital labeling tools and crowdsourced volunteers who are not adequately trained on or invested in the task. The spatial nature of remote sensing classification leads to the consistent mislabeling of classes that occur in close proximity to rubble, which is a major byproduct of earthquake damage in urban areas. In this study, …


Unsupervised Biomedical Named Entity Recognition, Omid Ghiasvand Aug 2017

Theses and Dissertations

Named entity recognition (NER) from text is an important task for several applications, including in the biomedical domain. Supervised machine-learning-based systems have been the most successful on the NER task; however, they require correct annotations in large quantities for training. Annotating text manually is very labor intensive and also needs domain expertise. The purpose of this research is to reduce human annotation effort and to decrease the cost of annotation for building NER systems in the biomedical domain. The method developed in this work is based on leveraging the availability of resources like UMLS (Unified Medical Language System), that contain …


Constructing Interactive Visual Classification, Clustering And Dimension Reduction Models For N-D Data, Boris Kovalerchuk, Dmytro Dovhalets Jul 2017

Computer Science Faculty Scholarship

The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual discovery of n-D relations for supervised learning. The method includes automatic, interactive, and combined algorithms for discovering linear relations, dimension reduction, and generalization for non-linear relations. This method is a special category of reversible General Line Coordinates (GLC). It produces graphs in 2-D that represent …


Knowledge Extraction From Metacognitive Reading Strategies Data Using Induction Trees, Christopher Taylor, Arun D. Kulkarni, Kouider Mokhtari Jul 2017

Arun Kulkarni

The assessment of students’ metacognitive knowledge and skills about reading is critical in determining their ability to read academic texts and do so with comprehension. In this paper, we used induction trees to extract metacognitive knowledge about reading from a reading strategies dataset obtained from a group of 1636 undergraduate college students. Using a C4.5 algorithm, we constructed decision trees, which helped us classify participants into three groups based on their metacognitive strategy awareness levels consisting of global, problem-solving and support reading strategies. We extracted rules from these decision trees, and in order to evaluate accuracy of the extracted rules, …
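As a rough illustration of how a C4.5-style tree chooses its splits, the sketch below computes the gain ratio (information gain divided by split information) for a categorical attribute. The attribute names, rows, and labels are hypothetical toy values and are not taken from the study's dataset; the functions `entropy` and `gain_ratio` are illustrative helpers, not part of any library.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting `rows` on the categorical attribute `attr`."""
    total = len(labels)
    # Group the class labels by the value each row takes for `attr`.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two made-up attributes and two awareness levels.
rows = [
    {"rereads": "yes", "takes_notes": "yes"},
    {"rereads": "yes", "takes_notes": "no"},
    {"rereads": "no", "takes_notes": "yes"},
    {"rereads": "no", "takes_notes": "no"},
]
labels = ["high", "high", "low", "low"]

ratio_rereads = gain_ratio(rows, labels, "rereads")
ratio_notes = gain_ratio(rows, labels, "takes_notes")
```

In this toy data, `rereads` separates the two classes perfectly while `takes_notes` carries no information, so a tree builder using this criterion would split on `rereads` first.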


Time-Series Link Prediction Using Support Vector Machines, Proceso L. Fernandez Jr, Jan Miles Co Jun 2017

Department of Information Systems & Computer Science Faculty Publications

The prominence of social networks motivates developments in network analysis, such as link prediction, which deals with predicting the existence or emergence of links on a given network. The Vector Auto Regression (VAR) technique has been shown to be one of the best for time-series based link prediction. One VAR technique implementation uses an unweighted adjacency matrix and five additional matrices based on the similarity metrics of Common Neighbor, Adamic-Adar, Jaccard’s Coefficient, Preferential Attachment and Resource Allocation Index. In our previous work, we proposed the use of Support Vector Machines (SVM) for such a prediction task, and, using the same …
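The five neighbourhood-based similarity metrics named in the abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration on a hypothetical five-node graph, assuming an undirected, unweighted network and two distinct nodes; it is not the authors' implementation and does not cover the VAR or SVM steps.

```python
import math


def neighbors(edges):
    """Build an adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_scores(adj, x, y):
    """Return the five similarity scores for two distinct nodes x and y."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    common = nx & ny
    union = nx | ny
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbour of two distinct nodes has degree >= 2, so log > 0.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
        "preferential_attachment": len(nx) * len(ny),
    }


# Hypothetical toy graph: a and b share the neighbours c and d.
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("d", "e")]
adj = neighbors(edges)
scores = similarity_scores(adj, "a", "b")
```

Evaluating these scores for every node pair yields one matrix per metric, which is the kind of per-metric matrix the abstract describes alongside the adjacency matrix.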


On The Role Of Genetic Algorithms In The Pattern Recognition Task Of Classification, Isaac Ben Sherman May 2017

Masters Theses

In this dissertation we ask, formulate an apparatus for answering, and answer the following three questions: Where do Genetic Algorithms fit in the greater scheme of pattern recognition? Given primitive mechanics, can Genetic Algorithms match or exceed the performance of theoretically-based methods? Can we build a generic universal Genetic Algorithm for classification? To answer these questions, we develop a genetic algorithm which optimizes MATLAB classifiers and a variable length genetic algorithm which does classification based entirely on boolean logic. We test these algorithms on disparate datasets rooted in cellular biology, music theory, and medicine. We then get results from these …


Investigation Of New Learning Methods For Visual Recognition, Qingfeng Liu Apr 2017

Dissertations

Visual recognition is one of the most difficult and prevailing problems in computer vision and pattern recognition due to the challenges in understanding the semantics and contents of digital images. Two major components of a visual recognition system are discriminatory feature representation and efficient and accurate pattern classification. This dissertation therefore focuses on developing new learning methods for visual recognition.

Based on the conventional sparse representation, which shows its robustness for visual recognition problems, a series of new methods is proposed. Specifically, first, a new locally linear K nearest neighbor method, or LLK method, is presented. The LLK method derives …


Crowdsensing And Analyzing Micro-Event Tweets For Public Transportation Insights, Thoong Hoang, Pei Hua (Xu Peihua) Cher, Philips Kokoh Prasetyo, Ee-Peng Lim Feb 2017

Research Collection School Of Computing and Information Systems

An efficient and commuter-friendly public transportation system is a critical part of a thriving and sustainable city. As cities experience fast-growing resident populations, their public transportation systems will have to cope with more demands for improvements. In this paper, we propose a crowdsensing and analysis framework to gather and analyze real-time commuter feedback from Twitter. We perform a series of text mining tasks: identifying those feedback comments capturing bus-related micro-events; extracting relevant entities; and predicting event and sentiment labels. We conduct a series of experiments involving more than 14K labeled tweets. The experiments show that incorporating domain knowledge …


A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth Jan 2017

Kno.e.sis Publications

Understanding the role of differential gene expression in cancer etiology and cellular processes is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has recently been applied for clustering and exploring genomic data but not for classification and prediction. Here, we propose to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …


An Introduction To The Theory And Applications Of Bayesian Networks, Anant Jaitha Jan 2017

CMC Senior Theses

Bayesian networks are a means to study data. A Bayesian network gives structure to data by creating a graphical system to model the data. It explores the variables in the problem space and develops probability distributions over those variables. It then conducts statistical inference over those probability distributions to draw meaning from them. Bayesian networks are a good means of exploring a large set of data efficiently to make inferences. A number of real-world applications already exist and are being actively researched. This paper discusses the theory and applications of …
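To make the idea of inference over a graphical model concrete, here is a minimal sketch of exact inference by enumeration in a three-node network (Rain → WetGrass ← Sprinkler). The structure and every probability below are hypothetical teaching values, not figures from the thesis.

```python
from itertools import product

# Hypothetical prior and conditional probability tables.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass = True | Rain, Sprinkler), keyed by (rain, sprinkler).
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability, factorised along the edges of the network."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), summing out the sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence


posterior = posterior_rain_given_wet()
```

With these made-up numbers, observing wet grass raises the probability of rain from the prior of 0.2 to roughly 0.74, which is the kind of update the abstract refers to as statistical inference.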


k-NN-Based Classification Of Sleep Apnea Types Using ECG, Oğuz Han Timuş, Emine Doğru Bolat Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Obstructive sleep apnea syndrome (OSAS) is a common sleep disorder that yields cardiovascular diseases, excessive daytime sleepiness, and poor quality of life if not treated. Classification of OSAS from electrocardiograms (ECGs) is a noninvasive method and much more affordable than traditional methods. This study proposes a pattern recognition system for automated apnea diagnosis based on heart rate variability (HRV) and ECG-derived respiratory signals. The k-nearest neighbor (k-NN) classifier has been used to develop the models for classifying the sleep apnea types. For comparison purposes, classification models based on multilayer perceptron, support vector machines, and C4.5 decision tree (C4.5 DT) have …
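The k-NN classifier mentioned above assigns a sample to the majority class among its k closest training samples. The sketch below shows that rule with Euclidean distance; the two-dimensional feature vectors and the labels `type_a` and `type_b` are hypothetical placeholders, not HRV or ECG-derived features from the study.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training samples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (feature vector, class label).
train = [
    ((0.9, 0.2), "type_a"),
    ((1.0, 0.3), "type_a"),
    ((0.8, 0.1), "type_a"),
    ((0.3, 0.9), "type_b"),
    ((0.2, 0.8), "type_b"),
    ((0.4, 1.0), "type_b"),
]

prediction_b = knn_predict(train, (0.35, 0.85), k=3)
prediction_a = knn_predict(train, (0.95, 0.25), k=3)
```

In practice the choice of k, the distance measure, and feature scaling all affect accuracy; this sketch fixes k = 3 and uses unscaled features only for brevity.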


Multiclass Semantic Segmentation Of Faces Using CRFs, Khalil Khan, Nasir Ahmad, Khalil Ullah, Irfanud Din Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Multiclass semantic image segmentation is widely used in a variety of computer vision tasks, such as object segmentation and complex scene understanding. As it decomposes an image into semantically relevant regions, it can be applied in segmentation of face images. In this paper, an algorithm based on multiclass semantic segmentation of faces is proposed using conditional random fields. In the proposed model, each node corresponds to a superpixel, while the neighboring superpixels are connected to nodes through edges. Unlike previous approaches, which rely on three or four classes, the label set is extended here to six classes, i.e. hair, eyes, …


Classification Of EEG Signals Of Familiar And Unfamiliar Face Stimuli Exploiting Most Discriminative Channels, Abdurrahman Özbeyaz, Sami Arica Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

The objective of the study is to classify electroencephalogram signals recorded in a familiar and unfamiliar face recognition experiment. Frontal views of familiar and unfamiliar face images were shown to 10 volunteers in different sessions. In contrast to previous studies, no marker button was used during the experiment. Participants had to decide whether the displayed face was familiar or unfamiliar at the instant of stimulus presentation. The signals were analyzed in the preprocessing, channel selection, feature extraction, and classification stages. The novel two-feature extraction and eight-channel selection methods were applied to the analyses. Sixteen classification results were compared and the …


DTreeSim: A New Approach To Compute Decision Tree Similarity Using Re-Mining, Gözde Bakirli, Derya Birant Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

A number of recent studies have used a decision tree approach as a data mining technique; some of them needed to evaluate the similarity of decision trees to compare the knowledge reflected in different trees or datasets. There have been multiple perspectives and multiple calculation techniques to measure the similarity of two decision trees, such as using a simple formula or an entropy measure. The main objective of this study is to compute the similarity of decision trees using data mining techniques. This study proposes DTreeSim, a new approach that applies multiple data mining techniques (classification, sequential pattern mining, and …


Abusive Text Detection Using Neural Networks, Hao Chen, Susan Mckeever, Sarah Jane Delany Jan 2017

Articles

Neural network models have become increasingly popular for text classification in recent years. In particular, the emergence of word embeddings within deep learning architectures has recently attracted a high level of attention amongst researchers.


The Effect Of Code Obfuscation On Authorship Attribution Of Binary Computer Files, Steven Hendrikse Jan 2017

CCE Theses and Dissertations

In many forensic investigations, questions linger regarding the identity of the authors of the software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator …


Enhanced Breast Cancer Classification With Automatic Thresholding Using Support Vector Machine And Harris Corner Detection, Mohammad Taheri Jan 2017

Electronic Theses and Dissertations

Image classification and extraction of tumor characteristics are powerful tools in medical science. In breast cancer treatment, classification methods can be used to classify input images as benign or malignant for better diagnosis and earlier detection of breast tumors. However, the classification process can be challenging because of noise in the images and the complicated structure of the images. Manual classification of the images is time-consuming and can be done only by medical experts. Hence, an automated medical image classification tool is useful and necessary. In addition, …


Denial-Of-Service Attack Modelling And Detection For HTTP/2 Services, Erwin Adi Jan 2017

Theses: Doctorates and Masters

Businesses and society alike have been heavily dependent on Internet-based services, albeit with experiences of constant and annoying disruptions caused by the adversary class. A malicious attack that can prevent establishment of Internet connections to web servers, initiated from legitimate client machines, is termed a Denial of Service (DoS) attack; the volume and intensity of such attacks are rapidly growing thanks to readily available attack tools and ever-increasing network bandwidths. A majority of contemporary web servers are built on the HTTP/1.1 communication protocol. As a consequence, all literature found on DoS attack modelling and appertaining detection techniques addresses only …