Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Pattern Discovery In DNA Using Stochastic Automata, Shweta Shweta Dec 2015

Master's Projects

We consider the problem of identifying similarities between the DNA of different species. To do this, we infer a stochastic finite automaton from training data and compare it with test data. The training and test data consist of DNA sequences from different species. Our method first identifies sentences in DNA. To identify sentences, we read a DNA sequence one character at a time; three characters form a codon, and codons code for the amino acids that make up proteins (also known as amino acid chains). Each amino acid in a protein belongs to a group. In total we have 5 groups: polar, non-polar, acidic, basic, and stop codons. …
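The codon-reading step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the standard genetic code and a common textbook assignment of amino acids to the polar, non-polar, acidic, and basic groups, and the names `CODON_TO_AA`, `AA_GROUP`, and `dna_to_groups` are hypothetical. The project's own grouping may differ.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Assumed grouping (a common textbook classification, not taken from the source).
AA_GROUP = {}
for group, letters in {
    "non-polar": "GAVLIMFWP",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
    "stop": "*",
}.items():
    for letter in letters:
        AA_GROUP[letter] = group


def dna_to_groups(sequence):
    """Read a DNA string three characters at a time and return group labels."""
    sequence = sequence.upper()
    groups = []
    # An incomplete trailing codon (1 or 2 leftover bases) is ignored.
    for i in range(0, len(sequence) - len(sequence) % 3, 3):
        amino_acid = CODON_TO_AA.get(sequence[i:i + 3])
        groups.append(AA_GROUP.get(amino_acid, "unknown"))
    return groups
```

For example, `dna_to_groups("ATGGATAAATAA")` reads the codons ATG, GAT, AAA, and TAA and labels them non-polar, acidic, basic, and stop.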


Clustering Versus SVM For Malware Detection, Usha Narra May 2015

Master's Projects

Previous work has shown that we can effectively cluster certain classes of malware into their respective families. In this research, we extend this previous work to the problem of developing an automated malware detection system. We first compute clusters for a collection of malware families. Then we analyze the effectiveness of classifying new samples based on these existing clusters. We compare results obtained using k-means and Expectation Maximization (EM) clustering to those obtained using Support Vector Machines (SVM). Using clustering, we are able to detect some malware families with an accuracy comparable to that of SVMs. One …
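The idea of computing clusters first and then classifying a new sample against them can be sketched with plain k-means. This is only an illustration under stated assumptions: the feature vectors are made-up 2-D points, the initial centroids are supplied by hand, and the function names `nearest` and `kmeans` are hypothetical. The abstract does not specify the features or the decision rule actually used.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]


def kmeans(points, initial_centroids, iterations=20):
    """Plain k-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)[0]].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical training vectors for two families, then one new sample.
training = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids = kmeans(training, [(0, 0), (10, 10)])
family, distance = nearest((1, 1), centroids)
```

A new sample is assigned to the family of its nearest centroid; a distance threshold (not shown) would be one way to reject samples that fit no existing cluster.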


Optimization Of Scheduling And Dispatching Cars On Demand, Vu Tran May 2015

Master's Projects

Taxicabs are the most common type of on-demand transportation service in cities because their dispatching systems offer shorter wait times. However, wait time and travel time remain considerable when there are multiple passengers and destinations. Several companies have recently implemented a real-time ridesharing model, which is expected to reduce the cost of a ride when passengers are willing to share it with others. This model does not address wait time and travel time when there are multiple passengers and destinations. This paper investigates how ridesharing can be improved by using the …


A Comparison Of Clustering Techniques For Malware Analysis, Swathi Pai May 2015

Master's Projects

In this research, we apply clustering techniques to the malware detection problem. Our goal is to classify malware as part of a fully automated detection strategy. We compute clusters using the well-known k-means and EM clustering algorithms, with scores obtained from Hidden Markov Models (HMM). Previous work in this area used HMMs and k-means clustering for the same purpose. The current effort extends that work to use EM clustering for detection and compares it with k-means clustering.
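An HMM score of the kind mentioned in this abstract is typically the log-likelihood of an observation sequence under a trained model, computed with the forward algorithm. The sketch below assumes a discrete HMM given as plain lists (`pi` for initial probabilities, `A` for transitions, `B` for emissions); these names and the toy parameters are illustrative and are not taken from the project.

```python
import math


def forward_log_likelihood(pi, A, B, obs):
    """Log P(obs | model) for a discrete HMM, using the scaled forward algorithm."""
    if not obs:
        return 0.0
    n = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: move from every state i to state j, then emit obs[t].
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_likelihood


# Toy two-state model over a two-symbol alphabet (made-up numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
score = forward_log_likelihood(pi, A, B, [0, 1])
```

Scoring each sample against one model per family gives a vector of scores, which is the sort of input a clustering algorithm such as k-means or EM could then operate on.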


Malware Detection Using Dynamic Analysis, Swapna Vemparala May 2015

Master's Projects

In this research, we explore the field of dynamic analysis, which has shown promising results in the field of malware detection. Here, we extract dynamic software birthmarks during malware execution and apply machine-learning-based detection techniques to the resulting feature set. Specifically, we consider Hidden Markov Models and Profile Hidden Markov Models. To determine the effectiveness of this dynamic analysis approach, we compare our detection results to the results obtained by using static analysis. We show that in some cases, significantly stronger results can be obtained using our dynamic approach.


Using Neural Networks For Image Classification, Tim Kang May 2015

Master's Projects

This paper will focus on applying neural network machine learning methods to images for the purpose of automatic detection and classification. The main advantages of neural network methods in this project are their adeptness at fitting non-linear data and their ability to work as unsupervised algorithms. The algorithms will be run on common, publicly available datasets, namely MNIST and CIFAR-10, so that our results will be easily reproducible.


Using Probabilistic Graphical Models To Solve NP-Complete Puzzle Problems, Fengjiao Wu May 2015

Master's Projects

Probabilistic Graphical Models (PGMs) are commonly used in machine learning to solve problems stemming from medicine, meteorology, speech recognition, image processing, intelligent tutoring, gambling, games, and biology. PGMs can be defined on both directed and undirected graphs. In this work, I focus on the undirected graphical model. The objective of this work is to study how PGMs can be applied to find solutions to two puzzle problems, Sudoku and jigsaw puzzles. First, both puzzle problems are represented as undirected graphs, and then I map the relations of nodes to PGMs and Belief Propagation (BP). This work represents the puzzle grid …
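One common way to represent a Sudoku puzzle as an undirected graph is to make each cell a node and connect two cells whenever they share a row, a column, or a box. The sketch below builds that constraint graph; it is an assumption for illustration, since the abstract does not give the encoding actually used, and the function name `sudoku_constraint_graph` is hypothetical.

```python
from itertools import product


def sudoku_constraint_graph(n=3):
    """Map each cell (row, col) of an n*n by n*n grid to the set of cells it constrains."""
    size = n * n
    cells = list(product(range(size), repeat=2))
    neighbors = {cell: set() for cell in cells}
    for (r1, c1), (r2, c2) in product(cells, repeat=2):
        if (r1, c1) == (r2, c2):
            continue
        same_row = r1 == r2
        same_col = c1 == c2
        same_box = (r1 // n, c1 // n) == (r2 // n, c2 // n)
        # Two cells are adjacent when they must hold different digits.
        if same_row or same_col or same_box:
            neighbors[(r1, c1)].add((r2, c2))
    return neighbors


graph = sudoku_constraint_graph()
edge_count = sum(len(adjacent) for adjacent in graph.values()) // 2
```

In a standard 9x9 grid every cell has 20 neighbors, which gives 810 undirected edges. In a belief propagation setting, messages about candidate digits would be passed along these edges.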


Comparative Analysis Of Particle Swarm Optimization Algorithms For Text Feature Selection, Shuang Wu May 2015

Master's Projects

With the rapid growth of the Internet, more and more natural language text documents are available in electronic format, making automated text categorization a must in most fields. Due to the high dimensionality of text categorization tasks, feature selection is needed before executing document classification. There are basically two kinds of feature selection approaches: the filter approach and the wrapper approach. The wrapper approach requires a search algorithm for feature subsets and an evaluation algorithm for assessing the fitness of the selected feature subset. In this work, I focus on the comparison between two wrapper approaches. These two approaches …


Using Hidden Markov Models To Detect DNA Motifs, Santrupti Nerli May 2015

Master's Projects

During the process of gene expression in eukaryotes, mRNA splicing is one of the key processes carried out by a complex called the spliceosome. The spliceosome guarantees proper removal of introns and joining of exons before the translation process. Precise splicing is essential for the production of functional proteins. The spliceosome detects specific sequence motifs, called splice sites, within an mRNA sequence. Two of the splice sites are the 5’ and 3’ sites that border all the introns. If the normal splicing process is disrupted by mutation, fatal diseases may result. In this work, we predict splice sites in a human genome using hidden …