Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 61 - 86 of 86

Full-Text Articles in Physical Sciences and Mathematics

JavaScript Metamorphic Malware Detection Using Machine Learning Techniques, Aakash Wadhwani May 2019

Master's Projects

Various factors, such as defects in the operating system, email attachments from unknown sources, and downloading and installing software from untrusted sites, make computers vulnerable to malware attacks. Current antivirus techniques lack the ability to detect metamorphic viruses, which vary the internal structure of the original malware code across versions while retaining exactly the same behavior. Antivirus software typically relies on signature detection to identify a virus, but code morphing evades signature detection quite effectively.

JavaScript is used to generate metamorphic malware by changing the code’s Abstract Syntax Tree without changing the actual functionality, making it very difficult …


Detection Of Hate Speech In Videos Using Machine Learning, Unnathi Bhandary May 2019

Master's Projects

With the growth of the internet and social media, people have multiple platforms on which to share their thoughts and opinions about various subjects freely. However, this freedom of speech is misused to direct hate towards individuals or groups of people because of their race, religion, gender, etc. The rise of hate speech has led to conflicts and cases of cyberbullying, causing many organizations to look for solutions to this problem.

Developments in the field of machine learning and deep learning have piqued the interest of researchers, leading them to research and implement solutions to solve the …


Stock Market Prediction Using Ensemble Of Graph Theory, Machine Learning And Deep Learning Models, Pratik Patil May 2019

Master's Projects

The Efficient Market Hypothesis (EMH) is the cornerstone of modern financial theory, and it states that it is impossible to predict the price of any stock using trend, fundamental, or technical analysis. Stock trading is one of the most important activities in the world of finance. Stock price prediction is an age-old problem, and many researchers from academia and business have tried to solve it using techniques ranging from basic statistics to machine learning, drawing on relevant information such as news sentiment and historical prices. Even though some studies claim to get prediction accuracy higher than a random …


Sentiment Analysis For Search Engine, Saravana Gunaseelan May 2019

Master's Projects

The chief purpose of this study is to detect and eliminate sentiment bias in a search engine. Sentiment bias is a bias induced in the search results by the sentiment of the user’s search query. As people increasingly depend on search engines for information, it is important to understand the quality of the results that search engines produce. This study does not try to build a search engine but instead leverages existing search engines to provide better results to the user. In this study, only the queries that have high sentiment polarity are analyzed and the machine learning …


Pose Estimation And Action Recognition In Sports And Fitness, Parth Vyas May 2019

Master's Projects

The emergence of large datasets and major improvements in Deep Learning have led to many real-world applications. These applications have focused on the automotive, mobile, stock, and healthcare markets. Although Deep Learning has strong foundations across many areas, fields such as Sports, Fitness, and Injury Rehabilitation, where applications are still few, could benefit greatly from it. For example, if you are performing a workout and need to evaluate your form, but do not have access to or resources for an instructor, it would be great to have an Artificial Intelligence agent provide real-time feedback …


Detecting Cars In A Parking Lot Using Deep Learning, Samuel Ordonia May 2019

Master's Projects

Detection of cars in a parking lot with deep learning involves locating all objects of interest in a parking lot image and classifying the contents of all bounding boxes as cars. Because of the variety of shape, color, contrast, pose, and occlusion, a deep neural net was chosen to encompass all the significant features required by the detector to differentiate cars from non-cars. In this project, car detection was accomplished with a convolutional neural net (CNN) based on the You Only Look Once (YOLO) model architectures. An application was built to train and validate a car detection CNN as …


An Ensemble Model For Click Through Rate Prediction, Muthaiah Ramanathan May 2019

Master's Projects

The Internet has become the most prominent and accessible way to spread news about an event or to pitch, advertise, and sell a product globally. The success of any advertisement campaign lies in reaching the right target audience and eventually converting its members into customers. Search engines like Google, Yahoo, and Bing are among those most used by businesses to market their products. Apart from these, certain high-traffic websites like www.alibaba.com also offer services for B2B customers to set up their advertisement campaigns. The look of the advertisement, …


Multifamily Malware Models, Samanvitha Basole May 2019

Master's Projects

When training a machine learning model, there is likely to be a tradeoff between the accuracy of the model and the generality of the dataset. Previous research has shown that if we train a model to detect one specific malware family, we obtain stronger results as compared to a case where we train a single model on multiple diverse families. During the detection phase, it would be more efficient to have a single model that could detect multiple families, rather than having to score each sample against multiple models. In this research, we conduct experiments to quantify the relationship between …


Classifying Classic Ciphers Using Machine Learning, Nivedhitha Ramarathnam Krishna May 2019

Master's Projects

We consider the problem of identifying the classic cipher that was used to generate a given ciphertext message. We assume that the plaintext is English and we restrict our attention to ciphertext consisting only of alphabetic characters. Among the classic ciphers considered are the simple substitution, Vigenère, Playfair, and column transposition ciphers. The problem of classification is approached in two ways. The first method uses support vector machines (SVM) trained directly on ciphertext to classify the ciphers. In the second approach, we train hidden Markov models (HMM) on each ciphertext message, then use these trained HMMs as features …


Emulation Vs Instrumentation For Android Malware Detection, Anukriti Sinha May 2019

Master's Projects

In resource-constrained devices, malware detection is typically based on offline analysis using emulation. In previous work, it has been claimed that such emulation fails for a significant percentage of Android malware because well-designed malware detects that the code is being emulated. An alternative to emulation is malware analysis based on code that is executing on an actual Android device. In this research, we collect features from a corpus of Android malware using both emulation and on-phone instrumentation. We train machine learning models based on emulated features and also train models based on features collected via instrumentation, and we compare …


Smartphone Gesture-Based Authentication, Preethi Sundaravaradhan May 2019

Master's Projects

In this research, we consider the problem of authentication on a smartphone based on gestures, that is, movements of the phone. Accelerometer data from a number of subjects was collected and we analyze this data using a variety of machine learning techniques, including support vector machines (SVM) and convolutional neural networks (CNN). We analyze both the fraud rate (or false accept rate) and insult rate (or false reject rate) in each case.
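
The two error rates named in this abstract can be illustrated with a minimal sketch. It assumes that each authentication attempt yields a similarity score and that an attempt is accepted when its score meets a threshold. The function name, the scores, and the threshold are hypothetical and are not taken from the project.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (fraud_rate, insult_rate) for a given score threshold.

    fraud rate  = false accept rate: fraction of impostor attempts accepted
    insult rate = false reject rate: fraction of genuine attempts rejected
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # An attempt is accepted when its score is at or above the threshold.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.75, 0.4]
impostor = [0.1, 0.3, 0.6, 0.2]
fraud, insult = error_rates(genuine, impostor, threshold=0.5)
print(fraud, insult)  # 0.25 0.25
```

Raising the threshold in this sketch lowers the fraud rate and raises the insult rate, which is why the two rates are usually reported together.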


Deep Learning For Image Spam Detection, Tazmina Sharmin May 2019

Master's Projects

Spam can be defined as unsolicited bulk email. In an effort to evade text-based spam filters, spammers can embed their spam text in an image, which is referred to as image spam. In this research, we consider the problem of image spam detection, based on image analysis. We apply various machine learning and deep learning techniques to real-world image spam datasets, and to a challenge image spam-like dataset. We obtain results comparable to previous work for the real-world datasets, while our deep learning approach yields the best results to date for the challenge dataset.


Classification Of Malware Models, Akriti Sethi May 2019

Master's Projects

Automatically classifying similar malware families is a challenging problem. In this research, we attempt to classify malware families by applying machine learning to machine learning models. Specifically, we train hidden Markov models (HMM) for each malware family in our dataset. The resulting models are then compared in two ways. First, we treat the HMM matrices as images and experiment with convolutional neural networks (CNN) for image classification. Second, we apply support vector machines (SVM) to classify the HMMs. We analyze the results and discuss the relative advantages and disadvantages of each approach.


Machine Learning Versus Deep Learning For Malware Detection, Parth Jain May 2019

Master's Projects

It is often claimed that the primary advantage of deep learning is that such models can continue to learn as more data becomes available, provided that sufficient computing power is available for training. In contrast, for other forms of machine learning it is claimed that models “saturate,” in the sense that no additional learning can occur beyond some point, regardless of the amount of data or computing power available. In this research, we compare the accuracy of deep learning to other forms of machine learning for malware detection, as a function of the training dataset size. We experiment with a …


Masquerade Detection In Automotive Security, Ashraf Saber May 2019

Master's Projects

In this paper, we consider intrusion detection systems (IDS) in the context of a controller area network (CAN), which is also known as the CAN bus. We provide a discussion of various IDS topics, including masquerade detection, and we include a selective survey of previous research involving IDS in a CAN network. We also discuss background topics and relevant practical issues, such as data collection on the CAN bus. Finally, we present experimental results where we have applied a variety of machine learning techniques to CAN data. We use both actual and simulated data in order to detect the status …


AI Dining Suggestion App, Bao Pham May 2019

Master's Projects

Trying to decide what to eat can sometimes be challenging and time-consuming. Google and Yelp have large-scale data sets of restaurant information as well as Application Program Interfaces (APIs) for using them. This restaurant data includes time, price range, traffic, temperature, etc. The goal of this project is to build an app that eases the process of finding a restaurant to eat at. This app has a Tinder-like, user-friendly User Interface (UI) design to change the common way that lists of restaurants are presented to users on mobile apps. It also uses the help of Artificial Intelligence …


Topic Classification Using Hybrid Of Unsupervised And Supervised Learning, Jayant Shelke May 2019

Master's Projects

There has been research around the idea of representing words in text as vectors, and many models have been proposed that vary in performance as well as in applications. Text processing is used for content recommendation, sentiment analysis, plagiarism detection, content creation, and language translation, to name a few. Specifically, we want to look at the problem of topic detection in the text content of articles, blogs, and summaries. With the enormous amount of text content published every minute on the internet, it is imperative that we have very good algorithms and approaches to analyze all the content and be able to classify most of …


Benchmarking Scalability Of NoSQL Databases For Geospatial Queries, Yuvraj Singh Kanwar May 2019

Master's Projects

NoSQL databases provide an edge when it comes to dealing with big unstructured data. The flexibility, agility, and scalability offered by NoSQL databases become increasingly essential when dealing with geospatial data. The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of data that data stores must manage. Such characteristics of big spatial data have surpassed the capability and anticipated use cases of relational databases. Because we can choose from an extensive collection of NoSQL databases these days, it becomes vital for organizations to make an informed decision. NoSQL database benchmarks provide system architects, who shoulder a considerable …


Toward On-Demand Profile Hidden Markov Models For Genetic Barcode Identification, Jessica Sheu May 2019

Master's Projects

Genetic identification aims to solve the shortcomings of morphological identification. By using the cytochrome c oxidase subunit 1 (COI) gene as the Eukaryotic “barcode,” scientists hope to research species that may be morphologically ambiguous, elusive, or similarly difficult to visually identify. Current COI databases allow users to search only for existing database records. However, as the number of sequenced, potential COI genes increases, COI identification tools should ideally also be informative of novel, previously unreported sequences that may represent new species. If an unknown COI sequence does not represent a reported organism, an ideal identification tool would report taxonomic ranks …


Predictive Analysis For Cloud Infrastructure Metrics, Paridhi Agrawal May 2019

Master's Projects

In a cloud computing environment, enterprises have the flexibility to request resources according to their application demands. This elastic feature of cloud computing makes it an attractive option for enterprises to host their applications on the cloud. Cloud providers usually exploit this elasticity by auto-scaling the application resources for quality assurance. However, there is a setup-time delay that may take minutes between the demand for a new resource and it being prepared for utilization. This causes the static resource provisioning techniques, which request allocation of a new resource only when the application breaches a specific threshold, to be slow and …


Declassification Of Faceted Values In JavaScript, Shreya Gangishetty May 2019

Master's Projects

This research addresses the issues with protecting sensitive information at the language level using information flow control (IFC) mechanisms. Most IFC mechanisms face the challenge of releasing sensitive information in a restricted or limited manner. This research uses faceted values, an IFC mechanism that has shown promising flexibility for downgrading confidential information in a secure manner, a process also called declassification.

In this project, we introduce the concept of first-class labels to simplify the declassification of faceted values. To validate the utility of our approach, we show how the combination of faceted values and first-class labels can build various …


Assessing Code Obfuscation Of Metamorphic JavaScript, Kaushik Murli May 2019

Master's Projects

Metamorphic malware is one of the biggest and most ubiquitous threats in the digital world. It can be used to morph the structure of the target code without changing the underlying functionality of the code, thus making it very difficult to detect using signature-based detection and heuristic analysis. The focus of this project is to analyze Metamorphic JavaScript malware and techniques that can be used to mutate the code in JavaScript. To assess the capabilities of the metamorphic engine, we performed experiments to visualize the degree of code morphing. Further, this project discusses potential methods that have been used to …


Image Compression Using Neural Networks, Kunal Rajan Deshmukh May 2019

Master's Projects

Image compression is a well-studied field of Computer Vision. Recently, many neural network based architectures have been proposed for image compression as well as enhancement. These networks are also put to use by frameworks such as end-to-end image compression.

In this project, we have explored the improvements that can be made over this framework to achieve better benchmarks in compressing images. Generative Adversarial Networks are used to generate new fake images which are very similar to original images. Single Image Super-Resolution Generative Adversarial Networks (SI-SRGAN) can be employed to improve image quality. Our proposed architecture can be divided into four …


Species Classification Using DNA Barcoding And Profile Hidden Markov Models, Sphoorti Poojary May 2019

Master's Projects

Traditional classification systems for living organisms, such as the Linnaean taxonomy, classify species based on their morphological features. This traditional system is being replaced by molecular approaches that use gene sequences. The COI gene, also known as the “DNA barcode” since it is unique in every species, can be used to uniquely identify organisms and thus classify them. Classifying using gene sequences has many advantages, including correct identification of cryptic species (individuals which appear similar but belong to different species) and species which are extremely small in size. In this project, I worked on classifying COI sequences of unknown species …


Nitrogenase Iron Protein Detection Using Neural Network, Ishan Shinde May 2019

Master's Projects

Nitrogenase Iron Protein (nifH) is the enzyme responsible for nitrogen fixation. Microbes with the nifH gene are responsible for injecting reduced nitrogen into the biosphere, which is essential for all living things. Obtaining sequences from the GenBank database is problematic due to annotation errors, nomenclature variation, and paralogues. One possible solution could be to retrieve sequences from the GenBank database and use a sequence classifier to label the sequences. In this research, we convert sequences to images and build a nifH sequence classifier using image processing and a convolutional neural network. We built a nifH classification model which can classify sequences with an …


The South Bay Water Recycling Program: An Evaluation Of Water Recycling Outcomes In Comparison To Selected Cities And Countries, Shannon Nguyen May 2019

Master's Projects

Is the South Bay Water Recycling (SBWR) program achieving its planned recycled water outcomes? This research will compare the SBWR program's 2018 recycled water data with other water reuse programs in Las Vegas, Orange County, Singapore, and Australia. The purpose of the research is to determine whether the SBWR program is achieving its goals for conserving fresh water for beneficial reuse, and how the outcomes compare with selected cities and countries.