Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 70

Full-Text Articles in Physical Sciences and Mathematics

Image-Based Malware Classification With Convolutional Neural Networks And Extreme Learning Machines, Mugdha Jain Dec 2019

Master's Projects

Research in the field of malware classification often relies on machine learning models that are trained on high level features, such as opcodes, function calls, and control flow graphs. Extracting such features is costly, since disassembly or code execution is generally required. In this research, we conduct experiments to train and evaluate machine learning models for malware classification, based on features that can be obtained without disassembly or execution of code. Specifically, we visualize malware samples as images and employ image analysis techniques. In this context, we focus on two machine learning models, namely, Convolutional Neural Networks (CNN) and Extreme …
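The idea of viewing a binary as an image can be illustrated with a minimal sketch. It assumes each byte becomes one grayscale pixel (0 to 255) and that the image has a fixed row width; the function name, the width, and the zero padding are hypothetical choices for illustration, not details taken from the project.

```python
def bytes_to_grayscale_rows(data: bytes, width: int) -> list[list[int]]:
    """Reshape raw bytes into rows of 0-255 pixel intensities."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        # Assumption: zero-pad the final row so the image stays rectangular.
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


# Five arbitrary bytes laid out as a 3 x 2 "image".
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])
image = bytes_to_grayscale_rows(sample, width=2)
print(image)
```

No disassembly or execution is involved: the only input is the raw file content, which is the property the abstract emphasizes.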


Hot Fusion Vs Cold Fusion For Malware Detection, Snehal Bichkar Dec 2019

Master's Projects

A fundamental problem in malware research consists of malware detection, that is, distinguishing malware samples from benign samples. This problem becomes more challenging when we consider multiple malware families. A typical approach to this multi-family detection problem is to train a machine learning model for each malware family and score each sample against all models. The resulting scores are then used for classification. We refer to this approach as “cold fusion,” since we combine previously-trained models; no retraining of these base models is required when additional malware families are considered. An alternative approach is to train a single model …
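The "cold fusion" scheme described above can be sketched roughly as follows. The per-family scorers are toy stand-ins, and the decision rule (take the highest-scoring family, fall back to "benign" below a threshold) is an assumption for illustration; the abstract only says that the scores are used for classification.

```python
from typing import Callable, Dict, Sequence

# A "model" here is any callable that maps a feature vector to a score.
Scorer = Callable[[Sequence[float]], float]


def cold_fusion_scores(sample: Sequence[float],
                       family_models: Dict[str, Scorer]) -> Dict[str, float]:
    """Score one sample against every previously trained family model."""
    return {family: model(sample) for family, model in family_models.items()}


def classify(sample: Sequence[float],
             family_models: Dict[str, Scorer],
             threshold: float) -> str:
    """Assumed rule: best-scoring family, or 'benign' if no score is high enough."""
    scores = cold_fusion_scores(sample, family_models)
    best_family = max(scores, key=scores.get)
    if scores[best_family] < threshold:
        return "benign"
    return best_family


# Hypothetical models: each one just reads a single feature.
models: Dict[str, Scorer] = {
    "family_a": lambda x: x[0],
    "family_b": lambda x: x[1],
}
print(classify([0.2, 0.9], models, threshold=0.5))
print(classify([0.1, 0.2], models, threshold=0.5))
```

Supporting a new family only means adding another entry to `models`; the existing entries are left untouched, which mirrors the "no retraining" property.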


Detecting Myocardial Infarctions Using Machine Learning Methods, Aniruddh Mathur Dec 2019

Master's Projects

Myocardial Infarction (MI), commonly known as a heart attack, occurs when one of the three major blood vessels carrying blood to the heart gets blocked, causing the death of myocardial (heart) cells. If not treated immediately, MI may cause cardiac arrest, which can ultimately cause death. Risk factors for MI include diabetes, family history, and an unhealthy diet and lifestyle. Medical treatments include various types of drugs and surgeries which can prove very expensive for patients due to high healthcare costs. Therefore, it is imperative that MI is diagnosed at the right time. Electrocardiography (ECG) is commonly used to detect MI. ECG …


Information Extraction From Biomedical Text Using Machine Learning, Deepti Garg Dec 2019

Master's Projects

Inadequate drug experimental data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in pediatric populations. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed. Second, …


Assessing Wildfire Damage From High Resolution Satellite Imagery Using Classification Algorithms, Ai-Linh Alten Dec 2019

Master's Projects

Wildfire damage assessments are important information for first responders, government agencies, and insurance companies to estimate the cost of damages and to help provide relief to those affected by a wildfire. With the help of Earth Observation satellite technology, determining the burn area extent of a fire can be done with traditional remote sensing methods like Normalized Burn Ratio. Using Very High Resolution satellites can help give even more accurate damage assessments but will come with some tradeoffs; these satellites can provide higher spatial and temporal resolution at the expense of better spectral resolution. As a wildfire burn area …
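For readers unfamiliar with the Normalized Burn Ratio mentioned above, here is a small sketch using its standard remote sensing definition, NBR = (NIR - SWIR) / (NIR + SWIR), together with the commonly used pre-fire minus post-fire difference (dNBR). The reflectance values are made up for illustration, and nothing here is taken from the project's own pipeline.

```python
def normalized_burn_ratio(nir: float, swir: float) -> float:
    """NBR for one pixel from near-infrared and shortwave-infrared reflectance."""
    denominator = nir + swir
    if denominator == 0:
        # Assumption: treat a pixel with no signal in either band as neutral.
        return 0.0
    return (nir - swir) / denominator


def delta_nbr(pre_fire_nbr: float, post_fire_nbr: float) -> float:
    """dNBR: larger positive values indicate a stronger drop in NBR after the fire."""
    return pre_fire_nbr - post_fire_nbr


# Hypothetical reflectances: healthy vegetation reflects strongly in NIR,
# while a burned surface reflects relatively more in SWIR.
pre = normalized_burn_ratio(nir=0.5, swir=0.1)
post = normalized_burn_ratio(nir=0.2, swir=0.4)
print(pre, post, delta_nbr(pre, post))
```

Because the index needs a shortwave-infrared band, sensors with limited spectral resolution cannot compute it directly, which is the tradeoff the abstract points to.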


Toward Early Detection Of Pancreatic Cancer: An Evidence-Based Approach, Omid Sharagi Dec 2019

Master's Projects

This study observes how an evidential reasoning approach can be used as a diagnostic tool for early detection of pancreatic cancer. The evidential reasoning model combines the output of a linear Support Vector Classifier (SVC) with factors such as smoking history, health history, biopsy location, NGS technology used, and more to predict the likelihood of the disease. The SVC was trained using genomic data of pancreatic cancer patients derived from the National Cancer Institute (NCI) Genomic Data Commons (GDC). To test the evidential reasoning model, a variety of synthetic data was compiled to test the impact of combinations of different …


Image-Based Localization Of User-Interfaces, Riti Gupta Dec 2019

Master's Projects

Image localization corresponds to translating the text present in images from one language to another. The aim of the project is to develop a methodology to translate the text in image captions from English to Hindi by taking the context of the images into account. A lot of work has been done in this field [22], but our aim was to explore whether the accuracy can be further improved by considering the additional information imparted by the images apart from the text. We have explored Deep Learning using neural networks for this project. In particular, Recurrent Neural Networks …


3D Shape Prediction On Convolutional Deep Belief Networks, Gregory Y. Enriquez Dec 2019

Master's Projects

The field of image recognition software has grown immensely in recent years with the emergence of new deep learning techniques. Deep belief networks inspired by Hinton [11] were one of the earliest methodologies of deep learning in the late 2000s. More recently, convolutional neural networks have been used in deep learning techniques, architecture, and software to identify patterns in imagery in order to make predictions such as classification, image segmentation, etc. Traditional two-dimensional, or 2D, images stored as picture files, typically contain red, green, and blue color data for each individual pixel in the picture. However, more recent commercial 2.5D …


A Hybrid Approach For Multi-Document Text Summarization, Rashmi Varma Dec 2019

Master's Projects

Text summarization has been a long-studied topic in the field of natural language processing. There have been various approaches for both extractive and abstractive text summarization. Summarizing texts for a single document is a methodical task, but summarizing multiple documents poses a greater challenge. This thesis explores the application of Latent Semantic Analysis, Text-Rank, Lex-Rank and Reduction algorithms for single document text summarization and compares it with the proposed approach of creating a hybrid system combining each of the above algorithms, individually, with Restricted Boltzmann Machines for multi-document text summarization and analyzing how all …


Music Retrieval System Using Query-By-Humming, Parth Patel Dec 2019

Master's Projects

Music Information Retrieval (MIR) is a particular research area of great interest because there are various strategies to retrieve music. To retrieve music, it is important to find a similarity between the input query and the matching music. Several solutions have been proposed that are currently being used in the application domain(s) such as Query-by-Example (QBE) which takes a sample of an audio recording playing in the background and retrieves the result. However, there is no efficient approach to solve this problem in a Query-by-Humming (QBH) application. In a Query-by-Humming application, the aim is to retrieve music that is …


Fast High Resolution Image Completion, Chinmay Mishra May 2019

Master's Projects

This paper presents a method for image completion, an active research area in the field of computer vision. The method described in the paper aims at achieving comparable results to other state of the art methods with approximately four and a half times reduction in training time. It is a two step procedure which involves image completion and enhancing the resolution of the completed image. We use the SSIM metric to evaluate the quality of the completed image and to also time our model against other image completion models.
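As a rough illustration of the SSIM metric named above, the sketch below evaluates the standard SSIM formula once over whole images given as flat pixel lists. This is a simplification: SSIM is normally computed over local windows and averaged, and the project's actual evaluation code is not shown in the abstract. The constants follow the usual convention C1 = (0.01 L)^2 and C2 = (0.03 L)^2 for dynamic range L.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized grayscale images."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the ratio when means or variances are near zero.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [0, 0, 255, 255]
inverted = [255, 255, 0, 0]
print(global_ssim(original, original))
print(global_ssim(original, inverted))
```

An image compared with itself scores 1 (up to floating-point rounding), while structurally dissimilar images score lower, which is why the metric is a natural fit for judging a completed image against its ground truth.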


Music Mood Classification Using Convolutional Neural Networks, Revanth Akella May 2019

Master's Projects

Grouping music into moods is useful as music is migrating to online streaming services, since it can help in recommendations. To establish the connection between music and mood we develop an end-to-end, open source approach for mood classification using lyrics. We develop a pipeline for tag extraction, lyric extraction, and establishing classification models for classifying music into moods. We investigate techniques to classify music into moods using lyrics and audio features. Using various natural language processing methods with machine learning and deep learning we perform a comparative study across different classification and mood models. The results indicate that features …


Learning For Free – Object Detectors Trained On Synthetic Data, Charles Thane Mackay May 2019

Master's Projects

A picture is worth a thousand words, or if you want it labeled, it’s worth about four cents per bounding box. Data is the fuel that powers modern technologies run by artificial intelligence engines which is increasingly valuable in today’s industry. High quality labeled data is the most important factor in producing accurate machine learning models which can be used to make powerful predictions and identify patterns humans may not see. Acquiring high quality labeled data however, can be expensive and time consuming. For small companies, academic researchers, or machine learning hobbyists, gathering large datasets for a specific task that …


Detecting CRISPR Arrays Using Long-Short Term Memory Network, Shantanu Deshmukh May 2019

Master's Projects

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat) is a sequence found in the DNA sequence of an organism. It provides immunity to the organism. Recently, it was found that the CRISPR-based immunity mechanism can be manipulated to perform genome editing. The problem is that it is hard to know the specificity of this system and, in turn, making it highly specific is difficult. More research is required to improve this CRISPR-based genome editing. Detecting CRISPR arrays in the DNA sequence is the first step towards this research. In this work, a CRISPR array detection pipeline, CRISPRLstm, is proposed. …


Context-Based Multi-Stage Offline Handwritten Mathematical Symbol Recognition Using Deep Learning, Sui Kun Guan May 2019

Master's Projects

We propose a multi-stage machine learning (ML) architecture to improve the accuracy of offline handwritten mathematical symbol recognition. In the first stage, we train and assemble multiple deep convolutional neural networks to classify isolated mathematical symbols. However, certain ambiguous symbols are hard to classify without the context information of the mathematical expressions where the symbols belong. In the second stage, we train a deep convolutional neural network that further classifies the ambiguous symbols based on the context information of the symbols. To further improve the classification accuracy, in the third stage, we develop a set of rules to classify the …


Machine Learning In Crop Classification Of Temporal Multispectral Satellite Image, Ravali Koppaka May 2019

Master's Projects

Recently, there has been a remarkable growth in Artificial Intelligence (AI) with the development of efficient AI models and high-power computational resources for processing complex datasets. There has been a growing number of applications of machine learning in satellite remote sensing image data processing. In this work, machine learning methods were applied for crop classification of temporal multispectral satellite image to achieve better prediction of crop-wise area statistics. In India, agriculture has a huge impact on the national economy and most of the critical decisions are dependent on agricultural statistics. Sentinel-2 satellite image data for the Guntur district region …


Designing Single Guide RNAs For CRISPR/Cas9, Neha Atul Bhagwat May 2019

Master's Projects

Researchers have been working towards development of tools to facilitate regular use of genome engineering techniques. In recent years, the focus of these efforts has been the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated (Cas) systems. These systems, while found naturally in bacteria and archaea as an immunity mechanism, can be used for genome engineering in eukaryotes.

There are three major computational challenges associated with the use of CRISPR/Cas9 in genome engineering for mammals - identification of CRISPR arrays, single guide RNA design and minimizing off-target effects. This project attempts to solve the problem of single guide RNA design using a novel …


SQL Injection Detection Using Machine Learning, Sonali Mishra May 2019

Master's Projects

Sharing information over the Internet across multiple platforms and web applications has become quite a common phenomenon in recent times. The web-based applications that accept critical information from users store this information in databases. These applications and the databases connected to them are susceptible to all kinds of information security threats due to being accessible through the Internet. The threats include attacks such as Cross-Site Scripting (XSS), Denial of Service (DoS), and Structured Query Language (SQL) Injection attacks. SQL Injection attacks fall under the top ten vulnerabilities when we talk about web-based applications. Through this kind of attack, …


Deep Learning Based Real Time Devanagari Character Recognition, Aseem Chhabra May 2019

Master's Projects

The revolution in the technology behind optical character recognition (OCR) has helped it to become one of those technologies that have found plenty of uses across the entire industrial space. Today, OCR is available for several languages and has the capability to recognize characters in real time, but there are some languages for which this technology has not developed much. All these advancements have been possible because of the introduction of concepts like artificial intelligence and deep learning. Deep Neural Networks have proven to be the best choice when it comes to a task involving recognition. There are …


Learning To Play The Trading Game, Neeraj Kulkarni May 2019

Master's Projects

Can we train a stock trading bot that can make decisions in high-entropy environments like stock markets to generate profits based on some optimal policy? Can we further extend this learning for any general trading problem? Quantitative Algorithms are responsible for more than 75% of the stock trading around the world. Creating a stock market prediction model is comparatively easy. But creating a profitable prediction model is still considered a challenging task in the field of machine learning and deep learning due to the unpredictability of the financial markets. Using biologically inspired computing techniques of …


Graph Classification Using Machine Learning Algorithms, Monica Golahalli Seenappa May 2019

Master's Projects

In the graph classification problem, we are given a family of graphs and a set of categories, and we aim to classify every graph in the family into one of the given categories. Earlier approaches, such as graph kernels and graph embedding techniques, have focused on extracting certain features by processing the entire graph. However, real-world graphs are complex and noisy, and these traditional approaches are computationally intensive. With the introduction of the deep learning framework, there have been numerous attempts to create more efficient classification approaches.

For this project, we will be focusing on modifying an existing kernel graph …


Intelligent Log Analysis For Anomaly Detection, Steven Yen May 2019

Master's Projects

Computer logs are a rich source of information that can be analyzed to detect various issues. The large volumes of logs limit the effectiveness of manual approaches to log analysis. The earliest automated log analysis tools take a rule-based approach, which can only detect known issues with existing rules. On the other hand, anomaly detection approaches can detect new or unknown issues. This is achieved by looking for unusual behavior different from the norm, often utilizing machine learning (ML) or deep learning (DL) models. In this project, we evaluated various ML and DL techniques used for log anomaly detection. We …


Over Speed Detection Using Artificial Intelligence, Samkit Patira May 2019

Master's Projects

Over speeding is one of the most common traffic violations. Around 41 million people are issued speeding tickets each year in the USA, i.e., one every second. Existing approaches to detect over speeding are not scalable and require manual effort. In this project, by the use of computer vision and artificial intelligence, I have tried to detect over speeding and report the violation to the law enforcement officer. It was observed that when predictions are done using YoloV3, we get the best results.


Tsar : A System For Defending Hate Speech Detection Models Against Adversaries, Brian Tuan Khieu May 2019

Master's Projects

Although current state-of-the-art hate speech detection models achieve praiseworthy results, these models have shown themselves to be vulnerable to attack. Easy-to-execute lexical manipulations such as the removal of whitespace from a given text create significant issues for word-based hate speech detection models. In this paper, we reproduce the results of five cutting-edge models as well as four significant evasion schemes from prior work. Only a limited number of evasion schemes that also maintain readability exist, and this works to our advantage in the recreation of the original data. Furthermore, we demonstrate that each lexical attack or evasion …


Influence Analysis Based On Political Twitter Data, Jace Rose May 2019

Master's Projects

Studies of online behavior often consider how users interact online, their posting behaviors, what they are tweeting about, and how likely they are to follow other people. The problem is that there is no deeper study of the people that a user has interacted with and how these other users affect them. This study examines if it is possible to draw similar sentiment from users with whom the target user has interacted. The data collection process gathers data from Twitter users posting to popular political hashtags, the most popular of which at the time were #MAGA and #TRUMP, as well …


Using Computer Vision To Quantify Coral Reef Biodiversity, Niket Bhodia May 2019

Master's Projects

The preservation of the world’s oceans is crucial to human survival on this planet, yet we know too little to begin to understand anthropogenic impacts on marine life. This is especially true for coral reefs, which are the most diverse marine habitat per unit area (if not overall) as well as the most sensitive. To address this gap in knowledge, simple field devices called autonomous reef monitoring structures (ARMS) have been developed, which provide standardized samples of life from these complex ecosystems. ARMS have now become successful to the point that the amount of data collected through them has outstripped …


Yoda – Your Only Design Assistant, Siddharth Kulkarni May 2019

Master's Projects

Converting user interface designs created by graphic designers into computer code is a typical job of a front end engineer in order to develop functional web and mobile applications. This conversion process can often be extremely tedious, slow and prone to human error. In this project, deep learning based object detection along with optical character recognition is used to generate platform ready prototypes directly from design sketches. Also, a new design language is introduced to facilitate expressive prototyping and allow the creation of more expressive and functional designs. It is observed that the AI powered application along with modern web …


Robust Lightweight Object Detection, Siddharth Kumar May 2019

Master's Projects

Object detection is a very challenging problem in computer vision and has been a prominent subject of research for nearly three decades. There has been a promising increase in the accuracy and performance of object detectors ever since deep convolutional networks (CNN) were introduced. CNNs can be trained on large datasets made of high resolution images without flattening them, thereby using the spatial information. Their superior learning ability also makes them ideal for image classification and object detection tasks. Unfortunately, this power comes at a large cost in compute and memory. For instance, the Faster R-CNN detector required …


Benchmarking Optimization Algorithms For Capacitated Vehicle Routing Problems, Pratik Surana May 2019

Master's Projects

The Vehicle Routing Problem (VRP) originated in the 1950s when algorithms and mathematical approaches were applied to find solutions for routing vehicles. Since then, there has been extensive research in the field of VRPs to solve real-life problems. The process of generating an optimal routing schedule for a VRP is complex due to two reasons. First, VRP is considered to be an NP-Hard problem. Second, there are several constraints involved, such as the number of available vehicles, the vehicle capacities, time-windows for pickup or delivery etc.

The main goal for this project was to compare different optimization algorithms for solving …
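To make the constraints listed above concrete, here is a minimal sketch of a feasibility check for a capacitated VRP solution. It covers only the vehicle-count and capacity constraints plus a visit-every-customer-once rule; time windows are left out, and the function name, data layout, and numbers are all hypothetical rather than taken from the project.

```python
from typing import Dict, List


def is_feasible(routes: List[List[str]],
                demands: Dict[str, int],
                capacity: int,
                num_vehicles: int) -> bool:
    """Check a candidate set of routes against basic CVRP constraints."""
    # One route per vehicle, so we cannot use more routes than vehicles.
    if len(routes) > num_vehicles:
        return False
    # Every customer must be visited exactly once across all routes.
    visited = [customer for route in routes for customer in route]
    if sorted(visited) != sorted(demands):
        return False
    # The total demand on each route must fit in one vehicle.
    return all(sum(demands[c] for c in route) <= capacity for route in routes)


# Hypothetical instance: three customers, two vehicles of capacity 8.
demands = {"a": 4, "b": 3, "c": 5}
print(is_feasible([["a", "b"], ["c"]], demands, capacity=8, num_vehicles=2))
print(is_feasible([["a", "c"], ["b"]], demands, capacity=8, num_vehicles=2))
```

An optimization algorithm would search among the feasible route sets for the one with the lowest total travel cost; a check like this only separates valid candidates from invalid ones.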


Deep Learning On Graphs Using Graph Convolutional Networks, Saurabh Mithe May 2019

Master's Projects

Graphs are a powerful way to model network data with the objects as nodes and the relationship between the various objects as links. Such graphs contain a plethora of valuable information about the underlying data which can be extracted, analyzed, and visualized using Machine Learning (ML). The challenge to this task is that graphs are non-Euclidean structures which means that they cannot be directly used with ML techniques because ML techniques only work with Euclidean structures like grids or sequences. In order to overcome this challenge, the graph structure first needs to be encoded into an equivalent Euclidean representation in …