Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 83

Full-Text Articles in Physical Sciences and Mathematics

Detecting Myocardial Infarctions Using Machine Learning Methods, Aniruddh Mathur Dec 2019

Master's Projects

Myocardial Infarction (MI), commonly known as a heart attack, occurs when one of the three major blood vessels carrying blood to the heart gets blocked, causing the death of myocardial (heart) cells. If not treated immediately, MI may cause cardiac arrest, which can ultimately cause death. Risk factors for MI include diabetes, family history, and an unhealthy diet and lifestyle. Medical treatments include various types of drugs and surgeries, which can prove very expensive for patients due to high healthcare costs. Therefore, it is imperative that MI is diagnosed at the right time. Electrocardiography (ECG) is commonly used to detect MI. ECG ...


Information Extraction From Biomedical Text Using Machine Learning, Deepti Garg Dec 2019

Master's Projects

Inadequate experimental drug data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in pediatric populations. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed ...


Toward Early Detection Of Pancreatic Cancer: An Evidence-Based Approach, Omid Sharagi Dec 2019

Master's Projects

This study examines how an evidential reasoning approach can be used as a diagnostic tool for early detection of pancreatic cancer. The evidential reasoning model combines the output of a linear Support Vector Classifier (SVC) with factors such as smoking history, health history, biopsy location, NGS technology used, and more to predict the likelihood of the disease. The SVC was trained using genomic data of pancreatic cancer patients derived from the National Cancer Institute (NCI) Genomic Data Commons (GDC). To test the evidential reasoning model, a variety of synthetic data was compiled to test the impact of combinations of different ...


Image-Based Localization Of User-Interfaces, Riti Gupta Dec 2019

Master's Projects

Image localization refers to translating the text present in images from one language to another. The aim of the project is to develop a methodology to translate the text in image captions from English to Hindi by taking the context of the images into account. A lot of work has been done in this field [22], but our aim was to explore whether the accuracy can be further improved by considering the additional information imparted by the images apart from the text. We have explored Deep Learning using neural networks for this project. In particular, Recurrent Neural Networks ...


3D Shape Prediction On Convolutional Deep Belief Networks, Gregory Y. Enriquez Dec 2019

Master's Projects

The field of image recognition software has grown immensely in recent years with the emergence of new deep learning techniques. Deep belief networks inspired by Hinton [11] were one of the earliest methodologies of deep learning in the late 2000s. More recently, convolutional neural networks have been used in deep learning techniques, architecture, and software to identify patterns in imagery in order to make predictions such as classification, image segmentation, etc. Traditional two-dimensional, or 2D, images stored as picture files, typically contain red, green, and blue color data for each individual pixel in the picture. However, more recent commercial 2 ...


A Hybrid Approach For Multi-Document Text Summarization, Rashmi Varma Dec 2019

Master's Projects

Text summarization has been a long-studied topic in the field of natural language processing. There have been various approaches to both extractive and abstractive text summarization. Summarizing a single document is a methodical task, but summarizing multiple documents poses a greater challenge. This thesis explores the application of Latent Semantic Analysis, Text-Rank, Lex-Rank and Reduction algorithms for single document text summarization and compares them with the proposed approach of creating a hybrid system combining each of the above algorithms, individually, with Restricted Boltzmann Machines for multi-document text summarization and analyzing how all ...


Music Retrieval System Using Query-By-Humming, Parth Patel Dec 2019

Master's Projects

Music Information Retrieval (MIR) is a research area of great interest because there are various strategies to retrieve music. To retrieve music, it is important to find a similarity between the input query and the matching music. Several solutions have been proposed and are currently used in application domains such as Query-by-Example (QBE), which takes a sample of an audio recording playing in the background and retrieves the result. However, there is no efficient approach to solve this problem in a Query-by-Humming (QBH) application. In a Query-by-Humming application, the aim is to retrieve music that ...


Predicting Switch-Like Behavior In Proteins Using Logistic Regression On Sequence-Based Descriptors, Benjamin Strauss Jul 2019

Master's Projects

Ligands can bind at specific protein locations, inducing conformational changes such as those involving secondary structure. Identifying these possible switches from sequence, including homology, is an important ongoing area of research. We attempt to predict possible secondary structure switches from sequence in proteins using machine learning, specifically a logistic regression approach with 48 N-acetyltransferases as our learning set and 5 sirtuins as our test set. Validated residue binary assignments of 0 (no change in secondary structure) and 1 (change in secondary structure) were determined (DSSP) from 3D X-ray structures for sets of virtually identical chains crystallized under different conditions. Our ...


Randition: Random Blockchain Partitioning For Write Throughput, David Nguyen May 2019

Master's Projects

This paper proposes to support dynamic runtime partitioning of Tendermint, which is an in-development state machine replication algorithm that uses the blockchain model to provide Byzantine fault tolerance. We call this variation Randition. We incorporate recent research from blockchain consensus and replicated state machine partitioning to allow Randition users to partition their blockchain for improved write performance at the cost of some Byzantine fault tolerance. We conduct an experiment to compare the raw write throughput of Randition and Tendermint. Finally, we discuss the experimental results and further improvements to Randition.


Designing Single Guide RNAs For CRISPR/Cas9, Neha Atul Bhagwat May 2019

Master's Projects

Researchers have been working towards the development of tools to facilitate regular use of genome engineering techniques. In recent years, the focus of these efforts has been the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated (Cas) systems. These systems, while found naturally in bacteria and archaea as an immunity mechanism, can be used for genome engineering in eukaryotes.

There are three major computational challenges associated with the use of CRISPR/Cas9 in genome engineering for mammals - identification of CRISPR arrays, single guide RNA design and minimizing off-target effects. This project attempts to solve the problem of single guide RNA design ...


Context-Based Multi-Stage Offline Handwritten Mathematical Symbol Recognition Using Deep Learning, Sui Kun Guan May 2019

Master's Projects

We propose a multi-stage machine learning (ML) architecture to improve the accuracy of offline handwritten mathematical symbol recognition. In the first stage, we train and assemble multiple deep convolutional neural networks to classify isolated mathematical symbols. However, certain ambiguous symbols are hard to classify without the context information of the mathematical expressions where the symbols belong. In the second stage, we train a deep convolutional neural network that further classifies the ambiguous symbols based on the context information of the symbols. To further improve the classification accuracy, in the third stage, we develop a set of rules to classify the ...


Machine Learning In Crop Classification Of Temporal Multispectral Satellite Image, Ravali Koppaka May 2019

Master's Projects

Recently, there has been a remarkable growth in Artificial Intelligence (AI) with the development of efficient AI models and high-power computational resources for processing complex datasets. There has been a growing number of applications of machine learning in satellite remote sensing image data processing. In this work, machine learning methods were applied for crop classification of temporal multispectral satellite images to achieve better prediction of crop-wise area statistics. In India, agriculture has a huge impact on the national economy and most of the critical decisions are dependent on agricultural statistics. Sentinel-2 satellite image data for the Guntur district region ...


Detecting CRISPR Arrays Using Long-Short Term Memory Network, Shantanu Deshmukh May 2019

Master's Projects

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat) is a sequence found in the DNA sequence of an organism. It provides immunity to the organism. Recently, it was found that the CRISPR-based immunity mechanism can be manipulated to perform genome editing. The problem is that it is hard to know the specificity of this system and, in turn, making it highly specific is difficult. More research is required to improve this CRISPR-based genome editing. Detecting CRISPR arrays in the DNA sequence is the first step towards this research. In this work, a CRISPR array detection pipeline, CRISPRLstm, is proposed ...


Fast High Resolution Image Completion, Chinmay Mishra May 2019

Master's Projects

This paper presents a method for image completion, an active research area in the field of computer vision. The method described in the paper aims at achieving results comparable to other state-of-the-art methods with an approximately four-and-a-half-fold reduction in training time. It is a two-step procedure that involves image completion and enhancing the resolution of the completed image. We use the SSIM metric to evaluate the quality of the completed image, and we also time our model against other image completion models.
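
For readers unfamiliar with the SSIM metric mentioned above, the following is a minimal sketch of a single-window (global) structural similarity score in plain Python. It is not the paper's evaluation code: SSIM is usually computed over local windows and averaged, and the function name `global_ssim`, the flat pixel lists, and the 8-bit dynamic range are assumptions made for illustration. The constants use the commonly cited defaults K1 = 0.01 and K2 = 0.03.

```python
def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length lists of pixel intensities.

    Simplified illustration: the whole image is treated as one window.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)

    # Stabilizing constants that keep the ratio defined for flat images.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance over the whole image.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Under this definition an image compared with itself scores 1 (up to floating-point rounding), and the score drops as luminance, contrast, or structure diverge.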


Music Mood Classification Using Convolutional Neural Networks, Revanth Akella May 2019

Master's Projects

Grouping music into moods is useful as music migrates to online streaming services, since it can help with recommendations. To establish the connection between music and mood, we develop an end-to-end, open source approach for mood classification using lyrics. We develop a pipeline for tag extraction, lyric extraction, and establishing classification models for classifying music into moods. We investigate techniques to classify music into moods using lyrics and audio features. Using various natural language processing methods with machine learning and deep learning, we perform a comparative study across different classification and mood models. The results suggest that features ...


An Industry Driven Genre Classification Application Using Natural Language Processing, Sharan Duggirala May 2019

Master's Projects

With the advent of digitized music, many online streaming companies such as Spotify have capitalized on a listener’s need for a common streaming platform. An essential component of such a platform is the recommender system that suggests related tracks, albums and artists to the user base. In order to sustain such a recommender system, labeling data to indicate which genre it belongs to is essential. Most recent academic publications that deal with music genre classification focus on the use of deep neural networks developed and applied within the music genre classification domain. This thesis attempts to use some ...


Learning For Free – Object Detectors Trained On Synthetic Data, Charles Thane Mackay May 2019

Master's Projects

A picture is worth a thousand words, or if you want it labeled, it’s worth about four cents per bounding box. Data is the fuel that powers modern technologies run by artificial intelligence engines, and it is increasingly valuable in today’s industry. High-quality labeled data is the most important factor in producing accurate machine learning models, which can be used to make powerful predictions and identify patterns humans may not see. Acquiring high-quality labeled data, however, can be expensive and time-consuming. For small companies, academic researchers, or machine learning hobbyists, gathering large datasets for a specific ...


SQL Injection Detection Using Machine Learning, Sonali Mishra May 2019

Master's Projects

Sharing information over the Internet across multiple platforms and web applications has become quite common in recent times. Web-based applications that accept critical information from users store this information in databases. These applications and the databases connected to them are susceptible to all kinds of information security threats because they are accessible through the Internet. The threats include attacks such as Cross-Site Scripting (XSS), Denial of Service (DoS), and Structured Query Language (SQL) Injection attacks. SQL Injection attacks rank among the top ten vulnerabilities of web-based applications. Through this kind of attack ...


A WebRTC Video Chat Implementation Within The Yioop Search Engine, Yangcha Ho May 2019

Master's Projects

Web real-time communication (abbreviated as WebRTC) is one of the latest Web application technologies that allows voice, video, and data to work collectively in a browser without a need for third-party plugins or proprietary software installation. When two browsers from different locations communicate with each other, they must know how to locate each other, bypass security and firewall protections, and transmit all multimedia communications in real time. This project not only illustrates how WebRTC technology works but also walks through a real example of a video chat-style application. The application communicates between two remote users using WebSocket and the data encryption ...


Graph Classification Using Machine Learning Algorithms, Monica Golahalli Seenappa May 2019

Master's Projects

In the graph classification problem, we are given a family of graphs and a set of categories, and we aim to classify all the graphs of the family into the given categories. Earlier approaches, such as graph kernels and graph embedding techniques, have focused on extracting certain features by processing the entire graph. However, real-world graphs are complex and noisy, and these traditional approaches are computationally intensive. With the introduction of the deep learning framework, there have been numerous attempts to create more efficient classification approaches.

For this project, we will be focusing on modifying an existing kernel graph ...


Deep Learning Based Real Time Devanagari Character Recognition, Aseem Chhabra May 2019

Master's Projects

Advances in the technology behind optical character recognition (OCR) have made it one of those technologies that have found plenty of uses across industry. Today, OCR is available for several languages and can recognize characters in real time, but there are some languages for which this technology has not developed much. All these advancements have been possible because of the introduction of concepts like artificial intelligence and deep learning. Deep Neural Networks have proven to be the best choice when it comes to a task involving recognition. There are ...


Classification Of Humans Into Ayurvedic Prakruti Types Using Computer Vision, Gayatri Gadre May 2019

Master's Projects

Ayurveda, a 5,000-year-old Indian medical science, holds that the universe, and hence humans, are made up of five elements, namely ether, fire, water, earth, and air. The three Doshas (Tridosha) Vata, Pitta, and Kapha originated from the combinations of these elements. Every person has a unique combination of Tridosha elements contributing to a person’s ‘Prakruti’. Prakruti governs the physiological and psychological tendencies in all living beings as well as the way they interact with the environment. This balance influences their physiological features like the texture and colour of skin, hair, eyes, length of fingers, the shape of ...


Using Computer Vision To Quantify Coral Reef Biodiversity, Niket Bhodia May 2019

Master's Projects

The preservation of the world’s oceans is crucial to human survival on this planet, yet we know too little to begin to understand anthropogenic impacts on marine life. This is especially true for coral reefs, which are the most diverse marine habitat per unit area (if not overall) as well as the most sensitive. To address this gap in knowledge, simple field devices called autonomous reef monitoring structures (ARMS) have been developed, which provide standardized samples of life from these complex ecosystems. ARMS have now become successful to the point that the amount of data collected through them has ...


Schema Migration From Relational Databases To NoSQL Databases With Graph Transformation And Selective Denormalization, Krishna Chaitanya Mullapudi May 2019

Master's Projects

We have witnessed a dramatic increase in the volume, variety and velocity of data, leading to the era of big data. The structure of data has become highly flexible, leading to the development of many storage systems that differ from the traditional structured relational databases, where data is stored in “tables,” with columns representing the lowest granularity of data. Although relational databases are still predominant in the industry, there has been a major drift towards alternative database systems that support unstructured data with better scalability, leading to the popularity of “Not Only SQL.”

Migration from relational databases to NoSQL databases ...


Predicting Off-Target Potential Of CRISPR-Cas9 Single Guide RNA, Ishita Mathur May 2019

Master's Projects

With advancements in the field of genome engineering, researchers have come up with potential ways for site-specific gene editing. One of the methods uses the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas technology. It consists of a Cas9 nuclease and a single guide RNA (sgRNA) that cleaves the DNA at the intended target site. However, the target genome could contain multiple potential off-target sites, and cleaving an off-target site can have deleterious effects in the case of gene editing in humans.

Lab-based assays have been developed to test the off-target effects of guide RNAs. However, it is not feasible to ...


Network Alignment In Heterogeneous Social Networks, Priyanka Kasbekar May 2019

Master's Projects

Online Social Networks (OSN) have numerous applications and an ever-growing user base. This has led to users being a part of multiple social networks at the same time. Identifying a similar user from one social network on another social network will give information about a user’s behavior on different platforms. It further helps in community detection and link prediction tasks. The process of identifying or aligning users in multiple networks is called Network Alignment. The more information we have about the nodes/users, the better the results of Network Alignment. Unlike other related work in this field that ...


Learning To Play The Trading Game, Neeraj Kulkarni May 2019

Master's Projects

Can we train a stock trading bot that can make decisions in high-entropy environments like stock markets to generate profits based on some optimal policy? Can we further extend this learning for any general trading problem? Quantitative Algorithms are responsible for more than 75% of the stock trading around the world. Creating a stock market prediction model is comparatively easy. But creating a profitable prediction model is still considered a challenging task in the field of machine learning and deep learning due to the unpredictability of the financial markets. Using biologically inspired computing techniques of ...


Intelligent Log Analysis For Anomaly Detection, Steven Yen May 2019

Master's Projects

Computer logs are a rich source of information that can be analyzed to detect various issues. The large volumes of logs limit the effectiveness of manual approaches to log analysis. The earliest automated log analysis tools take a rule-based approach, which can only detect known issues with existing rules. On the other hand, anomaly detection approaches can detect new or unknown issues. This is achieved by looking for unusual behavior different from the norm, often utilizing machine learning (ML) or deep learning (DL) models. In this project, we evaluated various ML and DL techniques used for log anomaly detection. We ...
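
As a rough illustration of "looking for unusual behavior different from the norm", here is a minimal statistical baseline in Python: it flags time windows whose event count deviates strongly from counts observed during normal operation. This is a simple z-score threshold, not one of the ML or DL models the project evaluates; the function name `flag_anomalous_windows`, the per-window counts, and the threshold of 3 standard deviations are assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_anomalous_windows(counts, baseline, threshold=3.0):
    """Return indices of windows whose event count departs from the baseline norm.

    counts:   event counts per time window to be checked (hypothetical input)
    baseline: event counts per window from known-normal operation
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # Constant baseline: any departure from it counts as unusual.
        return [i for i, c in enumerate(counts) if c != mu]
    # Flag windows more than `threshold` standard deviations from the mean.
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Hypothetical counts of error-level log lines per minute.
normal_minutes = [4, 5, 6, 5, 4, 6, 5, 5]
observed_minutes = [5, 6, 30, 4]
suspicious = flag_anomalous_windows(observed_minutes, normal_minutes)
```

Unlike a rule-based check, nothing here encodes a known failure signature; the only input is what normal behavior looked like.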


Breaking Audio CAPTCHA Using Machine Learning/Deep Learning And Related Defense Mechanism, Heemany Shekhar May 2019

Master's Projects

CAPTCHA is a web-based authentication method used by websites to distinguish between humans (valid users) and bots (attackers). Audio captcha is an accessible captcha meant for visually impaired users, such as color-blind, blind, or near-sighted users. In this project, I analyzed the security of audio captchas from attacks that employ machine learning and deep learning models. Audio captchas of varying lengths (5, 7 and 10) and varying background noise (no noise, medium noise or high noise) were analyzed. I found that audio captchas with no background noise or medium background noise were easily attacked with 99% - 100% accuracy ...


On Adversarial Attacks On Deep Learning Models, Nag Mani May 2019

Master's Projects

With recent advancements in the field of artificial intelligence, deep learning has created a niche in the technology space and is being actively used in autonomous and IoT systems globally. Unfortunately, these deep learning models have become susceptible to adversarial attacks which can severely impact their integrity. Research has shown that many state-of-the-art models are vulnerable to attacks by well-crafted adversarial examples. These adversarial examples are perturbed versions of clean data which have a small amount of noise added to them. These adversarial samples are imperceptible to the human eye but can easily fool the targeted model. The exposed vulnerabilities of ...
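
To make the idea of a perturbed input concrete, here is a minimal sketch of a sign-gradient perturbation (in the style of the fast gradient sign method) against a toy logistic-regression classifier. The abstract does not say which attacks the project studies; the technique, the toy weights, and the names `predict_proba` and `sign_gradient_attack` are assumptions chosen for illustration.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def sign_gradient_attack(weights, bias, x, label, epsilon):
    """Shift every feature by +/- epsilon in the direction that raises the loss."""
    p = predict_proba(weights, bias, x)
    # For cross-entropy loss, d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input, not taken from the project.
weights, bias = [2.0, -1.5, 0.5], 0.1
clean = [0.2, 0.1, 0.3]
adversarial = sign_gradient_attack(weights, bias, clean, label=1, epsilon=0.2)
```

In this toy setting no feature moves by more than `epsilon`, yet the predicted probability of the true class falls from above 0.5 on the clean input to below 0.5 on the perturbed one.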