Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 45

Full-Text Articles in Physical Sciences and Mathematics

Cat Tracks – Tracking Wildlife Through Crowdsourcing Using Firebase, Tracy Ho Dec 2020

Master's Projects

Many mountain lions are killed on California roads every year. To reduce these numbers, it is important to build a system that tracks where mountain lions have been seen. One such system could be built using the platform-as-a-service Firebase. Firebase is a platform service that collects and manages data that comes in through a mobile application. For the development of cross-platform mobile applications, Flutter is used as a toolkit for developers targeting both iOS and Android. The entire system, Cat Tracks, is proposed as a crowdsourcing platform to track wildlife, with the current focus …


A Neat Approach To Malware Classification, Jason Do Dec 2020

Master's Projects

Current malware detection software often relies on machine learning, which is seen as an improvement over signature-based techniques. Problems with a machine-learning-based approach can arise when malware writers modify their code with the intent to evade detection. This leads to a cat-and-mouse situation where new models must constantly be trained to detect new malware variants. In this research, we experiment with genetic algorithms as a means of evolving machine learning models to detect malware. Genetic algorithms, which simulate natural selection, provide a way for models to adapt to continuous changes in malware families, and thereby …


LiDAR Object Detection Utilizing Existing CNNs For Smart Cities, Vinay Ponnaganti Dec 2020

Master's Projects

As governments and private companies alike race to achieve the vision of a smart city — where artificial intelligence (AI) technology is used to enable self-driving cars, cashier-less shopping experiences and connected home devices from thermostats to robot vacuum cleaners — advancements are being made in both software and hardware to enable increasingly real-time, accurate inference at the edge. One hardware solution adopted for this purpose is the LiDAR sensor, which utilizes infrared lasers to accurately detect and map its surroundings in 3D. On the software side, developers have turned to artificial neural networks to make predictions and recommendations with …


Detecting Deepfakes With Deep Learning, Eric C. Tjon Dec 2020

Master's Projects

Advances in generative models and manipulation techniques have given rise to digitally altered videos known as deepfakes. These videos are difficult to identify for both humans and machines. Typical detection methods exploit various imperfections in deepfake videos, such as inconsistent posing and visual artifacts. In this paper, we propose a pipeline with two distinct pathways for examining individual frames and video clips. The image pathway contains a novel architecture called Eff-YNet capable of both segmenting and detecting frames from deepfake videos. It consists of a U-Net with a classification branch and an EfficientNet B4 encoder. The video pathway implements a …


End-To-End Learning Utilizing Temporal Information For Vision-Based Autonomous Driving, Dapeng Guo Dec 2020

Master's Projects

End-to-end learning models trained with conditional imitation learning (CIL) have demonstrated their capabilities in driving autonomously in dynamic environments. The performance of such models, however, is limited, as most of them fail to utilize the temporal information that resides in a sequence of observations. In this work, we explore the use of temporal information with a recurrent network to improve driving performance. We propose a model that combines a pre-trained, deeper convolutional neural network to better capture image features with a long short-term memory network to better explore temporal information. Experimental results indicate that the proposed model achieves performance gain …


Multi-Agent Deep Reinforcement Learning For Walkers, Inhee Park Dec 2020

Master's Projects

This project was motivated by the search for an AI method that moves towards Artificial General Intelligence (AGI), that is, one more similar to the learning behavior of human beings. As of today, Deep Reinforcement Learning (DRL) is the closest to AGI among machine learning methods. To better understand DRL, we compare and contrast it with other related methods: Deep Learning, Dynamic Programming, and Game Theory.

We apply one of the state-of-the-art DRL algorithms, Proximal Policy Optimization (PPO), to robot walker locomotion, a simple yet challenging environment with an inherently continuous and high-dimensional state/action space.

The end goal of this project is …


Findfur: A Tool For Predicting Furin Cleavage Sites Of Viral Envelope Substrates, Christine Gu Dec 2020

Master's Projects

Most biologically active proteins of eukaryotic cells are initially synthesized in the secretory pathway as inactive precursors and require proteolytic processing to become functionally active. This process is performed by a specialized family of endogenous enzymes known as proprotein convertases (PCs). Within this family of proteases, the most notorious and well-researched is furin. Furin is found ubiquitously throughout the human body, and typical furin substrates are cleaved at sites composed of paired basic amino acids, specifically at the consensus sequence R-X-[K/R]-R↓. Furin is often exploited by pathogens, such as enveloped viruses, for proteolytic processing and maturation of their proteins. Glycoproteins of enveloped …
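
The consensus sequence quoted above, R-X-[K/R]-R↓, can be illustrated with a small motif scan. This is only a minimal sketch of matching the four-residue pattern in a one-letter amino acid string; it is not the FindFur method, and the function name `find_furin_sites` and the example peptide are hypothetical.

```python
import re
from typing import List, Tuple

# R-X-[K/R]-R: arginine, any residue, lysine or arginine, arginine.
# The lookahead lets overlapping motifs be reported as separate hits.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")

def find_furin_sites(sequence: str) -> List[Tuple[int, str]]:
    """Return (cleavage_index, motif) pairs for every consensus match.

    cleavage_index is the 0-based index of the residue immediately
    after the motif, i.e. where the arrow sits in R-X-[K/R]-R↓.
    """
    seq = sequence.upper()
    return [(m.start() + 4, m.group(1)) for m in FURIN_MOTIF.finditer(seq)]

# Hypothetical peptide, used only to exercise the function.
hits = find_furin_sites("AAARRKRSV")
print(hits)
```

A bare pattern match like this only flags candidate sites; a real predictor would presumably also weigh sequence context around the motif.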


Malware Classification With Gaussian Mixture Model-Hidden Markov Models, Jing Zhao Dec 2020

Master's Projects

Discrete hidden Markov models (HMM) are often applied to the malware detection and classification problems. However, the continuous analog of discrete HMMs, that is, Gaussian mixture model-HMMs (GMM-HMM), are rarely considered in the field of cybersecurity. In this study, we apply GMM-HMMs to the malware classification problem and we compare our results to those obtained using discrete HMMs. As features, we consider opcode sequences and entropy-based sequences. For our opcode features, GMM-HMMs produce results that are comparable to those obtained using discrete HMMs, whereas for our entropy-based features, GMM-HMMs generally improve on the classification results that we can attain with …
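
As a rough illustration of what an entropy-based sequence might look like, the sketch below computes Shannon entropy over sliding byte windows of a file. The abstract does not say how its entropy features were derived, so the windowing scheme, the default `window` and `step` values, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 0.0 for empty input."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_sequence(data: bytes, window: int = 256, step: int = 128) -> List[float]:
    """Entropy of each sliding window (assumed scheme, not the paper's).

    Inputs shorter than one window yield a single value computed over
    whatever bytes are available.
    """
    last_start = max(len(data) - window + 1, 1)
    return [shannon_entropy(data[i:i + window]) for i in range(0, last_start, step)]

# Toy input: two identical 4-byte windows with two equally likely symbols.
print(entropy_sequence(b"aabb" * 2, window=4, step=4))
```

A sequence of real-valued entropies like this is continuous rather than symbolic, which is the kind of observation a Gaussian mixture emission model is suited to.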


The Use Of Evidential Reasoning Model With Biomarkers In Pancreatic Cancer Prediction, Qianhui Fan Dec 2020

Master's Projects

In this project, an evidential reasoning model is built to amalgamate factors that could be used in early detection of pancreatic cancer. Our machine learning model outputs the probability that a given patient has pancreatic cancer based on various input variables. These variables include health history factors, such as smoking and medical history; technical artifacts, such as biopsy sequencing technology; and genomic biomarkers, such as mutational, transcriptional, and methylomic profiles, cfDNA, and copy number variation. The dataset used in this project is part of The Cancer Genome Atlas (TCGA) project and was collected from the National Cancer Institute (NCI) …


Visualization Of Large Networks Using Recursive Community Detection, Xinyuan Fan Dec 2020

Master's Projects

Networks show relationships between people or things. For instance, a person has a social network of friends, and websites are connected through a network of hyperlinks. Networks are most commonly represented as graphs, so graph drawing becomes significant for network visualization. An effective graph drawing can quickly reveal connections and patterns within a network that would be difficult to discern without visual aid. But graph drawing becomes a challenge for large networks. Ambiguous edge crossings are inevitable in large networks with numerous nodes and edges, and large graphs often become a complicated tangle of lines. These issues greatly reduce …


Malware Classification Using LSTMs, Dennis Dang Dec 2020

Master's Projects

Signature-based and anomaly-based detection have long been quintessential techniques used in malware detection. However, these techniques have become increasingly ineffective as malware becomes more complex. Researchers have therefore turned to deep learning to construct better-performing models. In this project, we create four different long short-term memory (LSTM) models and train each model to classify malware by family type. Our data consists of opcodes extracted from malware executables. We employ techniques used in natural language processing (NLP) such as word embedding and bidirectional LSTMs (biLSTM). We also use convolutional neural networks (CNN). We found that our model consisting of …


Bioinformatics Metadata Extraction For Machine Learning Analysis, Zachary Tom Dec 2020

Master's Projects

Next generation sequencing (NGS) has revolutionized the biological sciences. Today, entire genomes can be rapidly sequenced, enabling advancements in personalized medicine, genetic diseases, and more. The National Center for Biotechnology Information (NCBI) hosts the Sequence Read Archive (SRA) containing vast amounts of valuable NGS data. Recently, research has shown that sequencing errors in conventional NGS workflows are key confounding factors for detecting mutations. Various steps such as sample handling and library preparation can introduce artifacts that affect the accuracy of calling rare mutations. Thus, there is a need for more insight into the exact relationship between various steps of the …


Quantifying Deepfake Detection Accuracy For A Variety Of Natural Settings, Pratikkumar Prajapati Dec 2020

Master's Projects

Deepfakes are videos generated from a starting video of a person where that person's face has been swapped for someone else's. In this report, we describe our work to develop general, deep learning-based models to classify deepfake content. Our first experiments involved simple Convolutional Neural Network (CNN)-based models where we varied how individual frames from the source video were passed to the CNN. These simple models tended to give low accuracy, below 60%, when discriminating fake from non-fake videos. We then developed three more sophisticated models: one based on choosing test frames, one based on …


Evidence-Based Detection Of Pancreatic Canc, Rajeshwari Deepak Chandratre May 2020

Master's Projects

This study is an effort to develop a tool for early detection of pancreatic cancer using evidential reasoning. An evidential reasoning model predicts the likelihood of an individual developing pancreatic cancer by processing the outputs of a Support Vector Classifier (SVC) and other input factors such as smoking history, drinking history, sequencing reads, biopsy location, and family and personal health history. Certain features of the genomic data, along with the mutated gene sequences of pancreatic cancer patients, were obtained from the National Cancer Institute (NCI) Genomic Data Commons (GDC). This data was used to train the SVC. A prediction accuracy of ~85% …


Using Machine Learning To Optimize Predictive Models Used For Big Data Analytics In Various Sports Events, Akhil Kumar Gour May 2020

Master's Projects

In today’s world, data is growing in volume and variety day by day. Historical data can hence be leveraged to predict the likelihood of future events. This process of using statistical or other data to predict future outcomes is commonly termed predictive modelling. Predictive modelling is becoming increasingly important for several reasons. Mainly, it enables businesses and individual users to gain accurate insights and to decide on suitable actions for a profitable outcome.

Machine learning techniques are generally used in order …


Predicting Students’ Performance By Learning Analytics, Sandeep Subhash Madnaik May 2020

Master's Projects

The field of Learning Analytics (LA) has many applications in today’s technology-driven and online education. Learning Analytics is a multidisciplinary topic for learning purposes that uses machine learning, statistical, and visualization techniques [1]. We can harness academic performance data of various components in a course, along with the background data of each student (learner) and other features that might affect his/her academic performance. This collected data can then be fed to a system with the task of predicting the final academic performance of the student, e.g., the final grade. Moreover, it allows students to monitor and self-assess …


Detection Of Mild Cognitive Impairment Using Diffusion Compartment Imaging, Matthew Jones May 2020

Master's Projects

The result of applying the Neurite Orientation Density and Dispersion Index (NODDI) algorithm to improve the prediction accuracy for patients diagnosed with MCI is reported. Calculations were carried out using a collection of 68 patients (34 control and 34 with MCI) gathered from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Patient data includes high-resolution Magnetic Resonance Images as well as Diffusion Tensor Imaging. A Linear Regression accuracy of 83% was observed using the added NODDI summary statistic: Orientation Dispersion Index (ODI). A statistically significant difference in groups was found between control patients and patients with MCI with …


Probabilistic And Machine Learning Enhancement To CONN Toolbox, Gayathri Hanuma Ravali Kuppachi May 2020

Master's Projects

Clinical depression is a condition in which a person suffers from persistent and overwhelming sadness. Existing studies have shown that the brain networks of patients with clinical depression have an abnormal topological structure. Over the past decade, resting-state images of the brain have come under scrutiny. Specifically, analysis of brain topology using graph theory has found a strong connection in patients experiencing clinical depression. However, the methods used to analyze brain networks still have several unresolved issues. This paper attempts to give a …


Pattern Analysis And Prediction Of Mild Cognitive Impairment Using The CONN Toolbox, Meenakshi Anbukkarasu May 2020

Master's Projects

Alzheimer's is an irreversible neurodegenerative disorder characterized by progressive cognitive and memory decline. It has been reported that the prevalence of Alzheimer's will increase fourfold in a few years, at which point one in every 75 people will have this disorder. Hence, there is a critical need to diagnose Alzheimer's at an early stage to reduce the burden of the overall medical complications. The initial stage of Alzheimer’s is called mild cognitive impairment (MCI), and hence it is a good target for early diagnosis and treatment of Alzheimer's. This project focuses on integrating multiple imaging modalities …


Higher-Order Link Prediction Using Graph Embeddings, Neeraj Chavan May 2020

Master's Projects

Link prediction is an emerging field that predicts if two nodes in a network are likely to be connected or not in the near future. Networks model real-world systems using pairwise interactions of nodes. However, many of these interactions may involve more than two nodes or entities simultaneously. For example, social interactions often occur in groups of people, research collaborations are among more than two authors, and biological networks describe interactions of a group of proteins. An interaction that consists of more than two entities is called a higher-order structure. Predicting the occurrence of such higher-order structures helps us solve …


Rehearsal Scheduling Problem, Thuan Bao May 2020

Master's Projects

Scheduling is a common task that plays a crucial role in many industries such as manufacturing or servicing. In a competitive environment, effective scheduling is one of the key factors in reducing cost and increasing productivity. Therefore, scheduling problems have been studied by many researchers over the past thirty years. The rehearsal scheduling problem (RSP) is similar to the popular resource-constrained project scheduling problem (RCPSP); however, it does not have activity precedence constraints, and the resources’ availabilities are not fixed during processing time. RSP can be used to schedule rehearsals in the theatre industry or for group scheduling when each member …


Word Embedding Techniques For Malware Classification, Aniket Chandak May 2020

Master's Projects

Word embeddings are often used in natural language processing as a means to quantify relationships between words. More generally, these same word embedding techniques can be used to quantify relationships between features. In this paper, we conduct a series of experiments that are designed to determine the effectiveness of word embedding in the context of malware classification. First, we conduct experiments where hidden Markov models (HMM) are directly applied to opcode sequences. These results serve to establish a baseline for comparison with our subsequent word embedding experiments. We then experiment with word embedding vectors derived from HMMs— a technique that …


Detection And Analysis Of Malware Evolution, Sunhera Barunkumar Paul May 2020

Master's Projects

Malware is malicious software that causes disruption, allows access to unapproved resources, or performs other unauthorized activity. Developing effective malware detection techniques is a critical aspect of information security. One difficulty that arises is that malware often evolves over time, due to changing goals of malware developers, or to counter advances in detection. This evolution can occur through various modifications in malware code. To maintain effective malware detection, it is necessary to detect and analyze malware evolution so that appropriate countermeasures can be taken. We perform a variety of experiments to detect points in time where a malware family …


Sentiment Analysis For Troll Activity Detection On Sina Weibo, Zidong Jiang May 2020

Master's Projects

The impact of social media on the modern world is difficult to overstate. Virtually all companies and public figures have social media accounts on popular platforms such as Twitter and Facebook. In China, the micro-blogging service provider Sina Weibo is the most popular such service. To overcome negative publicity, Weibo trolls, the so-called Water Army, can be hired to post deceptive comments.

In recent years, troll detection and sentiment analysis have been studied, but we are not aware of any research that considers troll detection based on sentiment analysis. In this research, we focus on troll detection via sentiment …


Video Synthesis From The StyleGAN Latent Space, Lei Zhang May 2020

Master's Projects

Generative models have shown impressive results in generating synthetic images. However, video synthesis is still difficult to achieve, even for these generative models. The best videos that generative models can currently create are a few seconds long, distorted, and low resolution. For this project, I propose and implement a model to synthesize videos at 1024x1024x32 resolution that include human facial expressions by using static images generated from a Generative Adversarial Network trained on human facial images. To the best of my knowledge, this is the first work that generates realistic videos that are larger than 256x256 resolution from single …


Housing Market Crash Prediction Using Machine Learning And Historical Data, Parnika De May 2020

Master's Projects

The 2008 housing crisis was caused by faulty banking policies and the use of credit derivatives of mortgages for investment purposes. In this project, we look into datasets that serve as markers of a typical housing crisis. Using those datasets, we build three machine learning models: linear regression, a Hidden Markov Model, and Long Short-Term Memory. After building the models, we conducted a comparative study of the predictions made by each one. The linear regression model did not predict a housing crisis; instead, it showed that house prices would rise steadily, and the R-squared score of …


An AI For A Modification Of Dou Di Zhu, Xuesong Luo May 2020

Master's Projects

We describe our implementation of AIs for the Chinese game Dou Di Zhu. Dou Di Zhu is a three-player game played with a standard 52-card deck together with two jokers. One player acts as a landlord and has the advantage of receiving three extra cards; the other two players play as peasants. We designed and implemented a Deep Q-learning Neural Network (DQN) agent to play Dou Di Zhu. We also designed and built a pure Q-learning based agent as well as a Zhou rule-based agent to compare with our main agent. We show the …
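
For readers unfamiliar with the "pure Q-learning" baseline mentioned above, the following is a minimal sketch of the standard tabular Q-learning update. It is not the project's agent: the state and action labels, the table layout, and the `alpha` and `gamma` defaults are hypothetical placeholders chosen only to show the update rule.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]

def q_update(q: QTable, state, action, reward: float, next_state,
             next_actions: Iterable, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning step and return the new Q(state, action).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    An empty next_actions marks a terminal state, whose value is taken as 0.
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Hypothetical states and actions; unseen entries default to 0.0.
q: QTable = defaultdict(float)
print(q_update(q, "s0", "play", 1.0, "s1", ["pass", "play"], alpha=0.5))
```

A DQN replaces the lookup table with a neural network that approximates the same action values, which matters when the state space is too large to enumerate.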


Computational Astronomy: Classification Of Celestial Spectra Using Machine Learning Techniques, Gayatri Milind Hungund May 2020

Master's Projects

Light-years beyond planet Earth lie many unknown and unexplored stars and galaxies that need to be studied in order to support the Big Bang Theory and to make important astronomical discoveries in the quest to know the unknown. Sophisticated devices and high-power computational resources are now deployed to make a positive effort towards data gathering and analysis. These devices produce massive amounts of data from astronomical surveys, usually terabytes or petabytes. It is exhausting to process this data and determine the findings in a short period of time. Many details can be missed …


Using Deep Learning And Linguistic Analysis To Predict Fake News Within Text, John Nguyen May 2020

Master's Projects

The spread of information about current events is a way for everybody to learn and understand what is happening in the world. In essence, the news is an important and powerful tool that could be used by various groups of people to spread awareness and facts for the good of mankind. However, as information becomes easily and readily available for public access, the rise of deceptive news becomes an increasing concern, because it can mislead people and thus affect their livelihood or that of others. The …


Yoga Pose Classification Using Deep Learning, Shruti Kothari May 2020

Master's Projects

Human pose estimation is a deep-rooted problem in computer vision that has exposed many challenges in the past. Analyzing human activities is beneficial in many fields, such as video surveillance, biometrics, assisted living, and at-home health monitoring. With our fast-paced lives these days, people usually prefer exercising at home but feel the need for an instructor to evaluate their exercise form. As these resources are not always available, human pose recognition can be used to build a self-instruction exercise system that allows people to learn and practice exercises correctly by themselves. This project lays the foundation for building such a system …