Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 34

Full-Text Articles in Physical Sciences and Mathematics

Task Classification During Visual Search Using Classic Machine Learning And Deep Learning, Devangi Vilas Chinchankar Dec 2021

Master's Projects

In an average human life, the eyes not only passively scan visual scenes, but most times end up actively performing tasks including, but not limited to, searching, comparing, and counting. As a result of the advances in technology, we are observing a boost in the average screen time. Humans are now looking at an increasing number of screens and in turn images and videos. Understanding what scene a user is looking at and what type of visual task is being performed can be useful in developing intelligent user interfaces, and in virtual reality and augmented reality devices. In this research, …


The Impact Of Programming Language’s Type On Probabilistic Machine Learning Models, Sherif Elsaid Dec 2021

Master's Projects

Software development is an expensive and difficult process. Mistakes are easily made, and without an extensive review process, those mistakes can make it into production code and may have unintended, disastrous consequences.

This is why various automated code review services have arisen in recent years, from AWS’s CodeGuru and Microsoft’s Code Analysis to more integrated code assistants, like IntelliCode and auto completion tools, all of which are designed to assist developers with their work and help catch overlooked bugs.

Thanks to recent advances in machine learning, these services have grown tremendously in sophistication to …


Privacy Preserving For Multiple Computer Vision Tasks, Amala Varghese Wilson Dec 2021

Master's Projects

Privacy-preserving visual recognition is an important area of research that is gaining momentum in the field of computer vision. In a production environment, it is critical to have neural network models learn continually from user data. However, sharing raw user data with a server is less desirable from a regulatory, security and privacy perspective. Federated learning addresses the problem of privacy-preserving visual recognition. More specifically, we closely examine and dissect a framework known as Dual User Adaptation (DUA) presented by Lange et al. at CVPR 2020, due to its novel idea of bringing about user-adaptation on both the server-side …


Nitrogenase Iron Protein Classification Using CNN Neural Network, Amer Rez Dec 2021

Master's Projects

The nitrogenase iron protein (NifH) is extensively used to study nitrogen fixation, the ecologically vital process of reducing atmospheric nitrogen to a bioavailable form. The discovery rate of novel NifH sequences is high, and there is an ongoing need for software tools to mine NifH records from the GenBank repository. Since record annotations are unreliable, because they contain errors, classifiers based on sequence alone are required. The ARBitrator classifier is highly successful but must be initialized by extensive manual effort. A Deep Learning approach could substantially reduce manual intervention. However, attempts to build a character-based Deep Learning NifH classifier were …


Predicting Stocks With LSTM-Based DRNN And GAN, Duy Ngo Dec 2021

Master's Projects

Trading equities can be very lucrative for some and a gamble for others. Professional traders and retail traders are constantly amassing information to be a step ahead of the market to profit off the value of stocks on the market. Some of the tools in their arsenal include different types of calculations based on a variety of data collected on a stock. Technical analysis is a technique for traders to analyze the data of equities presented on charts. Often, the way the price changes over time can be used as an indicator for traders to predict how future prices will …


An Open Source Direct Messaging And Enhanced Recommendation System For Yioop, Aniruddha Dinesh Mallya Dec 2021

Master's Projects

Recommendation systems and direct messaging systems are two popular components of web portals. A recommendation system is an information filtering system that seeks to predict the "rating" or "preference" a user would give to an item, and a direct messaging system allows private communication between users of a platform. Yioop is an open source PHP search engine and web portal that can be configured to allow users to create discussion groups, blogs, wikis, etc.

In this project, we expanded on Yioop’s group system so that every user now has a personal group. Personal groups were then used to add user …


Employee Churn Prediction Using Logistic Regression And Support Vector Machine, Rajendra Maharjan Dec 2021

Master's Projects

It is more challenging for a Human Resources (HR) team to retain existing employees than to hire new ones. For any company, losing valuable employees is a loss in terms of time, money, productivity, and trust. This loss could be minimized if HR could identify in advance the employees who are planning to quit their jobs. Hence, we investigated the employee churn problem from a machine learning perspective. We have designed machine learning models using supervised, classification-based algorithms such as Logistic Regression and Support Vector Machine (SVM). The models are trained with the IBM …


Identifying Bots On Twitter With Benford’s Law, Sanmesh Bhosale Dec 2021

Master's Projects

Over time Online Social Networks (OSNs) have grown exponentially in terms of active users and have now become an influential factor in the formation of public opinions. Due to this, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. The biggest example of this was during the 2016 American Presidential Elections, where Russian bots on Twitter pumped out fake news to influence the election results.

Identifying bots and botnets on Twitter is not just based on visual analysis and can require complex statistical methods to score a profile based on multiple features and …
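For readers unfamiliar with the law named in this title: Benford's Law predicts that the leading digit d of many naturally occurring counts appears with probability log10(1 + 1/d). The following is a minimal sketch of how a first-digit distribution could be compared against that expectation. It is not the method used in the project, whose details are truncated above. The function names, the choice of a chi-square-style distance, and the restriction to integer counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a nonzero integer (hypothetical helper)."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n


def first_digit_distribution(values):
    """Observed first-digit frequencies for a list of integer counts.

    Zeros are skipped because they have no leading digit in 1..9.
    """
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


def chi_square_distance(observed, expected):
    """Chi-square-style distance between two first-digit distributions.

    Larger values mean the observed digits deviate more from Benford's Law.
    """
    return sum(
        (observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10)
    )
```

A caller might pass a list of per-account counts (for example, the follower counts of an account's friends, which is an assumed feature here) to `first_digit_distribution` and compare the result with `benford_expected()` using `chi_square_distance`. Where to set a threshold on that distance would have to be determined empirically.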


Analysis Of Camera Trap Footage Through Subject Recognition, Nirnayak Bhardwaj Dec 2021

Master's Projects

Motion-sensitive cameras, otherwise known as camera traps, have become increasingly popular amongst ecologists for studying wildlife. These cameras allow scientists to remotely observe animals through an inexpensive and non-invasive approach. Due to the lenient nature of motion cameras, studies involving them often generate excessive amounts of footage with many photographs not containing any animal subjects. Thus, there is a need for a system that is capable of analyzing camera trap footage to determine if a picture holds value for researchers. While research into automated image recognition is well documented, it has had limited applications in the field of ecology. This …


Statistical Potentials For RNA-Protein Interactions Optimized By CMA-ES, Takayuki Kimura, Nobuaki Yasuo, Masakazu Sekijima, Brooke Lustig Oct 2021

Faculty Research, Scholarly, and Creative Activity

Characterizing RNA-protein interactions remains an important endeavor, complicated by the difficulty in obtaining the relevant structures. Evaluating model structures via statistical potentials is in principle straightforward and effective. However, given the relatively small size of the existing learning set of RNA-protein complexes, optimization of such potentials continues to be problematic. Notably, interaction-based statistical potentials have problems in addressing large RNA-protein complexes. In this study, we adopted a novel strategy with covariance matrix adaptation evolution strategy (CMA-ES) to calculate statistical potentials, successfully identifying native docking poses.


Computer-Aided Diagnosis Of Low Grade Endometrial Stromal Sarcoma (LGESS), Xinxin Yang, Mark Stamp Sep 2021

Faculty Research, Scholarly, and Creative Activity

Low grade endometrial stromal sarcoma (LGESS) accounts for about 0.2% of all uterine cancer cases. Approximately 75% of LGESS patients are initially misdiagnosed with leiomyoma, which is a type of benign tumor, also known as fibroids. In this research, uterine tissue biopsy images of potential LGESS patients are preprocessed using segmentation and stain normalization algorithms. We then apply a variety of classic machine learning and advanced deep learning models to classify tissue images as either benign or cancerous. For the classic techniques considered, the highest classification accuracy we attain is about 0.85, while our best deep learning model achieves an …


Clickbait Detection In YouTube Videos, Ruchira Gothankar May 2021

Master's Projects

YouTube videos often include captivating descriptions and intriguing thumbnails designed to increase the number of views, and thereby increase the revenue for the person who posted the video. This creates an incentive for people to post clickbait videos, in which the content might deviate significantly from the title, description, or thumbnail. In effect, users are tricked into clicking on clickbait videos. In this research, we consider the challenging problem of detecting clickbait YouTube videos. We experiment with logistic regression, random forests, and multilayer perceptrons, based on a variety of textual features. We obtain a maximum accuracy in excess of 94%.


Keystroke Dynamics Based On Machine Learning, Han-Chih Chang May 2021

Master's Projects

The development of active and passive biometric authentication and identification technology plays an increasingly important role in cybersecurity. Biometrics that utilize features derived from keystroke dynamics have been studied in this context. Keystroke dynamics can be used to analyze the way that a user types by monitoring various keyboard inputs. Previous work has considered the feasibility of user authentication and classification based on keystroke features. In this research, we analyze a wide variety of machine learning and deep learning models based on keystroke-derived features, we optimize the resulting models, and we compare our results to those obtained in related research. …


Image-Based Real Estate Appraisal Using CNNs And Ensemble Learning, Prathamesh Dnyanesh Kumkar May 2021

Master's Projects

Real Estate Appraisal is performed to evaluate properties during a range of activities like buying, selling, mortgaging, or insuring. Traditionally, this process is done by real estate brokers who consider factors like the location of a house, its area, the number of bedrooms and bathrooms, along with other amenities to assess the property. This approach is quite subjective since different brokers may arrive at a different quote for the same property depending on their analysis. The development in machine learning algorithms has given rise to several Automated Valuation Models (AVMs) to estimate real estate prices. Real estate websites use such …


Presentation Attack Detection In Facial Biometric Authentication, Hardik Kumar May 2021

Master's Projects

Biometric systems are systems that recognize an individual, or specifically a characteristic, using biometric data and mathematical algorithms. They are widely employed in various organizations and companies, mostly as authentication systems. Biometric authentication systems are usually much more secure than classic ones; however, they also have some loopholes. Presentation attacks are attacks that spoof the biometric systems or sensors. The presentation attacks covered in this project are photo attacks and deepfake attacks. In the case of photo attacks, it is observed that an interactive action check like Eye Blinking proves efficient in …


Classifying Illegal Advertisements On The Darknet Using NLP, Karan Shashin Shah May 2021

Master's Projects

The Darknet has become a place to conduct various illegal activities, like child labor, contract murder, and drug selling, while staying anonymous. Traditionally, international and government agencies try to control these activities, but most of those actions are manual and time-consuming. Recently, various researchers developed Machine Learning (ML) approaches to aid in the process of detecting illegal activities. The above problem can benefit from different Natural Language Processing (NLP) techniques. More specifically, researchers have used various classical topic modeling techniques like bag of words, N-grams, Term Frequency, and Term Frequency Inverse Document Frequency (TF-IDF) to represent features and train machine …


Hidden Markov Model-Based Clustering For Malware Classification, Shamli Singh May 2021

Master's Projects

Automated techniques to classify malware samples into their respective families are critical in cybersecurity. Previous research applied k-means clustering to scores generated by hidden Markov models (HMMs) as a means of dealing with the malware classification problem. In this research, we follow a somewhat similar approach, but instead of using HMMs to generate scores, we directly cluster the HMMs themselves. We obtain good results on a challenging malware dataset.
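To illustrate the clustering step that the abstract attributes to previous research, here is a minimal sketch of k-means applied to score vectors. It assumes each malware sample has already been reduced to a list of numeric scores (for example, one log-likelihood per trained HMM); producing those scores is not shown. The function names and the deterministic initialization from the first k points are choices made for illustration, not details taken from the project, and the project's own approach of clustering the HMMs themselves is not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Plain k-means over a list of score vectors.

    Centroids start at the first k points (an assumption that keeps the
    sketch deterministic; random restarts are more common in practice).
    Returns (labels, centroids).
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(scores, k)` with one score vector per sample returns a cluster label for each sample. Whether the clusters line up with malware families depends on how well the scores separate the families.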


Malware Analysis With Auxiliary-Classifier GAN, Rakesh Nagaraju May 2021

Master's Projects

A generative adversarial network (GAN) is a powerful machine learning concept where both a generative and discriminative model are trained simultaneously. A recent trend in malware research consists of treating executables as images and employing image-based analysis techniques. In this research, we generate fake malware images using GANs, and we also consider the effectiveness of GANs for malware classification. Specifically, we consider auxiliary classifier GAN (AC-GAN), which enables us to work with multiclass data. We find that AC-GAN generates malware images that cannot be reliably distinguished from real malware images. In addition, we find that the detection capabilities of AC-GAN …


Keystroke Dynamics For User Authentication With Fixed And Free Text, Jianwei Li May 2021

Master's Projects

YouTube videos often include captivating descriptions and intriguing thumbnails designed to increase the number of views, and thereby increase the revenue for the person who posted the video. This creates an incentive for people to post clickbait videos, in which the content might deviate significantly from the title, description, or thumbnail. In effect, users are tricked into clicking on clickbait videos. In this research, we consider the challenging problem of detecting clickbait YouTube videos. We experiment with multiple state of the art machine learning techniques and a variety of textual features.


Computer-Aided Diagnosis Of Low Grade Endometrial Stromal Sarcoma (LGESS), Xinxin Yang May 2021

Master's Projects

Low grade endometrial stromal sarcoma (LGESS) is a rare form of cancer, accounting for about 0.2% of all uterine cancer cases. Approximately 75% of LGESS patients are initially misdiagnosed with leiomyoma, which is a type of benign tumor that is also known as fibroids. In this research, uterine tissue biopsy images of potential LGESS patients are preprocessed using segmentation and staining normalization algorithms. A wide variety of classic machine learning and leading deep learning models are then applied to classify tissue images as either benign or cancerous. For classic techniques, the highest classification accuracy we attain is 85%, while our …


Machine Learning To Detect Malware Evolution, Lolitha Sresta Tupadha May 2021

Master's Projects

Malware evolves over time and anti-virus must adapt to such evolution. Hence, it is critical to detect those points in time where malware has evolved so that appropriate countermeasures can be undertaken. In this research, we perform a variety of experiments to determine when malware evolution is likely to have occurred. All of the evolution detection techniques that we consider are based on machine learning and can be fully automated; in particular, no reverse engineering or other labor-intensive manual analysis is required. Specifically, we consider analysis based on hidden Markov models and various word embedding techniques, among other machine learning based …


Malware Classification With BERT, Joel Lawrence Alvares May 2021

Master's Projects

Malware Classification is used to distinguish unique types of malware from each other.

This project aims to carry out malware classification using word embeddings, which are used in Natural Language Processing (NLP) to identify and evaluate the relationship between words of a sentence. Word embeddings are generated by BERT and Word2Vec for malware samples to carry out multi-class classification. BERT is a transformer-based pre-trained NLP model which can be used for a wide range of tasks such as question answering, paraphrase generation and next sentence prediction. However, the attention mechanism of a pre-trained BERT model can …


Fake Malware Classification With CNN Via Image Conversion: A Game Theory Approach, Yash Sahasrabuddhe May 2021

Master's Projects

Improvements in malware detection techniques have grown significantly over the past decade. These improvements have resulted in better security for systems from various forms of malware attacks. However, it is also the reason for continuous evolution of malware which makes it harder for current security mechanisms to detect them. Hence, there is a need to understand different malwares and study classification techniques using the ever-evolving field of machine learning. The goal of this research project is to identify similarities between malware families and to improve on classification of malwares within different malware families by implementing Convolutional Neural Networks (CNNs) on …


Translating Natural Language Queries To SPARQL, Shreya Satish Bhajikhaye May 2021

Master's Projects

The Semantic Web is an extensive knowledge base that contains facts in the form of RDF triples. These facts are not easily accessible to the average user because to use them requires an understanding of ontologies and a query language like SPARQL. Question answering systems form a layer of abstraction on linked data to overcome these issues. These systems allow the user to input a question in a natural language and receive the equivalent SPARQL query. The user can then execute the query on the database to fetch the desired results. The standard techniques involved in translating natural language questions …


Detecting And Predicting Visual Affordance Of Objects In A Given Environment, Bhumika Kaur Matharu May 2021

Master's Projects

The rapid growth of the development of autonomous robots is transforming the manufacturing and healthcare industry in many ways, but they still face many challenges. One of the challenges experienced by autonomous robots is their inability to manipulate an unknown object without human supervision. One way through which autonomous robots can manipulate an unknown object is affordance learning [1]. Affordance describes the action a user can perform on the object in given surroundings. This report describes our proposed model to detect and predict the affordance of an object from videos by leveraging the spatial-temporal feature extraction through ConvLSTM and Fully …


A Hybrid Gaze Pointer With Voice Control, Indhuja Ravi May 2021

Master's Projects

Accessibility in technology has been a challenge since the beginning of the 1800s. Starting with building typewriters for the blind by Pellegrino Turri to the on-screen keyboard built by Microsoft, there have been several advancements towards assistive technologies. The basic tools necessary for anyone to operate a computer are to be able to navigate the device, input information, and perceive the output. All these three categories have been undergoing tremendous advancements over the years. Especially, with the internet boom, it has now become a necessity to point onto a computer screen. This has somewhat attracted research into this particular area. …


Visual And Lingual Emotion Recognition Using Deep Learning Techniques, Akshay Kajale May 2021

Master's Projects

Emotion recognition has been an integral part of many applications like video games, cognitive computing, and human computer interaction. Emotion can be recognized from many sources including speech, facial expressions, hand gestures and textual attributes. We have developed a prototype emotion recognition system using computer vision and natural language processing techniques. Our hybrid system uses mobile camera frames and speech features known as Mel Frequency Cepstral Coefficients (MFCC) to recognize the emotion of a person. To recognize emotions based on facial expressions, we have developed a Convolutional Neural Network (CNN) model, which has an accuracy of 68%. …


Machine Learning Using Serverless Computing, Vidish Naik May 2021

Master's Projects

Machine learning has been trending in the domain of computer science for quite some time. Newer and newer models and techniques are being developed every day. The adoption of cloud computing has only expedited the process of training machine learning. With its variety of services, cloud computing provides many options for training machine learning models. Leveraging these services is up to the user. Serverless computing is an important service offered by cloud service providers. It is useful for short tasks that are event-driven or periodic. Machine learning training can be divided into short tasks or batches to take advantage of …


Automating Text Encapsulation Using Deep Learning, Anket Sah May 2021

Master's Projects

Data is an important aspect in any form be it communication, reviews, news articles, social media data, machine or real-time data. With the emergence of Covid-19, a pandemic seen like no other in recent times, information is being poured in from all directions on the internet. At times it is overwhelming to determine which data to read and follow. Another crucial aspect is separating factual data from distorted data that is being circulated widely. The title or short description of this data can play a key role. Many times, these descriptions can deceive a user with unwanted information. The user …


American Sign Language Assistant, Charulata Lodha May 2021

Master's Projects

We implement a prototype computer vision system to help the deaf and mute communicate in a shopping setting. Our system uses live video feeds to recognize American Sign Language (ASL) gestures and notify shop clerks of deaf and mute patrons’ intents. It generates a video dataset in the Unity Game Engine of 3D humanoid models in a shop setting performing ASL signs. Our system uses OpenPose to detect and recognize the bone points of the human body from the live feed. The system then represents the motion sequences as high dimensional skeleton joint point trajectories followed by a time-warping …