Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

Machine Learning-Based Anomaly Detection In Cloud Virtual Machine Resource Usage, Tarun Mourya Satveli Jan 2023

Master's Projects

Anomaly detection is an important activity in cloud computing systems because it helps identify unusual behaviours or actions that may result in software glitches, security breaches, and performance problems. Detecting aberrant resource utilization trends in virtual machines (VMs) is a typical application of anomaly detection in cloud computing. Currently, the most serious cyber threat is distributed denial-of-service attacks. Such attacks consume the targeted server's resources and network resources, such as bandwidth and buffer size, restricting the server's capacity to serve legitimate customers.

To recognize attacks and common occurrences, machine learning techniques such as …


Classification Of Darknet Traffic By Application Type, Shruti Sharma Jan 2023

Master's Projects

The darknet is frequently exploited for illegal purposes and activities, which makes darknet traffic detection an important security topic. Previous research has focused on various classification techniques for darknet traffic using machine learning and deep learning. We extend previous work by considering the effectiveness of a wide range of machine learning and deep learning techniques for the classification of darknet traffic by application type. We consider the CICDarknet2020 dataset, which has been used in many previous studies, thus enabling a direct comparison of our results to previous work. We find that XGBoost performs the best among the classifiers that we …


Federated Learning For Protecting Medical Data Privacy, Abhishek Reddy Punreddy Jan 2023

Master's Projects

Deep learning is one of the most advanced machine learning techniques, and its prominence has increased in recent years. Language processing, predictions in medical research, and pattern recognition are a few of the numerous fields in which it is widely utilized. Numerous modern medical applications benefit greatly from the implementation of machine learning (ML) models and the disruptive innovations in the entire modern health care system. It is extensively used for constructing accurate and robust statistical models from large volumes of medical data collected from a variety of sources in contemporary healthcare systems [1]. Due to privacy concerns that restrict access …


Application Of Adversarial Attacks On Malware Detection Models, Vaishnavi Nagireddy Jan 2023

Master's Projects

Malware detection is vital as it ensures that a computer is safe from any kind of malicious software that puts users at risk. New variants of malicious software are being introduced every day at an increasing rate. Thus, to guarantee the security of computer systems, huge advancements are being made in the field of malware detection, and one such approach is to use machine learning for malware detection. Even though machine learning is very powerful, it is prone to adversarial attacks. In this project, we will try to apply adversarial attacks on malware detection models. To perform these attacks, fake samples that …


A Novel Handover Method Using Destination Prediction In 5G-V2X Networks, Pooja Shyamsundar Jan 2022

Master's Projects

This paper proposes a novel approach to handover optimization in fifth-generation vehicular networks. A key principle in designing fifth-generation vehicular network technology is continuous connectivity. This makes it important to ensure that there are no gaps in communication for mobile user equipment. Handovers can cause disruption in connectivity as the process involves switching from one base station to another. Issues in the handover process include poor load management for moving traffic resulting in low bandwidth or connectivity gaps, too many hops resulting in multiple unnecessary handovers, short dwell times and ineffective base station selection resulting in delays and …


Caption And Image Based Next-Word Auto-Completion, Meet Patel Jan 2022

Master's Projects

With the increasing number of choices now available to users in terms of entities like products, movies, and songs, users try to save time by looking for an application or system that provides automatic recommendations. Recommender systems are automated computing processes that leverage concepts of Machine Learning, Data Mining and Artificial Intelligence towards generating product recommendations based on a user’s preferences. These systems have given a significant boost to businesses across multiple segments as a result of reduced human intervention. One similar aspect of this is content writing. It would save users a lot of time …


Graph Neural Networks For Malware Classification, Vrinda Malhotra Jan 2022

Master's Projects

Malware is a growing threat to the digital world. The first step to managing this threat is malware detection and classification. While traditional techniques rely on static or dynamic analysis of malware, the generation of these features requires expert knowledge. Function call graphs (FCGs) consist of program functions as their nodes and their interprocedural calls as their edges, providing a wealth of knowledge that can be utilized to classify malware without feature extraction that requires experts. This project treats malware classification as a graph classification problem, setting node features using the Local Degree Profile (LDP) model and using different graph …
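As an illustration of the node-feature step mentioned above, the following is a minimal sketch of Local Degree Profile features as they are commonly described: each node gets its own degree plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees. The abstract is truncated before giving details, so this is not the project's implementation. The function name `local_degree_profile`, the toy `call_graph`, and the choice to treat the call graph as undirected are all assumptions made for the example.

```python
import statistics


def local_degree_profile(adjacency):
    """Return a 5-tuple of LDP features for every node.

    adjacency maps each node to a list of its neighbors and is assumed
    to be symmetric (an undirected view of the call graph).
    """
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    features = {}
    for node, neighbors in adjacency.items():
        neighbor_degrees = [degree[n] for n in neighbors]
        if neighbor_degrees:
            features[node] = (
                degree[node],
                min(neighbor_degrees),
                max(neighbor_degrees),
                statistics.mean(neighbor_degrees),
                statistics.pstdev(neighbor_degrees),  # population std. dev.
            )
        else:
            # Isolated node: no neighbor statistics are available.
            features[node] = (0, 0, 0, 0.0, 0.0)
    return features


# Hypothetical four-function call graph, stored symmetrically.
call_graph = {
    "main": ["parse", "run"],
    "parse": ["main"],
    "run": ["main", "log"],
    "log": ["run"],
}

ldp = local_degree_profile(call_graph)
```

Because these features depend only on graph structure, they can be computed for any function call graph without expert-designed features, which matches the motivation given in the abstract.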


Task Classification During Visual Search Using Classic Machine Learning And Deep Learning, Devangi Vilas Chinchankar Dec 2021

Master's Projects

In an average human life, the eyes not only passively scan visual scenes, but most times end up actively performing tasks including, but not limited to, searching, comparing, and counting. As a result of the advances in technology, we are observing a boost in the average screen time. Humans are now looking at an increasing number of screens and in turn images and videos. Understanding what scene a user is looking at and what type of visual task is being performed can be useful in developing intelligent user interfaces, and in virtual reality and augmented reality devices. In this research, …


Analysis Of Camera Trap Footage Through Subject Recognition, Nirnayak Bhardwaj Dec 2021

Master's Projects

Motion-sensitive cameras, otherwise known as camera traps, have become increasingly popular amongst ecologists for studying wildlife. These cameras allow scientists to remotely observe animals through an inexpensive and non-invasive approach. Due to the lenient nature of motion cameras, studies involving them often generate excessive amounts of footage with many photographs not containing any animal subjects. Thus, there is a need for a system that is capable of analyzing camera trap footage to determine if a picture holds value for researchers. While research into automated image recognition is well documented, it has had limited applications in the field of ecology. This …


Identifying Bots On Twitter With Benford’s Law, Sanmesh Bhosale Dec 2021

Master's Projects

Over time Online Social Networks (OSNs) have grown exponentially in terms of active users and have now become an influential factor in the formation of public opinions. Due to this, the use of bots and botnets for spreading misinformation on OSNs has become a widespread concern. The biggest example of this was during the 2016 American Presidential Elections, where Russian bots on Twitter pumped out fake news to influence the election results.

Identifying bots and botnets on Twitter is not just based on visual analysis and can require complex statistical methods to score a profile based on multiple features and …
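As background on the law named in the title, Benford's law states that the leading digit d of many naturally occurring quantities appears with probability log10(1 + 1/d). The sketch below computes that expected distribution and a chi-square style deviation from it. The abstract is truncated before the method is described, so which profile quantities the project examines is not known here; the function names and the `sample_counts` data (powers of two, which are known to follow the law closely) are illustrative assumptions.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_distribution(values):
    """Observed leading-digit frequencies among the positive integers given."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    if not digits:
        raise ValueError("need at least one positive value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def chi_square_statistic(observed, expected, n):
    """Pearson chi-square statistic for n samples; larger means a worse fit."""
    return sum(
        n * (observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10)
    )


# Hypothetical data standing in for some per-account count.
sample_counts = [2 ** k for k in range(1, 101)]
observed = first_digit_distribution(sample_counts)
score = chi_square_statistic(observed, benford_expected(), len(sample_counts))
```

A profile whose counts deviate strongly from the expected distribution would receive a large score; how such a score is thresholded or combined with other features is left open here.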


Analytical Models For Traffic Congestion And Accident Analysis, Hongrui Liu, Rahul Ramachandra Shetty Nov 2021

Mineta Transportation Institute Publications

In the US, over 38,000 people die in road crashes each year, and 2.35 million are injured or disabled, according to the statistics report from the Association for Safe International Road Travel (ASIRT) in 2020. In addition, traffic congestion keeping Americans stuck on the road wastes millions of hours and billions of dollars each year. Using statistical techniques and machine learning algorithms, this research developed accurate predictive models for traffic congestion and road accidents to increase understanding of the complex causes of these challenging issues. The research used US Accidents data consisting of 49 variables describing 4.2 million accident records …


Image-Based Real Estate Appraisal Using CNNs And Ensemble Learning, Prathamesh Dnyanesh Kumkar May 2021

Master's Projects

Real Estate Appraisal is performed to evaluate properties during a range of activities like buying, selling, mortgaging, or insuring. Traditionally, this process is done by real estate brokers who consider factors like the location of a house, its area, the number of bedrooms and bathrooms, along with other amenities to assess the property. This approach is quite subjective since different brokers may arrive at different quotes for the same property depending on their analysis. Developments in machine learning algorithms have given rise to several Automated Valuation Models (AVMs) to estimate real estate prices. Real estate websites use such …


Evidence-Based Detection Of Pancreatic Canc, Rajeshwari Deepak Chandratre May 2020

Master's Projects

This study is an effort to develop a tool for early detection of pancreatic cancer using evidential reasoning. An evidential reasoning model predicts the likelihood of an individual developing pancreatic cancer by processing the outputs of a Support Vector Classifier (SVC) and other input factors such as smoking history, drinking history, sequencing reads, biopsy location, and family and personal health history. Certain features of the genomic data, along with the mutated gene sequences of pancreatic cancer patients, were obtained from the National Cancer Institute (NCI) Genomic Data Commons (GDC). This data was used to train the SVC. A prediction accuracy of ~85% …


Computational Astronomy: Classification Of Celestial Spectra Using Machine Learning Techniques, Gayatri Milind Hungund May 2020

Master's Projects

Light-years beyond planet Earth, there exist many unknown and unexplored stars and galaxies that need to be studied in order to support the Big Bang Theory and to make important astronomical discoveries in the quest to know the unknown. Sophisticated devices and high-power computational resources are now deployed for data gathering and analysis. These devices produce massive amounts of data from astronomical surveys, usually in the range of terabytes or petabytes. It is exhausting to process this data and determine the findings in a short period of time. Many details can be missed …


Network Traffic Based Botnet Detection Using Machine Learning, Anand Ravindra Vishwakarma May 2020

Master's Projects

The field of information and computer security is developing rapidly, as new security risks are discovered every day. The moment new software or a new product is launched in the market, a new exploit or vulnerability is exposed and exploited by attackers or malicious users for different motives. Many attacks are distributed in nature and carried out by botnets that cause widespread disruption of network activity by carrying out DDoS (Distributed Denial of Service) attacks, email spamming, click fraud, information and identity theft, virtual deceit and distributed resource usage for cryptocurrency mining. …


Information Extraction From Biomedical Text Using Machine Learning, Deepti Garg Dec 2019

Master's Projects

Inadequate experimental drug data and the use of unlicensed drugs may cause adverse drug reactions, especially in pediatric populations. Every year the U.S. Food and Drug Administration approves human prescription drugs for marketing. The labels associated with these drugs include information about clinical trials and drug response in the pediatric population. In order for doctors to make an informed decision about the safety and effectiveness of these drugs for children, there is a need to analyze complex and often unstructured drug labels. In this work, first, an exploratory analysis of drug labels using a Natural Language Processing pipeline is performed. Second, …


Classifying Classic Ciphers Using Machine Learning, Nivedhitha Ramarathnam Krishna May 2019

Master's Projects

We consider the problem of identifying the classic cipher that was used to generate a given ciphertext message. We assume that the plaintext is English and we restrict our attention to ciphertext consisting only of alphabetic characters. Among the classic ciphers considered are the simple substitution cipher, Vigenère cipher, Playfair cipher, and column transposition cipher. The problem of classification is approached in two ways. The first method uses support vector machines (SVM) trained directly on ciphertext to classify the ciphers. In the second approach, we train hidden Markov models (HMM) on each ciphertext message, then use these trained HMMs as features …
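To make the setting concrete, here is a minimal sketch of one of the ciphers listed (Vigenère) together with two simple statistics of alphabetic ciphertext that could serve as classifier inputs. The abstract does not say how ciphertext is encoded for the SVM, so the letter-frequency vector and the index of coincidence below are illustrative assumptions, not the authors' feature set, and the function names are hypothetical.

```python
import string
from collections import Counter


def vigenere_encrypt(plaintext, key):
    """Encrypt uppercase A-Z plaintext with an uppercase A-Z key."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


def letter_frequencies(text):
    """26-dimensional vector of relative letter frequencies (A..Z)."""
    counts = Counter(text)
    return [counts.get(c, 0) / len(text) for c in string.ascii_uppercase]


def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are equal."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


ciphertext = vigenere_encrypt("ATTACKATDAWN", "LEMON")
features = letter_frequencies(ciphertext) + [index_of_coincidence(ciphertext)]
```

A simple substitution preserves the plaintext's index of coincidence, while a polyalphabetic cipher such as Vigenère flattens the letter distribution and lowers it, which is one reason statistics of this kind can help separate cipher types.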


Emulation Vs Instrumentation For Android Malware Detection, Anukriti Sinha May 2019

Master's Projects

In resource-constrained devices, malware detection is typically based on offline analysis using emulation. In previous work it has been claimed that such emulation fails for a significant percentage of Android malware because well-designed malware detects that the code is being emulated. An alternative to emulation is malware analysis based on code that is executing on an actual Android device. In this research, we collect features from a corpus of Android malware using both emulation and on-phone instrumentation. We train machine learning models based on emulated features and also train models based on features collected via instrumentation, and we compare …


Support Vector Machines For Image Spam Analysis, Aneri Chavda, Katerina Potika, Fabio Di Troia, Mark Stamp Jan 2018

Faculty Publications, Computer Science

Email is one of the most common forms of digital communication. Spam is unsolicited bulk email, while image spam consists of spam text embedded inside an image. Image spam is used as a means to evade text-based spam filters, and hence image spam poses a threat to email-based communication. In this research, we analyze image spam detection using support vector machines (SVMs), which we train on a wide variety of image features. We use a linear SVM to quantify the relative importance of the features under consideration. We also develop and analyze a realistic “challenge” dataset that illustrates the limitations …


Document Classification Using Machine Learning, Ankit Basarkar May 2017

Master's Projects

To perform document classification algorithmically, documents need to be represented in a form that the machine learning classifier can process. The report discusses the different types of feature vectors through which documents can be represented and later classified. The project aims at comparing the Binary, Count and TfIdf feature vectors and their impact on document classification. To test how well each of the three feature vectors performs, we used the 20-newsgroup dataset and converted the documents to all three feature vectors. For each feature vector representation, we trained the Naïve Bayes classifier and then tested the generated classifier …
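The three representations compared above can be sketched in a few lines. This is a minimal illustration using whitespace tokenization and the textbook weighting tf × ln(N / df); the report's exact tokenization and TfIdf variant are not stated in the abstract, and library implementations often use smoothed formulas, so the numbers here are assumptions of this sketch. The function names and the two toy documents are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(docs):
    """Sorted list of distinct lowercase tokens across all documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def count_vector(doc, vocab):
    """Count features: how many times each vocabulary term occurs."""
    counts = Counter(doc.lower().split())
    return [counts.get(term, 0) for term in vocab]


def binary_vector(doc, vocab):
    """Binary features: 1 if the term occurs at all, else 0."""
    return [1 if c > 0 else 0 for c in count_vector(doc, vocab)]


def tfidf_vectors(docs, vocab):
    """TfIdf features using raw term counts and idf = ln(N / df)."""
    rows = [count_vector(doc, vocab) for doc in docs]
    n_docs = len(docs)
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return [[c * w for c, w in zip(row, idf)] for row in rows]


docs = ["the cat sat", "the cat ate the fish"]
vocab = build_vocabulary(docs)          # ['ate', 'cat', 'fish', 'sat', 'the']
counts = [count_vector(d, vocab) for d in docs]
binary = [binary_vector(d, vocab) for d in docs]
tfidf = tfidf_vectors(docs, vocab)
```

In this toy corpus, "the" and "cat" occur in both documents, so their TfIdf weight is zero, while the Count vector still records that "the" appears twice in the second document and the Binary vector records only its presence.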


Credit Scoring Using Logistic Regression, Ansen Mathew May 2017

Master's Projects

This report presents an approach to predict the credit scores of customers using the Logistic Regression machine learning algorithm. The research objective of this project is to perform a comparative study between feature selection and feature extraction, against the same dataset using the Logistic Regression machine learning algorithm. For feature selection, we have used Stepwise Logistic Regression. For feature extraction, we have used Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). In order to test the accuracy obtained using feature selection and feature extraction, we used a public credit dataset having 11 features and 150,000 records. After performing …