Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 239

Full-Text Articles in Physical Sciences and Mathematics

Influence Of Pavement Conditions On Commercial Motor Vehicle Crashes, Stephen Arhin, Babin Manandhar, Adam Gatiba Dec 2023

Mineta Transportation Institute Publications

Commercial motor vehicle (CMV) safety is a major concern in the United States, including the District of Columbia (DC), where CMVs make up 15% of traffic. This research uses a comprehensive approach, combining statistical analysis and machine learning techniques, to investigate the impact of road pavement conditions on CMV accidents. The study integrates traffic crash data from the Traffic Accident Reporting and Analysis Systems Version 2.0 (TARAS2) database with pavement condition data provided by the District Department of Transportation (DDOT). Data spanning from 2016 to 2020 was collected and analyzed, focusing on CMV routes in DC. The analysis employs binary …


Examining The Externalities Of Highway Capacity Expansions In California: An Analysis Of Land Use And Land Cover (LULC) Using Remote Sensing Technology, Serena E. Alexander, Bo Yang, Owen Hussey, Derek Hicks Nov 2023

Mineta Transportation Institute Publications

There are over 590,000 bridges dispersed across the roadway network of the United States alone. Each bridge with a length of 20 feet or greater must be inspected at least once every 24 months, according to the Federal Highway Act (FHWA) of 1968. This research developed an artificial intelligence (AI)-based framework for bridge and road inspection using drones with multi-sensor data collection capabilities. It is not sufficient to conduct inspections of bridges and roads using cameras alone, so the research team utilized an infrared (IR) camera along with a high-resolution optical camera. In many instances, the IR camera …


Dynamic Predictions Of Thermal Heating And Cooling Of Silicon Wafer, Hitesh Kumar Jan 2023

Master's Projects

Neural networks are now emerging in every industry. Industries are trying to exploit the benefits of neural networks and deep learning to make predictions or simulate their ongoing processes using the data they generate. The purpose of this report is to study the heating pattern of a silicon wafer and make predictions using various machine learning techniques. The heating of the silicon wafer involves various factors, ranging from the number of lamps and wafer properties to the points considered when capturing the heating temperature. This process involves dynamic inputs which facilitate the heating of the …


Hate Speech Detection In Hindi, Pranjali Prakash Bansod Jan 2023

Master's Projects

Social media is a great place to share one’s thoughts and to express oneself. Very often the same social media platforms become a means for spewing hatred. The large amount of data being shared on these platforms makes it difficult to moderate the content shared by users. In a diverse country like India, hate is present on social media in all regional languages, making it even more difficult to detect hate because of a lack of enough data to train deep/machine learning models to make them understand regional languages. This work is our attempt at tackling hate speech in Hindi. We …


Yelp Restaurant Popularity Score Calculator, Sneh Bindesh Chitalia Jan 2023

Master's Projects

Yelp is a popular social media platform that has gained much traction over the last few years. The critical feature of Yelp is that it has information about any small or large-scale business, as well as reviews received from customers. Each review has both a 1 to 5 star rating and text. For a particular business, any user can view the reviews, but the stars are what most users check because they are an easy and fast way to decide. Therefore, the star rating is a good metric to measure a particular business’s value. However, there are other attributes …


Machine Learning-Based Anomaly Detection In Cloud Virtual Machine Resource Usage, Tarun Mourya Satveli Jan 2023

Master's Projects

Anomaly detection is an important activity in cloud computing systems because it aids in the identification of odd behaviours or actions that may result in software glitches, security breaches, and performance difficulties. Detecting aberrant resource utilization trends in virtual machines (VMs) is a typical application of anomaly detection in cloud computing. Currently, the most serious cyber threat is distributed denial-of-service attacks. The afflicted server's resources and internet traffic resources, such as bandwidth and buffer size, are slowed down by restricting the server's capacity to give resources to legitimate customers.

To recognize attacks and common occurrences, machine learning techniques such as …


Analyzing Improvement Of Mask R-CNN On ARMS Plates (And Sponges And Coral), James Lee Jan 2023

Master's Projects

Coral reefs and their diverse array of life forms play a vital role in maintaining the health of our planet's environment. However, due to their fragility, it can be challenging to study the reefs without damaging their delicate ecosystem. To address this issue, researchers have employed non-invasive methods such as using Autonomous Reef Monitoring Structures (ARMS) plates to monitor biodiversity. Data was collected as genetic samples from the plates, and high-resolution photographs were taken. To make the best use of this image data, scientists have turned to machine learning and computer vision. Prior to this study, Mask R-CNN was utilized as …


Federated Learning For Protecting Medical Data Privacy, Abhishek Reddy Punreddy Jan 2023

Master's Projects

Deep learning is one of the most advanced machine learning techniques, and its prominence has increased in recent years. Language processing, predictions in medical research, and pattern recognition are a few of the numerous fields in which it is widely utilized. Numerous modern medical applications benefit greatly from the implementation of machine learning (ML) models and the disruptive innovations in the entire modern health care system. It is extensively used for constructing accurate and robust statistical models from large volumes of medical data collected from a variety of sources in contemporary healthcare systems [1]. Due to privacy concerns that restrict access …


Malware Classification Using Graph Neural Networks, Manasa Mananjaya Jan 2023

Master's Projects

Word embeddings are widely recognized as important in natural language processing for capturing semantic relationships between words. In this study, we conduct experiments to explore the effectiveness of word embedding techniques in classifying malware. Specifically, we evaluate the performance of Graph Neural Network (GNN) applied to knowledge graphs constructed from opcode sequences of malware files. In the first set of experiments, Graph Convolution Network (GCN) is applied to knowledge graphs built with different word embedding techniques such as Bag-of-words, TF-IDF, and Word2Vec. Our results indicate that Word2Vec produces the most effective word embeddings, serving as a baseline for comparison …


Malware Classification Using API Call Information And Word Embeddings, Sahil Aggarwal Jan 2023

Master's Projects

Malware classification is the process of classifying malware into recognizable categories and is an integral part of implementing computer security. In recent times, machine learning has emerged as one of the most suitable techniques to perform this task. Models can be trained on various malware features, such as opcodes and API calls among many others, to deduce information that would be helpful in the classification.

Word embeddings are a key part of natural language processing and can be seen as a representation of text wherein similar words will have closer representations. These embeddings can be used to discover a quantifiable …


Leveraging Tweets For Rapid Disaster Response Using BERT-BiLSTM-CNN Model, Satya Pranavi Manthena Jan 2023

Master's Projects

Digital networking sites such as Twitter give users a global platform to discuss and share their experiences with others. People frequently use social media to share their daily experiences, local news, and activities with others. Many rescue services and agencies frequently monitor this sort of data to identify crises and limit the danger of loss of life. During a natural catastrophe, many tweets are made in reference to the tragedy, making it a hot topic on Twitter. Tweets that contain natural disaster phrases but do not discuss the event itself are not informational and should be labeled as non-disaster …


Video Sign Language Recognition Using Pose Extraction And Deep Learning Models, Shayla Luong Jan 2023

Master's Projects

Sign language recognition (SLR) has long been a studied subject and research field within the Computer Vision domain. Appearance-based and pose-based approaches are two ways to tackle SLR tasks. Various models, from traditional to current state-of-the-art, including HOG-based features, Convolutional Neural Network, Recurrent Neural Network, Transformer, and Graph Convolutional Network, have been utilized to tackle the area of SLR. While classifying alphabet letters in sign language has shown high accuracy rates, recognizing words presents its own set of difficulties, including the large vocabulary size, the subtleties in body motions and hand orientations, and regional dialects and variations. The emergence of deep …


Gender Classification Via Human Joints Using Convolutional Neural Network, Cheng-En Sung Jan 2023

Master's Projects

With the growing demand for gender-related data in diverse applications, including security systems for ascertaining an individual’s identity at border crossings, as well as marketing uses such as identifying potential customers and tailoring special discounts for them, gender classification has become an essential task within the field of computer vision and deep learning. There has been extensive research conducted on classifying human gender using facial expression, exterior appearance (e.g., hair, clothes), or gait movement. However, within the scope of our research, none have specifically focused on gender classification from two-dimensional body joints. Knowing this, we believe that a new prediction pipeline …


Codeval, Aditi Agrawal Jan 2023

Master's Projects

Grading coding assignments calls for a lot of work. There are numerous aspects of the code that need to be checked, such as compilation errors, runtime errors, the number of test cases passed or failed, and plagiarism. Automated grading tools for programming assignments can help instructors and graders evaluate programming assignments quickly and easily. Creating the assignment on Canvas is also a time-consuming process and can be automated. We developed CodEval, which instantly grades the student assignment submitted on Canvas and provides feedback to the students. It also uploads, creates, and edits assignments, thereby …


Enhancing Facial Emotion Recognition Using Image Processing With CNN, Sourabh Deokar Jan 2023

Master's Projects

Facial expression recognition (FER) has been a challenging task in computer vision for decades. With recent advancements in deep learning, convolutional neural networks (CNNs) have shown promising results in this field. However, the accuracy of FER using CNNs heavily relies on the quality of the input images and the size of the dataset. Moreover, even in pictures of the same person with the same expression, brightness, backdrop, and stance might change. These variations are emphasized when comparing pictures of individuals with varying ethnic backgrounds and facial features, which makes it challenging for deep-learning models to classify. In this paper, we …


Twitter Bot Detection Using NLP And Graph Classification, Warada Jayant Kulkarni Jan 2023

Master's Projects

Social media platforms are one of the primary resources for information, as they are easily accessible, low in cost, and provide a high rate of information spread. Online social media (OSM) have become the main source of news information around the world, but the distributed nature of the web has increased the risk of fake news spread. Fake news is misleading information that is published as real news. Therefore, identifying fake news and flagging it as such, as well as detecting the sources that generate it, is an ongoing task for researchers and OSM companies. Bots are artificial …


Machine Learning To Predict Sports-Related Concussion Recovery Using Clinical Data, Yan Chu, Gregory Knell, Riley P. Brayton, Scott O. Burkhart, Xiaoqian Jiang, Shayan Shams Feb 2022

Faculty Research, Scholarly, and Creative Activity

Objectives
Sport-related concussions (SRCs) are a concern for high school athletes. Understanding factors contributing to SRC recovery time may improve clinical management. However, the complexity of the many clinical measures of concussion data precludes many traditional methods. This study aimed to answer the question, what is the utility of modeling clinical concussion data using machine-learning algorithms for predicting SRC recovery time and protracted recovery?
Methods
This was a retrospective case series of participants aged 8 to 18 years with a diagnosis of SRC. A 6-part measure was administered to assess pre-injury risk factors, initial injury severity, and post-concussion symptoms, including …


Adversarial Attacks On Speech Separation Systems, Kendrick Trinh Jan 2022

Master's Projects

Speech separation is a special form of blind source separation in which the objective is to decouple two or more sources such that they are distinct. The need for such an ability grows as speech-activated device usage increases in our everyday lives. These systems, however, are susceptible to malicious actors. In this work, we repurpose proven adversarial attacks and leverage them against a combined speech separation and speech recognition system. The attack adds adversarial noise to a mixture of two voices such that the two outputs of the speech separation system are similarly transcribed by the speech recognition …


Hidden Markov Models With Momentum, Andrew Miller Jan 2022

Master's Projects

Momentum is a popular technique for improving convergence rates during gradient descent. In this research, we experiment with adding momentum to the Baum-Welch expectation-maximization algorithm for training Hidden Markov Models. We compare discrete Hidden Markov Models trained with and without momentum on English text and malware opcode data. The effectiveness of momentum is determined by measuring the changes in model score and classification accuracy due to momentum. Experiments indicate that adding momentum to Baum-Welch can reduce the number of iterations required for initial convergence during HMM training, particularly in cases where the model is slow to converge. However, momentum does …
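As a rough illustration of the technique this abstract starts from, the sketch below shows classic (heavy-ball) momentum on plain gradient descent for a one-dimensional quadratic. It is not the Baum-Welch variant studied in the project; the function `minimize`, the objective, and the values of `lr` and `beta` are all assumptions chosen for the example.

```python
def minimize(grad, x0, lr=0.1, beta=0.0, tol=1e-6, max_iter=10_000):
    """Gradient descent with optional heavy-ball momentum.

    Returns the final point and the number of updates performed.
    beta=0.0 gives plain gradient descent.
    """
    x, velocity = x0, 0.0
    for step in range(max_iter):
        g = grad(x)
        if abs(g) < tol:          # stop once the gradient is (nearly) zero
            return x, step
        # Keep a fraction of the previous update, then add the new gradient step.
        velocity = beta * velocity - lr * g
        x += velocity
    return x, max_iter


def grad(x):
    # Gradient of the hypothetical objective f(x) = 0.05 * (x - 3)^2,
    # a shallow bowl on which plain gradient descent converges slowly.
    return 0.1 * (x - 3.0)


x_plain, n_plain = minimize(grad, x0=0.0, beta=0.0)
x_mom, n_mom = minimize(grad, x0=0.0, beta=0.9)
print(f"plain:    x={x_plain:.5f} after {n_plain} updates")
print(f"momentum: x={x_mom:.5f} after {n_mom} updates")
```

On this toy problem both runs reach the minimum at `x = 3`, and the momentum run needs fewer updates, which mirrors the "slow to converge" case the abstract mentions.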


Robustness Of Image-Based Malware Analysis, Katrina Tran Jan 2022

Master's Projects

Being able to identify malware is important in preventing attacks. Image-based malware analysis is the study of images that are created from malware. Analyzing these images can help identify patterns in malware families. In previous work, "gist descriptor" features extracted from images have been used in malware classification problems and have shown promising results. In this research, we determine whether gist descriptors are robust with respect to malware obfuscation techniques, as compared to Convolutional Neural Networks (CNN) trained directly on malware images. Using the Python Imaging Library, we create images from malware executables and from malware that we obfuscate. We …


A Novel Handover Method Using Destination Prediction In 5G-V2X Networks, Pooja Shyamsundar Jan 2022

Master's Projects

This paper proposes a novel approach to handover optimization in fifth generation vehicular networks. A key principle in designing fifth generation vehicular network technology is continuous connectivity. This makes it important to ensure that there are no gaps in communication for mobile user equipment. Handovers can cause disruption in connectivity as the process involves switching from one base station to another. Issues in the handover process include poor load management for moving traffic resulting in low bandwidth or connectivity gaps, too many hops resulting in multiple unnecessary handovers, short dwell times and ineffective base station selection resulting in delays and …


Darknet Traffic Classification, Nhien Rust-Nguyen Jan 2022

Master's Projects

The anonymous nature of darknets is commonly exploited for illegal activities. Previous research has employed machine learning and deep learning techniques to automate the detection of darknet traffic to block these criminal activities. This research aims to improve darknet traffic detection by assessing Support Vector Machines (SVM), Random Forest (RF), Convolutional Neural Networks (CNN) and Auxiliary-Classifier Generative Adversarial Networks (AC-GAN) for classification of network traffic and the underlying application types. We find that our RF model outperforms the state-of-the-art machine learning techniques used by prior work with the CIC-Darknet2020 dataset. To evaluate the robustness of our RF classifier, we degrade …


Analysis Of Public Sentiment Of COVID-19 Pandemic, Vaccines, And Lockdowns, Devinesh Singh Jan 2022

Master's Projects

The CoV-2 pandemic prompted lockdown measures to be implemented worldwide; these directives were implemented nationwide to stunt the spread of the infection. Throughout the lockdowns, millions of individuals resorted to social media for entertainment, to communicate with friends and family, and to express their opinions about the pandemic. Simultaneously, social media aided in the dissemination of misinformation, which has proven to be a threat to global health. Sentiment analysis, a technique used to analyze textual data, can be used to gain an overview of public opinion about CoV-2 from Twitter and TikTok. The primary focus of the project is to build a deep …


Empirical Evaluation Of The Shift And Scale Parameters In Batch Normalization, Yashna Peerthum Jan 2022

Master's Projects

Batch Normalization (BatchNorm) is a technique that enables the training of deep neural networks, especially Convolutional Neural Networks (CNN) for computer vision tasks. It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for these improvements are unclear. BatchNorm consists of a normalization step with trainable shift and scale parameters. In this paper, we examine the role of normalization and the shift and scale parameters in BatchNorm. We implement two new optimizers in PyTorch: a version of BatchNorm that we refer to as AffineLayer, which includes the shift and scale transform without normalization, and …
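To make the two pieces named in this abstract concrete, here is a minimal pure-Python sketch of the standard BatchNorm forward computation on a batch of scalars, next to a shift-and-scale transform with no normalization. The function names, the `eps` value, and the sample batch are assumptions for illustration; this is not the project's PyTorch code or its AffineLayer implementation.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean and unit variance, then scale and shift.

    gamma is the scale parameter and beta is the shift parameter; in a real
    network both would be learned during training.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


def affine_only(values, gamma=1.0, beta=0.0):
    """Scale and shift without the normalization step (hypothetical helper)."""
    return [gamma * v + beta for v in values]


batch = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(batch, gamma=2.0, beta=1.0)
shifted = affine_only(batch, gamma=2.0, beta=1.0)

norm_mean = sum(normalized) / len(normalized)
norm_std = math.sqrt(sum((v - norm_mean) ** 2 for v in normalized) / len(normalized))
print(f"BatchNorm output: mean={norm_mean:.4f}, std={norm_std:.4f}")
print(f"Affine-only output: {shifted}")
```

After normalization the output mean equals `beta` and the output standard deviation is approximately `gamma`, whereas the affine-only output still depends on the statistics of the input batch.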


Multi-Step Prediction Using Tree Generation For Reinforcement Learning, Kevin Prakash Jan 2022

Master's Projects

The goal of reinforcement learning is to learn a policy that maximizes a reward function. In some environments with complete information, search algorithms are highly useful in simulating action sequences in a game tree. However, in many practical environments, such effective search strategies are not applicable since their state transition information may not be available. This paper proposes a novel method to approximate a game tree that enables reinforcement learning to use search strategies even in incomplete information environments. With an approximated game tree, the agent predicts all possible states multiple steps into the future and evaluates the states to …


Using Machine Learning To Maximize First-Generation Student Success A Contribution To The Mission Of Aiding The Underserved, Mustafa Emre Yesilyurt Jan 2022

Master's Projects

The Leadership and Career Accelerator (UNVS 101) is a course offered at San José State University (SJSU) designed to hone industry skills in and provide support to students of underserved backgrounds. The main goal of this study is to determine which features are most significant to identifying the students at risk of failing the course. This will allow faculty to better focus data collection efforts and facilitate an increase in classifier accuracy. The data came as three distinct sets (sources). One contained features describing student demographics and academic history, another described the students’ experience in the course, and a third …


Contextualized Vector Embeddings For Malware Detection, Vinay Pandya Jan 2022

Master's Projects

Malware classification is a technique to classify different types of malware, and it forms an integral part of system security. The aim of this project is to use context-dependent word embeddings to classify malware. The Transformer is a novel architecture that utilizes self-attention to handle long-range dependencies. Transformers are particularly effective in many complex natural language processing tasks such as Masked Language Modelling (MLM) and Next Sentence Prediction (NSP). Different transformer architectures such as BERT, DistilBERT, ALBERT, and RoBERTa are used to generate context-dependent word embeddings. These embeddings would help in classifying different malware samples based on their similarity …


Abstractive Text Summarization For Tweets, Siyu Chen Jan 2022

Master's Projects

In the high-tech age, we can access a vast number of articles, information, news, and opinion online. The wealth of information allows us to learn about the topics we are interested in more easily and cheaply, but it also requires us to spend an enormous amount of time reading online. Text summarization can help us save a lot of reading time so that we can know more information in a shorter period. The primary goal of text summarization is to shorten the text while including as much vital information as possible in the original text so fewer people use this …


Whole File Chunk Based Deduplication Using Reinforcement Learning, Xincheng Yuan Jan 2022

Master's Projects

Deduplication is the process of removing replicated data content from storage facilities like online databases, cloud datastores, local file systems, etc. It is commonly performed as part of data preprocessing to eliminate redundant data that requires unnecessary storage space and computing power. Deduplication is especially essential for file backup systems, since duplicated files will presumably consume more storage space, especially with a short backup period like daily [8]. A common technique in this field involves splitting files into chunks whose hashes can be compared using data structures or techniques like clustering. In this project we explore the possibility …
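The chunk-and-hash idea described in this abstract can be sketched in a few lines. The example below assumes fixed-size chunks and SHA-256 digests stored in a dictionary; the chunk size, file names, and helper functions are hypothetical, and the project's reinforcement learning component is not represented here.

```python
import hashlib


def deduplicate(files, chunk_size=4):
    """Split each file into fixed-size chunks and store each unique chunk once.

    Returns a chunk store (digest -> bytes) and, per file, the ordered list of
    digests needed to rebuild it.
    """
    store = {}
    recipes = {}
    for name, data in files.items():
        recipe = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)   # keep only the first copy
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a file by concatenating its chunks in recipe order."""
    return b"".join(store[digest] for digest in recipes[name])


# Two hypothetical files that share their first two chunks.
files = {"a.txt": b"AAAABBBBCCCC", "b.txt": b"AAAABBBBDDDD"}
store, recipes = deduplicate(files)
stored_bytes = sum(len(chunk) for chunk in store.values())
print(f"{len(store)} unique chunks, {stored_bytes} bytes stored")
```

Fixed-size chunking is the simplest choice but is sensitive to insertions that shift chunk boundaries; content-defined chunking is a common alternative when that matters.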


Cloud Provisioning And Management With Deep Reinforcement Learning, Alexandru Tol Jan 2022

Master's Projects

The first web applications appeared in the early nineteen nineties. These applications were entirely hosted in-house by the companies that developed them. In the mid-2000s, the concept of a digital cloud was introduced by the then CEO of Google, Eric Schmidt. Today, most companies at least partially host their applications on proprietary servers in data centers or on commercial clouds like Amazon Web Services (AWS) or Heroku.

This arrangement seems like a straightforward win-win for both parties: the customer gets rid of the hassle of maintaining a live server for their applications and …