Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 35

Full-Text Articles in Entire DC Network

Decreasing Trash In Local Creeks: A Program Evaluation Of The City Of San Jose’s Direct Discharge Trash Control Program, Lakeisha Bryant Dec 2022

Master's Projects

The entire San Francisco Bay was a navigable waterway in the 1850s, during the Gold Rush era. Large amounts of sediment from upstream erosion and mining flowed into the bay, reducing its area (Environmental Protection Agency, 2022). As a result of intense development on the bay shores and adjacent lands, the bay faces several challenges that affect its water quality and threaten aquatic ecosystems. Pesticides, mercury, metals, and pathogens are just a few of the substances in the bay that cause unhealthy conditions for aquatic life and threaten human health. California’s Water Resources Control Board …


Abstractive Text Summarization For Tweets, Siyu Chen Jan 2022

Master's Projects

In the high-tech age, we can access a vast number of articles, news, and opinion online. This wealth of information lets us learn about the topics we are interested in more easily and cheaply, but it also requires us to spend an enormous amount of time reading online. Text summarization can save much of that reading time, so that we can take in more information in a shorter period. The primary goal of text summarization is to shorten the text while retaining as much of the original text's vital information as possible so fewer people use this …


Proxy Re-Encryption In Blockchain-Based Application, Wangcheng Yuan Jan 2022

Master's Projects

Nowadays, blockchain-based technology has risen to a new dimension. With the advantage of the decentralized identity, data are transferred through decentralized and public ledgers. Those new contracts provide great visibility. However, there is still a need to keep some data private in many cases. Those private data should be encrypted while still benefiting from the decentralized on-chain protocol. Securing those private data in such a decentralized blockchain-based system is thus a critical problem. Our solution provides a decentralized protocol that lets users grant access to their private data with proxy re-encryption in SpartanGold (a blockchain-based cryptocurrency). We implement a third-party …


Codis: Community Detection Via Distributed Seed-Set Expansion On Graph Streams, Austin Anderson Jan 2022

Master's Projects

Community detection has been and remains a very important topic in several fields. From marketing and social networking to biological studies, community detection plays a key role in advancing research. Research on this topic originally looked at classifying nodes into discrete communities, but eventually moved on to placing nodes in multiple communities. Unfortunately, community detection has always been a time-inefficient process, and recent data sets have been simply too large to process realistically using traditional methods. Because of this, recent methods have turned to parallelism, but all these methods, while offering significant decrease in …


Caption And Image Based Next-Word Auto-Completion, Meet Patel Jan 2022

Master's Projects

With the increasing number of options or choices in terms of entities like products, movies, songs, etc. which are now available to users, they try to save time by looking for an application or system that provides automatic recommendations. Recommender systems are automated computing processes that leverage concepts of Machine Learning, Data Mining and Artificial Intelligence towards generating product recommendations based on a user’s preferences. These systems have given a significant boost to businesses across multiple segments as a result of reduced human intervention. One similar aspect of this is content writing. It would save users a lot of time …


Adversarial Attacks On Android Malware Detection And Classification, Srilekha Nune Jan 2022

Master's Projects

Recent years have seen an increase in sales of intelligent gadgets, particularly those using the Android operating system. This popularity has not gone unnoticed by malware writers. Consequently, many research efforts have been made to develop learning models that can detect Android malware. As a countermeasure, malware writers can consider adversarial attacks that disrupt the training or usage of such learning models. In this paper, we train a wide variety of machine learning models using the KronoDroid Android malware dataset, and we consider adversarial attacks on these models. Specifically, we carefully measure the decline in performance when the feature sets …


Empirical Evaluation Of The Shift And Scale Parameters In Batch Normalization, Yashna Peerthum Jan 2022

Master's Projects

Batch Normalization (BatchNorm) is a technique that enables the training of deep neural networks, especially Convolutional Neural Networks (CNN) for computer vision tasks. It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for these improvements are unclear. BatchNorm consists of a normalization step with trainable shift and scale parameters. In this paper, we examine the role of normalization and the shift and scale parameters in BatchNorm. We implement two new optimizers in PyTorch: a version of BatchNorm that we refer to as AffineLayer, which includes the shift and scale transform without normalization, and …
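
The two ingredients named in this abstract, a normalization step followed by a trainable scale and shift, can be illustrated with a minimal pure-Python sketch. It is not the project's PyTorch code: the function names, the scalar-per-sample batch, and the `eps` default are assumptions made for illustration, and `affine_only` is only one plausible reading of the "AffineLayer" idea (scale and shift with no normalization).

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of scalars, then apply scale (gamma) and shift (beta).

    gamma and beta stand in for the trainable parameters; here they are
    plain arguments, and no training is performed.
    """
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)  # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

def affine_only(xs, gamma=1.0, beta=0.0):
    """Scale and shift without the normalization step (hypothetical helper)."""
    return [gamma * x + beta for x in xs]

if __name__ == "__main__":
    batch = [1.0, 2.0, 3.0, 4.0]
    print(batch_norm(batch, gamma=2.0, beta=0.5))
    print(affine_only(batch, gamma=2.0, beta=0.5))
```

With `gamma=1` and `beta=0`, the output of `batch_norm` has approximately zero mean and unit variance over the batch, while `affine_only` leaves the batch statistics entirely to the learned parameters.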


Enabling Use Of Signal In A Disconnected Village Environment, Evan Chopra Jan 2022

Master's Projects

A significant portion of the world still does not have a stable internet connection. Those people should have the ability to communicate with loved ones who may not live nearby or to share ideas with friends. To make this a reality, our lab has set out to build infrastructure for delay-tolerant applications. This network will communicate using existing smartphones that relay the information to a connected environment. The proof-of-concept application our lab is using is Signal, as it offers end-to-end encrypted messaging and an open-source platform that our lab can develop on.


5G Mobility Management Using Satellites In LEO (Low Earth Orbit), Peter Knight Jan 2022

Master's Projects

As technology moves forward we find ourselves demanding faster and better quality internet. Towards that end, 5G was developed. While cell towers are already being built with 5G in mind, some of the built-in limitations of 5G mean that we may be seeing satellites with it in the near future. This development would require a drastic change in the current algorithms used for handovers. These algorithms fail to take into account the high relative velocity of the satellites and result in many unnecessary handovers which increases the likelihood of a dropped connection. This paper will propose and evaluate the effectiveness …


Whole File Chunk Based Deduplication Using Reinforcement Learning, Xincheng Yuan Jan 2022

Master's Projects

Deduplication is the process of removing replicated data from storage facilities such as online databases, cloud datastores, and local file systems. It is commonly performed as part of data preprocessing to eliminate redundant data that would otherwise consume unnecessary storage space and computing power. Deduplication is especially important for file backup systems, since duplicated files consume more storage space, particularly with a short backup period such as daily [8]. A common technique in this field involves splitting files into chunks whose hashes can be compared using data structures or techniques like clustering. In this project we explore the possibility …
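
The chunk-and-hash technique mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: fixed-size chunks, SHA-256 digests, and an in-memory dictionary as the chunk store are choices made here, not details from the project, and the function names are hypothetical.

```python
import hashlib

def dedup_chunks(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy per distinct chunk.

    Returns (store, recipe): store maps a chunk digest to the chunk bytes,
    and recipe lists the digests in order so the data can be rebuilt.
    """
    store = {}
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store each distinct chunk once
        recipe.append(digest)
    return store, recipe

def restore(store, recipe):
    """Rebuild the original bytes from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)

if __name__ == "__main__":
    store, recipe = dedup_chunks(b"abcdabcdabcdxyz")
    print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

Comparing digests rather than raw bytes keeps the duplicate lookup cheap; the chunk size trades deduplication ratio against index size.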


Canvas Autoquiz, Archit Jain Jan 2022

Master's Projects

Online learning management platforms such as Canvas are thriving and quickly replacing traditional classrooms, especially during these pandemic-struck times. As more and more quizzes are administered online, we need tools that make the quiz creation process easier and faster. Canvas Autoquiz is a command-line tool that allows instructors to automatically create and upload quizzes of varying difficulty levels. It also allows instructors to export quizzes from one LMS platform to another. This project explores the need, design, and implementation of the tool, and prospective future work.


Adversarial Attacks On Speech Separation Systems, Kendrick Trinh Jan 2022

Master's Projects

Speech separation is a special form of blind source separation in which the objective is to decouple two or more sources such that they are distinct. The need for such an ability grows as the use of speech-activated devices increases in our everyday life. These systems, however, are susceptible to malicious actors. In this work, we repurpose proven adversarial attacks and leverage them against a combined speech separation and speech recognition system. The attack adds adversarial noise to a mixture of two voices such that the two outputs of the speech separation system are similarly transcribed by the speech recognition …


Benchmarking NewSQL Database VoltDB, Kevin Schumacher Jan 2022

Master's Projects

NewSQL is a type of relational database that is able to scale horizontally while retaining linearizable consistency. This is an improvement over a traditional SQL relational database because SQL databases cannot effectively scale across multiple machines. This is also an improvement over NoSQL databases because NewSQL databases are designed from the ground up to be consistent and have ACID guarantees. However, NewSQL databases are not one-size-fits-all: each specific database is designed to perform well on specific workloads. This project will evaluate a NewSQL database, VoltDB, with a focus …


Improving User Experiences For Wiki Systems, Parth Patel Jan 2022

Master's Projects

Wiki systems are web applications that allow users to manage content collaboratively. Such systems enable users to read and write information in the form of web pages and to share media items such as videos, audio, and books. Yioop is an open-source web portal with the features of a search engine, a wiki system, and discussion groups. In this project I have enhanced Yioop’s features to improve the user experience. The preliminary work introduced new features such as an emoji picker tool for the direct messaging system, a unit testing framework for automating the UI testing of Yioop, and redeeming advertisement credits back into real money. …


Multi-Step Prediction Using Tree Generation For Reinforcement Learning, Kevin Prakash Jan 2022

Master's Projects

The goal of reinforcement learning is to learn a policy that maximizes a reward function. In some environments with complete information, search algorithms are highly useful in simulating action sequences in a game tree. However, in many practical environments, such effective search strategies are not applicable since their state transition information may not be available. This paper proposes a novel method to approximate a game tree that enables reinforcement learning to use search strategies even in incomplete information environments. With an approximated game tree, the agent predicts all possible states multiple steps into the future and evaluates the states to …


Graph Neural Networks For Malware Classification, Vrinda Malhotra Jan 2022

Master's Projects

Malware is a growing threat to the digital world. The first step to managing this threat is malware detection and classification. While traditional techniques rely on static or dynamic analysis of malware, the generation of these features requires expert knowledge. Function call graphs (FCGs) consist of program functions as their nodes and their interprocedural calls as their edges, providing a wealth of knowledge that can be utilized to classify malware without feature extraction that requires experts. This project treats malware classification as a graph classification problem, setting node features using the Local Degree Profile (LDP) model and using different graph …
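
As a rough illustration of Local Degree Profile (LDP) node features, the sketch below follows the commonly described definition: each node gets its own degree plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees. The adjacency-dictionary input, the function name, and the all-zero vector for isolated nodes are assumptions made here, not details taken from the project.

```python
import statistics

def local_degree_profile(adj):
    """Compute a five-value LDP feature tuple for every node.

    adj maps each node to a list of its neighbors; it is assumed to be
    symmetric (undirected) and to contain every node as a key.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    features = {}
    for v, nbrs in adj.items():
        nbr_degs = [deg[u] for u in nbrs]
        if nbr_degs:
            features[v] = (
                deg[v],
                min(nbr_degs),
                max(nbr_degs),
                float(statistics.mean(nbr_degs)),
                float(statistics.pstdev(nbr_degs)),  # population std. dev.
            )
        else:
            features[v] = (0, 0, 0, 0.0, 0.0)  # isolated node (assumed convention)
    return features

if __name__ == "__main__":
    # A three-function call graph treated as an undirected path: a - b - c
    print(local_degree_profile({"a": ["b"], "b": ["a", "c"], "c": ["b"]}))
```

Because the features depend only on graph structure, they can be attached to function call graph nodes without any hand-crafted, expert-driven feature extraction.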


Jparsec - A Parser Combinator For JavaScript, Sida Zhong Jan 2022

Master's Projects

Parser combinators have been a popular parsing approach in recent years. Compared with traditional parsers, a parser combinator has both readability and maintenance advantages.

This project aims to construct a lightweight parser construction library for JavaScript called Jparsec. Because parser combinators are modular by nature, the implementation uses higher-order functions. JavaScript provides a friendly and simple way to use higher-order functions, so the main construction method of this project will use JavaScript's lambda functions. In practical applications, a parser combinator is mainly used as a tool, for example to parse JSON files.

In order to verify the utility of …
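
The idea of building parsers from higher-order functions can be sketched briefly. The example is written in Python rather than JavaScript, and none of the names below (`char`, `seq`, `alt`, `many`) are taken from Jparsec; they are hypothetical helpers. In this sketch a parser is a function that takes a string and a position and returns `(value, next_position)` on success or `None` on failure.

```python
def char(c):
    """Parser that matches exactly the character c."""
    return lambda s, i: (c, i + 1) if i < len(s) and s[i] == c else None

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(s, i):
        values = []
        for p in parsers:
            result = p(s, i)
            if result is None:
                return None
            value, i = result
            values.append(value)
        return values, i
    return parse

def alt(*parsers):
    """Try each parser in order and return the first success."""
    def parse(s, i):
        for p in parsers:
            result = p(s, i)
            if result is not None:
                return result
        return None
    return parse

def many(p):
    """Apply p zero or more times (p must consume input when it succeeds)."""
    def parse(s, i):
        values = []
        while True:
            result = p(s, i)
            if result is None:
                return values, i
            value, i = result
            values.append(value)
    return parse

if __name__ == "__main__":
    digit = alt(*[char(d) for d in "0123456789"])
    number = many(digit)
    print(number("123a", 0))
```

Each combinator returns a new parser, so larger grammars are assembled by composition rather than by writing a monolithic parsing routine.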


Using Machine Learning To Maximize First-Generation Student Success: A Contribution To The Mission Of Aiding The Underserved, Mustafa Emre Yesilyurt Jan 2022

Master's Projects

The Leadership and Career Accelerator (UNVS 101) is a course offered at San José State University (SJSU) designed to hone industry skills in and provide support to students of underserved backgrounds. The main goal of this study is to determine which features are most significant to identifying the students at risk of failing the course. This will allow faculty to better focus data collection efforts and facilitate an increase in classifier accuracy. The data came as three distinct sets (sources). One contained features describing student demographics and academic history, another described the students’ experience in the course, and a third …


A Study On Human Face Expressions Using Convolutional Neural Networks And Generative Adversarial Networks, Sriramm Muthyala Sudhakar Jan 2022

Master's Projects

Human beings express themselves via words, signs, gestures, and facial emotions. Previous research using pre-trained convolutional models has been done by freezing the entire network and running the models without the use of any image processing techniques. In this research, we attempt to enhance the accuracy of several deep CNN architectures, such as ResNet and SENet, using a variety of image processing techniques such as Image Data Generator, Histogram Equalization, and UnSharpMask. We used FER 2013, which is a dataset containing multiple classes of images. While working on these models, we decided to take things to the next level, and we …


City Of Milpitas Trash Capture Device Program: An Evaluation Of System Performance And Compliance With The Municipal Regional Permit, Joseph Aguilera Jan 2022

Master's Projects

Water pollution negatively impacts the environment and human population. The problem persists despite various mitigation efforts, strategies, and the implementation of regulatory requirements. It is estimated that Californians dispose of approximately 40 million tons of consumer items and waste materials annually (California Department of Resource Recycling and Recovery, 2019). As the population increases, it is expected that negative impacts of trash on the environment will be exacerbated. To address this, municipalities in California apply various methods to reduce trash before it enters ocean waters.

The primary vehicle for urban trash pollutants to reach ocean waters is through storm water conveyance …


Hard Real-Time Linux On A Raspberry Pi For 3D Printing, Alvin Nguyen Jan 2022

Master's Projects

The project presents how a Raspberry Pi running hard real-time enabled Linux can control stepper motors to operate the kinematics of a 3D (three-dimensional) printer. The consistent performance of the Raspberry Pi with the PREEMPT-RT (real-time) patch can satisfy the hard real-time requirements of 3D printing kinematics without introducing dedicated microcontrollers. The Klipper 3D printer firmware enables one of the Raspberry Pi processors to act as the Klipper MCU, the primary controller for the hardware components. This project introduces a software implementation of the control logic for the stepper motors, which utilizes the PCA9685 PWM driver and TB6612 motor drivers …


Gesture Recognition Using Neural Networks, Ashwini Kurady Jan 2022

Master's Projects

Advances in technology have brought many changes to the way humans go about their lives. This has enhanced the significance of Artificial Neural Networks and Computer Vision-based interactions with the world. Gesture Recognition is one of the major focus areas in Computer Vision. It involves Human Computer Interfaces (HCI) that capture and understand human actions. In this project, we explore how Neural Network concepts can be applied in this challenging field of Computer Vision. By leveraging the latest research on Gesture Recognition, we researched how to capture the movement across different frames …


Contextualized Vector Embeddings For Malware Detection, Vinay Pandya Jan 2022

Master's Projects

Malware classification is a technique for classifying different types of malware, and it forms an integral part of system security. The aim of this project is to use context-dependent word embeddings to classify malware. The Transformer is a novel architecture that utilizes self-attention to handle long-range dependencies. Transformers are particularly effective in many complex natural language processing tasks such as Masked Language Modelling (MLM) and Next Sentence Prediction (NSP). Different transformer architectures such as BERT, DistilBert, Albert, and Roberta are used to generate context-dependent word embeddings. These embeddings would help in classifying different malware samples based on their similarity …


Poriferal Vision: Deep Transfer Learning-Based Sponge Spicules Identification & Taxonomic Classification, Sudhin Domala Jan 2022

Master's Projects

The phylum Porifera includes the aquatic organisms known as sponges. Sponges are classified into four classes: Calcarea, Hexactinellida, Demospongiae, and Homoscleromorpha. Within Demospongiae and Hexactinellida, sponges’ skeletons are needle-like spicules made of silica. With a wide variety of shapes and sizes, these siliceous spicules’ morphology plays a pivotal role in assessing and understanding sponges' taxonomic diversity and evolution. In marine ecosystems, when sponges die their bodies disintegrate over time, but their spicules remain in the sediments as fossilized records that bear ample taxonomic information to reconstruct the evolution of sponge communities and sponge phylogeny.

Traditional methods of identifying spicules from …


Predicting Externally Visible Traits From A DNA Sample For Law Enforcement Applications, Niraj Pandkar Jan 2022

Master's Projects

A large majority of crimes such as homicides, sexual assaults and missing person cases are not solved within a reasonable timeframe and become cold cases. The ability to predict visual appearance and ancestry from a DNA sample will provide an unprecedented advancement in such criminal investigations. DNA-based prediction of craniofacial features, phenotypes and ancestry can be used to narrow the pool of candidates for further investigation. To achieve this goal, it is first essential to substantiate, model and measure the intrinsic relationship between the genomic markers and phenotypic features. The first step is to standardize …


Hidden Markov Models With Momentum, Andrew Miller Jan 2022

Master's Projects

Momentum is a popular technique for improving convergence rates during gradient descent. In this research, we experiment with adding momentum to the Baum-Welch expectation-maximization algorithm for training Hidden Markov Models. We compare discrete Hidden Markov Models trained with and without momentum on English text and malware opcode data. The effectiveness of momentum is determined by measuring the changes in model score and classification accuracy due to momentum. Experiments indicate that adding momentum to Baum-Welch can reduce the number of iterations required for initial convergence during HMM training, particularly in cases where the model is slow to converge. However, momentum does …
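
For readers unfamiliar with momentum, the sketch below shows the classical form used in gradient descent, where a velocity term accumulates past gradients. It does not reproduce the project's Baum-Welch variant; the function name, the default step size `lr`, the momentum coefficient `mu`, and the quadratic test objective are all assumptions chosen for illustration.

```python
def minimize_with_momentum(grad, x0, lr=0.1, mu=0.9, steps=200):
    """Gradient descent with classical momentum on a scalar parameter.

    grad is a function returning the gradient at x; mu=0 reduces the
    update to plain gradient descent.
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        velocity = mu * velocity - lr * grad(x)  # blend old velocity with new gradient
        x = x + velocity
    return x

if __name__ == "__main__":
    # Minimize f(x) = x^2, whose gradient is 2x and whose minimum is at 0.
    print(minimize_with_momentum(lambda x: 2 * x, x0=5.0))
    print(minimize_with_momentum(lambda x: 2 * x, x0=5.0, mu=0.0))
```

Whether momentum speeds up convergence depends on the objective and on the choice of `lr` and `mu`, which is consistent with the mixed results the abstract reports.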


Generative Adversarial Networks For Image-Based Malware Classification, Huy Nguyen Jan 2022

Master's Projects

Malware detection and analysis are important topics in cybersecurity. For efficient malware removal, determination of malware threat levels, and damage estimation, malware family classification plays a critical role. With the rise in computing power and the advent of cloud computing, deep learning models for malware analysis have gained in popularity. In this paper, we extract features from malware executable files and represent them as images using various approaches. We then focus on Generative Adversarial Networks (GAN) for multiclass classification and compare our GAN results to other popular machine learning techniques, including Support Vector Machine (SVM), XGBoost, and Restricted Boltzmann Machines …


Darknet Traffic Classification, Nhien Rust-Nguyen Jan 2022

Master's Projects

The anonymous nature of darknets is commonly exploited for illegal activities. Previous research has employed machine learning and deep learning techniques to automate the detection of darknet traffic to block these criminal activities. This research aims to improve darknet traffic detection by assessing Support Vector Machines (SVM), Random Forest (RF), Convolutional Neural Networks (CNN) and Auxiliary-Classifier Generative Adversarial Networks (AC-GAN) for classification of network traffic and the underlying application types. We find that our RF model outperforms the state-of-the-art machine learning techniques used by prior work with the CIC-Darknet2020 dataset. To evaluate the robustness of our RF classifier, we degrade …


Faking Sensor Noise Information, Justin Chang Jan 2022

Master's Projects

Noise residue detection in digital images has recently been used as a method to classify images based on source camera model type. With the meteoric rise in the popularity of Neural Network models, these models have also been used in conjunction with noise residuals to classify source camera models. However, many papers gloss over the details of how noise residuals are obtained and instead rely on the self-learning aspect of deep neural networks to discover them implicitly. For this project I propose a method of obtaining noise residuals (“noiseprints”) and denoising an image, as well as …


Robustness Of Image-Based Malware Analysis, Katrina Tran Jan 2022

Master's Projects

Being able to identify malware is important in preventing attacks. Image-based malware analysis is the study of images that are created from malware. Analyzing these images can help identify patterns in malware families. In previous work, "gist descriptor" features extracted from images have been used in malware classification problems and have shown promising results. In this research, we determine whether gist descriptors are robust with respect to malware obfuscation techniques, as compared to Convolutional Neural Networks (CNN) trained directly on malware images. Using the Python Imaging Library, we create images from malware executables and from malware that we obfuscate. We …