Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

PDF

Theses/Dissertations

2020

Machine Learning

Articles 1 - 30 of 48

Full-Text Articles in Computer Sciences

Distributed Load Testing By Modeling And Simulating User Behavior, Chester Ira Parrott Dec 2020

LSU Doctoral Dissertations

Modern human-machine systems such as microservices rely upon agile engineering practices which require changes to be tested and released more frequently than classically engineered systems. A critical step in the testing of such systems is the generation of realistic workloads or load testing. Generated workload emulates the expected behaviors of users and machines within a system under test in order to find potentially unknown failure states. Typical testing tools rely on static testing artifacts to generate realistic workload conditions. Such artifacts can be cumbersome and costly to maintain; however, even model-based alternatives can prevent adaptation to changes in a system …


Cross Dataset Evaluation For Iot Network Intrusion Detection, Anjum Farah Dec 2020

Theses and Dissertations

With the advent of Internet of Things (IoT) technology, the need to ensure the security of an IoT network has become important. There are several intrusion detection systems (IDS) that are available for analyzing and predicting network anomalies and threats. However, it is challenging to evaluate them to realistically estimate their performance when deployed. A lot of research has been conducted where the training and testing are done using the same simulated dataset. However, realistically, a network on which an intrusion detection model is deployed will be very different from the network on which it was trained. The aim of …


Approaching Hanabi With Q-Learning And Evolutionary Algorithm, Joseph Palmersten Dec 2020

Culminating Projects in Computer Science and Information Technology

Hanabi is a cooperative card game with hidden information that requires cooperation and communication between the players. For a machine learning agent to be successful at Hanabi, it will have to learn how to communicate and infer information from the communication of other players. To approach the problem of Hanabi, the machine learning methods of Q-learning and evolutionary algorithms are proposed as potential solutions. The agents created using these methods are shown not to achieve human levels of communication.


Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li Dec 2020

Masters Theses

Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization; popular ones include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires less computation than grid search, but at a cost in accuracy. This thesis proposes a more effective random search method, named random search plus, based on traditional random search and hyperparameter space separation. It shows empirically that random search plus is more effective than random search. There are some case …
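
The trade-off between grid search and random search mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's random search plus method: the scoring function `toy_score`, the hyperparameter names `lr` and `depth`, and all values are hypothetical stand-ins for a real training and validation loop.

```python
import itertools
import random


def toy_score(lr, depth):
    """Hypothetical validation score (higher is better); peaks at lr=0.1, depth=6."""
    return -((lr - 0.1) ** 2) * 100 - ((depth - 6) ** 2) * 0.01


def grid_search(lrs, depths):
    """Evaluate every combination of the listed values."""
    candidates = list(itertools.product(lrs, depths))
    best = max(candidates, key=lambda p: toy_score(*p))
    return best, len(candidates)


def random_search(lr_range, depth_range, n_trials, seed=0):
    """Evaluate n_trials points sampled uniformly from the given ranges."""
    rng = random.Random(seed)
    candidates = [
        (rng.uniform(*lr_range), rng.randint(*depth_range))
        for _ in range(n_trials)
    ]
    best = max(candidates, key=lambda p: toy_score(*p))
    return best, len(candidates)


# Grid search costs len(lrs) * len(depths) evaluations; random search costs n_trials.
grid_best, grid_evals = grid_search([0.001, 0.01, 0.1, 1.0], [2, 4, 6, 8, 10])
rand_best, rand_evals = random_search((0.001, 1.0), (2, 10), n_trials=8)
print(grid_best, grid_evals)
print(rand_best, rand_evals)
```

In this sketch the grid evaluates 20 configurations and random search only 8, so random search is cheaper but is not guaranteed to land on the best grid point.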


Attentional Parsing Networks, Marcus Karr Dec 2020

Master's Theses

Convolutional neural networks (CNNs) have dominated the computer vision field since the early 2010s, when deep learning largely replaced previous approaches like hand-crafted feature engineering and hierarchical image parsing. Meanwhile transformer architectures have attained preeminence in natural language processing, and have even begun to supplant CNNs as the state of the art for some computer vision tasks.

This study proposes a novel transformer-based architecture, the attentional parsing network, that reconciles the deep learning and hierarchical image parsing approaches to computer vision. We recast unsupervised image representation as a sequence-to-sequence translation problem where image patches are mapped to successive layers …


Using Object Detection Algorithm And Optical Character Recognition To Read Data From Alphanumeric Tags In Text, Ana Bazerque, Davi Moraes, Marcela Souza Oct 2020

ICT

The present document explores the use of machine learning techniques, specifically supervised learning and classification. It applies those techniques to create a solution for a real-world company that provides medical products and services to hospitals. This project will deal with streamlining the calibration of medical weighing scales. The developed application will use object detection and character recognition to identify and classify a digital image of a scale’s tag, and fill in a form with the corresponding data. The main purpose of this application is to avoid human errors and automate the collection of data from the …


Machine Learning Applications For Drug Repurposing, Hansaim Lim Sep 2020

Dissertations, Theses, and Capstone Projects

The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has been of limited success under the conventional reductionist model of one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under the simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small molecular drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …


Machine-Learning-Based Prediction Of Sepsis Events From Vertical Clinical Trial Data: A Naïve Approach, Tyler Michael Gaddis Aug 2020

Theses and Dissertations

Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection by which the afflicted body attacks its own tissues, sometimes to the point of organ failure, and in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the current accepted diagnostic framework of the disease.

This study attempted to first establish an understanding of past definitions of sepsis, and to then recommend use of machine learning as integral in an …


Dictionary-Based Data Generation For Fine-Tuning Bert For Adverbial Paraphrasing Tasks, Mark Anthony Carthon Aug 2020

Theses and Dissertations

Recent advances in natural language processing technology have led to the emergence of large and deep pre-trained neural networks. The use and focus of these networks are on transfer learning. More specifically, retraining or fine-tuning such pre-trained networks to achieve state of the art performance in a variety of challenging natural language processing/understanding (NLP/NLU) tasks. In this thesis, we focus on identifying paraphrases at the sentence level using the network Bidirectional Encoder Representations from Transformers (BERT). It is well understood that in deep learning the volume and quality of training data is a determining factor of performance. The objective of …


A Study Of Information Bots And Knowledge Bots, Amartya Hatua Aug 2020

Dissertations

In this dissertation, a study of different aspects of information bots and knowledge bots is done. The research contributes to a better understanding of the various characteristics of information bots as well as the different patterns and factors responsible for the information diffusion in a social network. This research also shows how these factors can be used to predict information diffusion for a particular topic in a social network. The second part of the research is focused on strategies for improving the knowledge base of knowledge bots, where two different approaches are studied. In the first approach, knowledge is transferred …


An Investigation Into Multi-View Error Correcting Output Code Classifiers Applied To Organ Tissue Classification, Daniel Alvarez Aug 2020

UNLV Theses, Dissertations, Professional Papers, and Capstones

Large amounts of data are generated every day, so much that it is difficult for humans and machines alike to find patterns in order to predict outcomes and make decisions. It would be useful if this data could be simplified using machine learning techniques. For example, biological cell identity is dependent on many factors tied to genetic processes. Such factors include proteins, gene transcription, and gene methylation. Each of these factors is a highly complex mechanism with immense amounts of data. Simplifying these can then be helpful in finding patterns in them. Error-Correcting Output Codes (ECOC) does …


Optimized Machine Learning Models Towards Intelligent Systems, Mohammadnoor Ahmad Mohammad Injadat Jul 2020

Electronic Thesis and Dissertation Repository

The rapid growth of the Internet and related technologies has led to the collection of large amounts of data by individuals, organizations, and society in general [1]. However, this often leads to information overload which occurs when the amount of input (e.g. data) a human is trying to process exceeds their cognitive capacities [2]. Machine learning (ML) has been proposed as one potential methodology capable of extracting useful information from large sets of data [1]. This thesis focuses on two applications. The first is education, namely e-Learning environments. Within this field, this thesis proposes different optimized ML ensemble models to …


Automated Anomaly Detection And Localization System For A Microservices Based Cloud System, Priyanka Prakash Naikade Jul 2020

Electronic Thesis and Dissertation Repository

Context: With an increasing number of applications running on a microservices-based cloud system (such as AWS, GCP, IBM Cloud), it is challenging for the cloud providers to offer uninterrupted services with guaranteed Quality of Service (QoS) factors. Problem Statement: Existing monitoring frameworks often do not detect critical defects among a large volume of issues generated, thus affecting recovery response times and usage of maintenance human resource. Also, manually tracing the root causes of the issues requires a significant amount of time. Objective: The objective of this work is to: (i) detect performance anomalies, in real-time, through monitoring KPIs (Key Performance …


Visual Analytics Of Electronic Health Records With A Focus On Acute Kidney Injury, Sheikh S. Abdullah Jul 2020

Electronic Thesis and Dissertation Repository

The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports …


Deep Learning Predictive Modeling With Data Challenges (Small, Big, Or Imbalanced), Renhao Liu Jul 2020

USF Tampa Graduate Theses and Dissertations

In the real world, the data used to build machine learning models varies in size and characteristics. These differences, including small datasets, big datasets, and imbalanced datasets, often lead to different challenges when training machine learning models. Models trained on a small number of observations tend to overfit the training data and produce inaccurate results. When it comes to big data, efficiently learning from "huge" size data in a short time becomes important. With an imbalanced dataset, learning is usually biased towards the majority class in the data and appropriate measurements are needed to check model performance.

As …
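
The point about imbalanced datasets needing appropriate measurements can be shown with a small worked example. This is a sketch on made-up labels, not data or code from the thesis: a hypothetical classifier that always predicts the majority class scores high on plain accuracy but only chance level on balanced accuracy (the mean of per-class recall).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for(y_true, y_pred, label):
    """Fraction of samples with the given true label that were predicted correctly."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    labels = sorted(set(y_true))
    return sum(recall_for(y_true, y_pred, c) for c in labels) / len(labels)


# Made-up imbalanced labels: 95 majority-class samples, 5 minority-class samples.
y_true = [0] * 95 + [1] * 5
# Hypothetical model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(acc, bal)
```

Here plain accuracy is 0.95 while balanced accuracy is 0.5, because the minority class is never detected.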


Algorithmic Robot Design: Label Maps, Procrustean Graphs, And The Boundary Of Non-Destructiveness, Shervin Ghasemlou Jul 2020

Theses and Dissertations

This dissertation is focused on the problem of algorithmic robot design. The process of designing a robot or a team of robots that can reliably accomplish a task in an environment requires several key elements. How the problem is formulated can play a big role in the design process. The ability of the model to correctly reflect the environment, the events, and different pieces of the problem is crucial. Another key element is the ability of the model to show the relationship between different designs of a single system. These two elements can enable design algorithms to navigate through the …


Machine Learning For The Internet Of Things: Applications, Implementation, And Security, Vishalini Laguduva Ramnath Jul 2020

USF Tampa Graduate Theses and Dissertations

Artificial intelligence and ubiquitous sensor systems have seen tremendous advances in recent times, resulting in groundbreaking impact across domains such as healthcare, entertainment, and transportation through a collective ecosystem called the Internet of Things. The advent of 5G and improved wireless networks will further accelerate the research and development of tools in deep learning, sensor systems, and computing platforms by providing improved network latency and bandwidth. While tremendous progress has been made in the Internet of Things, current work has largely focused on building robust applications that leverage the data collected through ubiquitous sensor nodes to provide actionable rules and …


A Hybrid Approach To Procedural Dungeon Generation, Mathias Paul Babin Jun 2020

Electronic Thesis and Dissertation Repository

This thesis presents a novel approach to the Procedural Content Generation (PCG) of both maze and dungeon environments. The solution we propose in this thesis borrows techniques from both Procedural Content Generation via Machine Learning as well as Constructive PCG methods. The approach we take involves decomposing the problem of level generation into a series of stages which begins with the production of macro-level functional structures and ends with micro-level aesthetic details; specifically, we train a Deep Convolutional Neural Network to produce high-quality mazes, which in turn, are transformed into the rooms of larger dungeon levels using a constructive algorithm. …


Machine Learning With Digital Signal Processing For Rapid And Accurate Alignment-Free Genome Analysis: From Methodological Design To A Covid-19 Case Study, Gurjit Singh Randhawa Jun 2020

Electronic Thesis and Dissertation Repository

In the field of bioinformatics, taxonomic classification is the scientific practice of identifying, naming, and grouping of organisms based on their similarities and differences. The problem of taxonomic classification is of immense importance considering that nearly 86% of existing species on Earth and 91% of marine species remain unclassified. Due to the magnitude of the datasets, the need exists for an approach and software tool that is scalable enough to handle large datasets and can be used for rapid sequence comparison and analysis. We propose ML-DSP, a stand-alone alignment-free software tool that uses Machine Learning and Digital Signal Processing to …


Evidence-Based Detection Of Pancreatic Canc, Rajeshwari Deepak Chandratre May 2020

Master's Projects

This study is an effort to develop a tool for early detection of pancreatic cancer using evidential reasoning. An evidential reasoning model predicts the likelihood of an individual developing pancreatic cancer by processing the outputs of a Support Vector Classifier (SVC) and other input factors such as smoking history, drinking history, sequencing reads, biopsy location, and family and personal health history. Certain features of the genomic data, along with the mutated gene sequences of pancreatic cancer patients, were obtained from the National Cancer Institute (NCI) Genomic Data Commons (GDC). This data was used to train the SVC. A prediction accuracy of ~85% …


Computational Astronomy: Classification Of Celestial Spectra Using Machine Learning Techniques, Gayatri Milind Hungund May 2020

Master's Projects

Light-years beyond planet Earth there exist many unknown and unexplored stars and galaxies that need to be studied in order to support the Big Bang theory and to make important astronomical discoveries in the quest to know the unknown. Sophisticated devices and high-power computational resources are now deployed for data gathering and analysis. These devices produce massive amounts of data from astronomical surveys, usually in the range of terabytes or petabytes. It is exhausting to process this data and determine the findings in a short period of time. Many details can be missed …


Network Traffic Based Botnet Detection Using Machine Learning, Anand Ravindra Vishwakarma May 2020

Master's Projects

The field of information and computer security is rapidly developing in today’s world as new security risks are discovered every day. The moment new software or a new product is launched in the market, a new exploit or vulnerability is exposed and exploited by attackers or malicious users for different motives. Many attacks are distributed in nature and carried out by botnets that cause widespread disruption of network activity by carrying out DDoS (Distributed Denial of Service) attacks, email spamming, click fraud, information and identity theft, virtual deceit and distributed resource usage for cryptocurrency mining. …


Integrated Machine Learning And Bioinformatics Approaches For Prediction Of Cancer-Driving Gene Mutations, Oluyemi Odeyemi May 2020

Computational and Data Sciences (PhD) Dissertations

Cancer arises from the accumulation of somatic mutations and genetic alterations in cell division checkpoints and apoptosis; this often leads to abnormal tumor proliferation. Proper classification of cancer-linked driver mutations will considerably help our understanding of the molecular dynamics of cancer. In this study, we compared several cancer-specific predictive models for prediction of driver mutations in cancer-linked genes that were validated on canonical data sets of functionally validated mutations and applied to raw cancer genomics data. By analyzing pathogenicity prediction and conservation scores, we have shown that evolutionary conservation scores play a pivotal role in the classification of cancer …


Achieving Causal Fairness In Machine Learning, Yongkai Wu May 2020

Graduate Theses and Dissertations

Fairness is a social norm and a legal requirement in today's society. Many laws and regulations (e.g., the Equal Credit Opportunity Act of 1974) have been established to prohibit discrimination and enforce fairness on several grounds, such as gender, age, sexual orientation, race, and religion, referred to as sensitive attributes. Nowadays machine learning algorithms are extensively applied to make important decisions in many real-world applications, e.g., employment, admission, and loans. Traditional machine learning algorithms aim to maximize predictive performance, e.g., accuracy. Consequently, certain groups may get unfairly treated when those algorithms are applied for decision-making. Therefore, it is an imperative …


Knot Flow Classification And Its Applications In Vehicular Ad-Hoc Networks (Vanet), David Schmidt May 2020

Electronic Theses and Dissertations

Intrusion detection systems (IDSs) play a crucial role in the identification and mitigation of attacks on host systems. Of these systems, vehicular ad hoc networks (VANETs) are difficult to protect due to the dynamic nature of their clients and their necessity for constant interaction with their respective cyber-physical systems. Currently, there is a need for a VANET-specific IDS that meets this criterion. To this end, a spline-based intrusion detection system has been pioneered as a solution. By combining clustering with spline-based general linear model classification, this knot flow classification method (KFC) allows for robust intrusion detection to occur. Due to its …


Dynamic Fraud Detection Via Sequential Modeling, Panpan Zheng May 2020

Graduate Theses and Dissertations

The impacts of information revolution are omnipresent from life to work. The web services have significantly changed our living styles in daily life, such as Facebook for communication and Wikipedia for knowledge acquirement. Besides, varieties of information systems, such as data management system and management information system, make us work more efficiently. However, it is usually a double-edged sword. With the popularity of web services, relevant security issues are arising, such as fake news on Facebook and vandalism on Wikipedia, which definitely impose severe security threats to OSNs and their legitimate participants. Likewise, office automation incurs another challenging security issue, …


Early Warning Solar Storm Prediction, Ian D. Lumsden, Marvin Joshi, Matthew Smalley, Aiden Rutter, Ben Klein May 2020

Chancellor’s Honors Program Projects

No abstract provided.


Finding Critical And Gradient-Flat Points Of Deep Neural Network Loss Functions, Charles Gearhart Frye '09 Apr 2020

Doctoral Dissertations

Despite the fact that the loss functions of deep neural networks are highly non-convex, gradient-based optimization algorithms converge to approximately the same performance from many random initial points. This makes neural networks easy to train, which, combined with their high representational capacity and implicit and explicit regularization strategies, leads to machine-learned algorithms of high quality with reasonable computational cost in a wide variety of domains.

One thread of work has focused on explaining this phenomenon by numerically characterizing the local curvature at critical points of the loss function, where gradients are zero. Such studies have reported that the loss functions …


Dynamic Composition Of Functions For Modular Learning, Clemens Gb Rosenbaum Mar 2020

Doctoral Dissertations

Compositionality is useful to reduce the complexity of machine learning models and increase their generalization capabilities, because new problems can be linked to the composition of existing solutions. Recent work has shown that compositional approaches can offer substantial benefits over a wide variety of tasks, from multi-task learning over visual question-answering to natural language inference, among others. A key variant is functional compositionality, where a meta-learner composes different (trainable) functions into complex machine learning models. In this thesis, I generalize existing approaches to functional compositionality under the umbrella of the routing paradigm, where trainable arbitrary functions are 'stacked' to form …


Robust Neural Machine Translation, Abdul Rafae Khan Feb 2020

Dissertations, Theses, and Capstone Projects

This thesis aims for general robust Neural Machine Translation (NMT) that is agnostic to the test domain. NMT has achieved high quality on benchmarks with closed datasets such as WMT and NIST but can fail when the translation input contains noise due to, for example, mismatched domains or spelling errors. The standard solution is to apply domain adaptation or data augmentation to build a domain-dependent system. However, in real life, the input noise varies in a wide range of domains and types, which is unknown in the training phase. This thesis introduces five general approaches to improve NMT accuracy and …