Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Distributed Load Testing By Modeling And Simulating User Behavior, Chester Ira Parrott Dec 2020

LSU Doctoral Dissertations

Modern human-machine systems such as microservices rely upon agile engineering practices, which require changes to be tested and released more frequently than in classically engineered systems. A critical step in the testing of such systems is the generation of realistic workloads, or load testing. Generated workload emulates the expected behaviors of users and machines within a system under test in order to find potentially unknown failure states. Typical testing tools rely on static testing artifacts to generate realistic workload conditions. Such artifacts can be cumbersome and costly to maintain; however, even model-based alternatives can prevent adaptation to changes in a system …


Data: The Good, The Bad And The Ethical, John D. Kelleher, Filipe Cabral Pinto, Luis M. Cortesao Dec 2020

Articles

It is often the case with new technologies that it is very hard to predict their long-term impacts and as a result, although new technology may be beneficial in the short term, it can still cause problems in the longer term. This is what happened with oil by-products in different areas: the use of plastic as a disposable material did not take into account the hundreds of years necessary for its decomposition and its related long-term environmental damage. Data is said to be the new oil. The message to be conveyed is associated with its intrinsic value. But as in …


Representational Learning Approach For Predicting Developer Expertise Using Eye Movements, Sumeet Maan Dec 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The thesis analyzes an existing eye-tracking dataset collected while software developers were solving bug-fixing tasks in an open-source system. The analysis is performed using a representational learning approach, namely a multi-layer perceptron (MLP). The novel aspect of the analysis is the introduction of a new feature engineering method based on the eye-tracking data. This is then used to predict developer expertise on the data. The dataset used in this thesis is inherently more complex because it is collected in a very dynamic environment, i.e., the Eclipse IDE using an eye-tracking plugin, iTrace. Previous work in this area only worked on …


Cross Dataset Evaluation For IoT Network Intrusion Detection, Anjum Farah Dec 2020

Theses and Dissertations

With the advent of Internet of Things (IoT) technology, the need to ensure the security of an IoT network has become important. There are several intrusion detection systems (IDS) that are available for analyzing and predicting network anomalies and threats. However, it is challenging to evaluate them to realistically estimate their performance when deployed. A lot of research has been conducted where the training and testing are done using the same simulated dataset. However, realistically, a network on which an intrusion detection model is deployed will be very different from the network on which it was trained. The aim of …


Approaching Hanabi With Q-Learning And Evolutionary Algorithm, Joseph Palmersten Dec 2020

Culminating Projects in Computer Science and Information Technology

Hanabi is a cooperative card game with hidden information that requires cooperation and communication between the players. For a machine learning agent to be successful at Hanabi, it will have to learn how to communicate and infer information from the communication of other players. To approach the problem of Hanabi, the machine learning methods of Q-learning and an evolutionary algorithm are proposed as potential solutions. The agents that were created using these methods are shown not to achieve human levels of communication.


Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li Dec 2020

Masters Theses

Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization. The popular methods include grid search, random search, manual search, Bayesian optimization, population-based optimization, etc. Random search requires fewer computations than grid search, but at the same time there is a penalty in accuracy. This thesis proposes a more effective random search method based on traditional random search and hyperparameter space separation. The method is named random search plus. This thesis shows empirically that random search plus is more effective than random search. There are some case …
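
The trade-off between grid search and plain random search mentioned in the abstract above can be sketched roughly as follows. This is only an illustration of the two baseline methods, not the thesis's "random search plus" (whose space-separation step is not described in the excerpt). The `objective` function, the hyperparameter names `lr` and `depth`, and their ranges are all hypothetical stand-ins for a real validation score.

```python
import itertools
import random


def objective(lr, depth):
    """Hypothetical validation score (higher is better), peaking at lr=0.1, depth=6."""
    return -100.0 * (lr - 0.1) ** 2 - 0.01 * (depth - 6) ** 2


def grid_search(lrs, depths):
    """Evaluate every combination on a fixed grid and return the best one."""
    return max(itertools.product(lrs, depths), key=lambda p: objective(*p))


def random_search(lr_range, depth_range, n_trials, seed=0):
    """Evaluate n_trials uniformly sampled configurations and return the best one."""
    rng = random.Random(seed)
    trials = [
        (rng.uniform(*lr_range), rng.randint(*depth_range))
        for _ in range(n_trials)
    ]
    return max(trials, key=lambda p: objective(*p))


# The grid costs 3 x 3 = 9 evaluations; the random search budget is fixed at 5.
best_grid = grid_search([0.01, 0.1, 1.0], [2, 6, 10])
best_random = random_search((0.0, 1.0), (1, 12), n_trials=5)
```

The grid's cost grows multiplicatively with each added hyperparameter, while the random search budget stays whatever `n_trials` is set to, at the risk of missing the best region.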


Attentional Parsing Networks, Marcus Karr Dec 2020

Master's Theses

Convolutional neural networks (CNNs) have dominated the computer vision field since the early 2010s, when deep learning largely replaced previous approaches like hand-crafted feature engineering and hierarchical image parsing. Meanwhile, transformer architectures have attained preeminence in natural language processing and have even begun to supplant CNNs as the state of the art for some computer vision tasks.

This study proposes a novel transformer-based architecture, the attentional parsing network, that reconciles the deep learning and hierarchical image parsing approaches to computer vision. We recast unsupervised image representation as a sequence-to-sequence translation problem where image patches are mapped to successive layers …


Defense By Deception Against Stealthy Attacks In Power Grids, Md Hasan Shahriar Nov 2020

FIU Electronic Theses and Dissertations

Cyber-physical Systems (CPSs) and the Internet of Things (IoT) are converging towards a hybrid platform that is becoming ubiquitous in all modern infrastructures. The integration of the complex and heterogeneous systems creates enormous space for the adversaries to get into the network and inject cleverly crafted false data into measurements, misleading the control center to make erroneous decisions. Besides, the attacker can make a critical part of the system unavailable by compromising the sensor data availability. To obfuscate and mislead the attackers, we propose DDAF, a deceptive data acquisition framework for CPSs' hierarchical communication network. Each switch in the hierarchical …


Towards High Performance Stock Market Prediction Methods, Warren M. Landis, Sangwhan Cha Oct 2020

Other Student Works

Stock markets rely, and will continue to rely, on the metrics of timeliness and efficiency to reach optimal profits. One way stock investors have continued to strive for the best of these two factors is through the use of predictive machine learning systems to aid their decision making. However, among the many systems currently in use, it could be said that the myriad of data that they are based on may not be sufficient. In an effort to devise an ensemble learning predictive system that will utilize an array of big …


Using Object Detection Algorithm And Optical Character Recognition To Read Data From Alphanumeric Tags In Text, Ana Bazerque, Davi Moraes, Marcela Souza Oct 2020

ICT

The present document explores the use of machine learning techniques, specifically supervised learning and classification. It applies those techniques to create a solution for a real-world company that provides medical products and services to hospitals. This project will deal with streamlining the calibration of medical weighing scales. The developed application will use object detection and character recognition to identify and classify a digital image of a scale’s tag, and fill in a form with the corresponding data. The main purpose of this application is to avoid human errors and automate the collection of data from the …


Co-Design And Evaluation Of An Intelligent Decision Support System For Stroke Rehabilitation Assessment, Min Hun Lee, Daniel P. Siewiorek, Asim Smailagic, Alexandre Bernardino, Sergi Badia Oct 2020

Research Collection School Of Computing and Information Systems

Clinical decision support systems have the potential to improve the workflows of experts in practice (e.g. therapists' evidence-based rehabilitation assessment). However, the adoption of these systems is challenging, and the gains of these systems have not yet been fully demonstrated. In this paper, we identified the needs of therapists in assessing patients' functional abilities (e.g. alternative perspectives with quantitative information on patients' exercise motions). As a result, we co-designed and developed an intelligent decision support system that automatically identifies salient features of assessment using reinforcement learning to assess the quality of motion and generate patient-specific analysis. We evaluated this system with …


Forecasting Vegetation Health In The MENA Region By Predicting Vegetation Indicators With Machine Learning Models, Sachi Perera, Wenzhao Li, Erik Linstead, Hesham El-Askary Sep 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Machine learning (ML) techniques can be applied to predict and monitor drought conditions due to climate change. Predicting future vegetation health indicators (such as EVI, NDVI, and LAI) is one approach to forecasting drought events for hotspots (e.g. the Middle East and North Africa (MENA) region). Recently, ML models were implemented to predict EVI values using parameters such as land types, time series, historical vegetation indices, land surface temperature, soil moisture, evapotranspiration, etc. In this work, we collected the MODIS atmospherically corrected surface spectral reflectance imagery with multiple vegetation-related indices for modeling and evaluation of drought conditions in the MENA …


Cover Song Identification - A Novel Stem-Based Approach To Improve Song-To-Song Similarity Measurements, Lavonnia Newman, Dhyan Shah, Chandler Vaughn, Faizan Javed Sep 2020

SMU Data Science Review

Music is incorporated into our daily lives, whether intentionally or unintentionally. It evokes responses and behavior, so much so that there is an entire field of study dedicated to the psychology of music. Music creates the mood for dancing, exercising, creative thought or even relaxation. It is a powerful tool that can be used in various venues and through advertisements to influence and guide human reactions. Music is also often "borrowed" in the industry today. The practices of sampling and remixing music in the digital age have made cover song identification an active area of research. While most of this research is focused …


Tag: Automated Image Captioning, Nathan Funckes Sep 2020

McNair Scholars Manuscripts

Many websites remain non-ADA compliant, containing images which lack accompanying textual descriptions. This leaves sight-impaired individuals unable to fully enjoy the rich wonders of the web. To address this inequity, our research aims to create an autonomous system capable of generating semantically accurate descriptions of images. This problem involves two tasks: recognizing an image and linguistically describing it. Our solution uses state-of-the-art deep learning: employing a convolutional neural network that "learns" to understand images and extracts their salient features, and a recurrent neural network that learns to generate structured, coherent sentences. These two networks are merged to create a single …


Machine Learning Applications For Drug Repurposing, Hansaim Lim Sep 2020

Dissertations, Theses, and Capstone Projects

The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has had limited success under the conventional reductionist one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under this simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small-molecule drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …


Creating A Culture Of Data-Driven Decision-Making, Kevin Bryan Rogers Sep 2020

Doctoral Dissertations and Projects

Researchers have consistently shown that a supportive culture is one of the most crucial success factors in the implementation of any big data solution. Creating a culture that supports data-driven decision-making is a difficult but ultimately required step in transforming an organization into one that can readily and successfully adopt business intelligence technologies. The purpose of this qualitative case study was to understand the ways in which organizations can foster a culture of smarter decision-making and accountability so that businesses can improve operational metrics and ultimately profitability. Participants identified three major themes that drive the adoption of a data-driven culture. …


Evaluation Of Standard And Semantically-Augmented Distance Metrics For Neurology Patients, Daniel B. Hier, Jonathan Kopel, Steven U. Brint, Donald C. Wunsch, Gayla R. Olbricht, Sima Azizi, Blaine Allen Aug 2020

Electrical and Computer Engineering Faculty Research & Creative Works

Background: Patient distances can be calculated based on signs and symptoms derived from an ontological hierarchy. There is controversy as to whether patient distance metrics that consider the semantic similarity between concepts can outperform standard patient distance metrics that are agnostic to concept similarity. The choice of distance metric can dominate the performance of classification or clustering algorithms. Our objective was to determine if semantically augmented distance metrics would outperform standard metrics on machine learning tasks.

Methods: We converted the neurological findings from 382 published neurology cases into sets of concepts with corresponding machine-readable codes. We calculated patient distances by …


Routing Optimization In Heterogeneous Wireless Networks For Space And Mission-Driven Internet Of Things (IoT) Environments, Sara El Alaoui Aug 2020

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

As technological advances have made it possible to build cheap devices that have more processing power and storage and are capable of continuously generating large amounts of data, the network has to undergo significant changes as well. The rising number of vendors and the variety of platforms and wireless communication technologies have introduced heterogeneity to networks, compromising the efficiency of existing routing algorithms. Furthermore, most of the existing solutions assume and require connection to the backbone network and involve changes to the infrastructures, which are not always possible: a 2018 report by the Federal Communications Commission shows that over 31% …


Machine-Learning-Based Prediction Of Sepsis Events From Vertical Clinical Trial Data: A Naïve Approach, Tyler Michael Gaddis Aug 2020

Theses and Dissertations

Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection by which the afflicted body attacks its own tissues, sometimes to the point of organ failure, and in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the currently accepted diagnostic framework of the disease.

This study attempted to first establish an understanding of past definitions of sepsis, and to then recommend use of machine learning as integral in an …


Dictionary-Based Data Generation For Fine-Tuning BERT For Adverbial Paraphrasing Tasks, Mark Anthony Carthon Aug 2020

Theses and Dissertations

Recent advances in natural language processing technology have led to the emergence of large and deep pre-trained neural networks. The use and focus of these networks are on transfer learning. More specifically, retraining or fine-tuning such pre-trained networks to achieve state of the art performance in a variety of challenging natural language processing/understanding (NLP/NLU) tasks. In this thesis, we focus on identifying paraphrases at the sentence level using the network Bidirectional Encoder Representations from Transformers (BERT). It is well understood that in deep learning the volume and quality of training data is a determining factor of performance. The objective of …


A Study Of Information Bots And Knowledge Bots, Amartya Hatua Aug 2020

Dissertations

This dissertation studies different aspects of information bots and knowledge bots. The research contributes to a better understanding of the various characteristics of information bots as well as the different patterns and factors responsible for information diffusion in a social network. This research also shows how these factors can be used to predict information diffusion for a particular topic in a social network. The second part of the research is focused on strategies for improving the knowledge base of knowledge bots, where two different approaches are studied. In the first approach, knowledge is transferred …


An Investigation Into Multi-View Error Correcting Output Code Classifiers Applied To Organ Tissue Classification, Daniel Alvarez Aug 2020

UNLV Theses, Dissertations, Professional Papers, and Capstones

Large amounts of data are generated every day, so much that it is difficult for both humans and machines to find patterns in order to predict outcomes and make decisions. It would be useful if this data could be simplified using machine learning techniques. For example, biological cell identity is dependent on many factors tied to genetic processes. Such factors include proteins, gene transcription, and gene methylation. Each of these factors is a highly complex mechanism with immense amounts of data. Simplifying these can then be helpful in finding patterns in them. Error-Correcting Output Codes (ECOC) does …


Optimized Machine Learning Models Towards Intelligent Systems, Mohammadnoor Ahmad Mohammad Injadat Jul 2020

Electronic Thesis and Dissertation Repository

The rapid growth of the Internet and related technologies has led to the collection of large amounts of data by individuals, organizations, and society in general [1]. However, this often leads to information overload which occurs when the amount of input (e.g. data) a human is trying to process exceeds their cognitive capacities [2]. Machine learning (ML) has been proposed as one potential methodology capable of extracting useful information from large sets of data [1]. This thesis focuses on two applications. The first is education, namely e-Learning environments. Within this field, this thesis proposes different optimized ML ensemble models to …


Automated Anomaly Detection And Localization System For A Microservices Based Cloud System, Priyanka Prakash Naikade Jul 2020

Electronic Thesis and Dissertation Repository

Context: With an increasing number of applications running on a microservices-based cloud system (such as AWS, GCP, IBM Cloud), it is challenging for the cloud providers to offer uninterrupted services with guaranteed Quality of Service (QoS) factors. Problem Statement: Existing monitoring frameworks often do not detect critical defects among the large volume of issues generated, thus affecting recovery response times and the use of human maintenance resources. Also, manually tracing the root causes of the issues requires a significant amount of time. Objective: The objective of this work is to: (i) detect performance anomalies, in real-time, through monitoring KPIs (Key Performance …


Visual Analytics Of Electronic Health Records With A Focus On Acute Kidney Injury, Sheikh S. Abdullah Jul 2020

Electronic Thesis and Dissertation Repository

The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports …


Deep Learning Predictive Modeling With Data Challenges (Small, Big, Or Imbalanced), Renhao Liu Jul 2020

USF Tampa Graduate Theses and Dissertations

In the real world, data used to build machine learning models varies in size and characteristics. These properties, including small datasets, big datasets, and imbalanced datasets, often lead to different challenges when training machine learning models. Models trained on a small number of observations tend to overfit the training data and produce inaccurate results. When it comes to big data, efficiently learning from very large datasets in a short time becomes important. With an imbalanced dataset, learning is usually biased towards the majority class in the data and appropriate measurements are needed to check model performance.

As …
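
The abstract's remark that imbalanced data needs appropriate measurements can be illustrated with a minimal sketch. It assumes a hypothetical dataset of 95 negative and 5 positive examples and a degenerate model that always predicts the majority class; balanced accuracy (the mean of per-class recall) is used here as one common choice of metric, not necessarily the one used in the thesis.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

plain = accuracy(y_true, y_pred)              # 0.95, looks strong
balanced = balanced_accuracy(y_true, y_pred)  # 0.5, no better than chance
```

Plain accuracy rewards the model for ignoring the minority class entirely, while the per-class average exposes that it never detects a positive example.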


Anta: Accelerated Network Traffic Analytics., Matthew Grohotolski, Connor Dileo Jul 2020

Summer Scholarship, Creative Arts and Research Projects (SCARP)

Implementing traditional machine learning models and neural networks to detect malicious network traffic has become trivial and has sparked interest among many researchers investigating this field. Standard implementations include using the baseline models in packages such as sklearn, tensorflow, and keras. In this paper we seek to advance the field of network detection and produce results which will have great benefits in terms of speed and performance of these models. We take advantage of Intel’s DAAL and OpenVINO packages as they are the two best performance-enhancing methods which are publicly available today. Furthermore, comparisons will be made to determine …


Algorithmic Robot Design: Label Maps, Procrustean Graphs, And The Boundary Of Non-Destructiveness, Shervin Ghasemlou Jul 2020

Theses and Dissertations

This dissertation is focused on the problem of algorithmic robot design. The process of designing a robot or a team of robots that can reliably accomplish a task in an environment requires several key elements. How the problem is formulated can play a big role in the design process. The ability of the model to correctly reflect the environment, the events, and different pieces of the problem is crucial. Another key element is the ability of the model to show the relationship between different designs of a single system. These two elements can enable design algorithms to navigate through the …


What Was Written Vs. Who Read It: News Media Profiling Using Text Analysis And Social Media Context, Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov Jul 2020

Research Collection School Of Computing and Information Systems

Predicting the political bias and the factuality of reporting of entire news outlets are critical elements of media profiling, which is an understudied but increasingly important research direction. The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim, either manually or automatically. Thus, it has been proposed to profile entire news outlets and to look for those that are likely to publish fake or biased content. This makes it possible to detect likely “fake news” the moment it is published, by simply checking the reliability of their …


Machine Learning For The Internet Of Things: Applications, Implementation, And Security, Vishalini Laguduva Ramnath Jul 2020

USF Tampa Graduate Theses and Dissertations

Artificial intelligence and ubiquitous sensor systems have seen tremendous advances in recent times, resulting in groundbreaking impact across domains such as healthcare, entertainment, and transportation through a collective ecosystem called the Internet of Things. The advent of 5G and improved wireless networks will further accelerate the research and development of tools in deep learning, sensor systems, and computing platforms by providing improved network latency and bandwidth. While tremendous progress has been made in the Internet of Things, current work has largely focused on building robust applications that leverage the data collected through ubiquitous sensor nodes to provide actionable rules and …