Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2020

Machine learning

Articles 1 - 30 of 233

Full-Text Articles in Physical Sciences and Mathematics

Countering Internet Packet Classifiers To Improve User Online Privacy, Sina Fathi-Kazerooni Dec 2020

Dissertations

Internet traffic classification, or packet classification, is the act of classifying packets using statistical data extracted from the packets transmitted on a computer network. Internet traffic classification is an essential tool for Internet service providers to manage network traffic, provide users with the intended quality of service (QoS), and perform surveillance. QoS measures prioritize one type of network traffic over others based on preset criteria; for instance, they give higher priority or bandwidth to video traffic than to website browsing traffic. Internet packet classification methods are also used for automated intrusion detection. They analyze incoming traffic patterns and identify malicious …


Sensitivity Analysis Of An Agent-Based Simulation Model Using Reconstructability Analysis, Andey M. Nunes, Martin Zwick, Wayne Wakeland Dec 2020

Systems Science Faculty Publications and Presentations

Reconstructability analysis, a methodology based on information theory and graph theory, was used to perform a sensitivity analysis of an agent-based model. The NetLogo BehaviorSpace tool was employed to do a full 2^k factorial parameter sweep on Uri Wilensky’s Wealth Distribution NetLogo model, to which a Gini-coefficient convergence condition was added. The analysis identified the most influential predictors (parameters and their interactions) of the Gini coefficient wealth inequality outcome. Implications of this type of analysis for building and testing agent-based simulation models are discussed.
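As a rough illustration of the two ingredients named in this abstract, the sketch below enumerates a full two-level factorial design and computes a Gini coefficient from a list of wealth values. It is not the authors' code and does not use NetLogo or BehaviorSpace; the parameter names and levels are hypothetical placeholders, and the Gini function assumes non-negative values with a positive mean.

```python
from itertools import product


def full_factorial(levels):
    """Yield every combination of the two levels given for each of k parameters (2^k runs)."""
    names = list(levels)
    for combo in product(*(levels[name] for name in names)):
        yield dict(zip(names, combo))


def gini(values):
    """Gini coefficient as the sum of absolute pairwise differences over 2 * n^2 * mean."""
    n = len(values)
    mean = sum(values) / n  # assumed to be positive
    total_diff = sum(abs(a - b) for a in values for b in values)
    return total_diff / (2 * n * n * mean)


# Hypothetical parameters, each with a low and a high level (k = 3, so 8 runs).
levels = {
    "num_people": (100, 500),
    "max_vision": (1, 5),
    "percent_best_land": (5, 25),
}
runs = list(full_factorial(levels))
print(len(runs))            # 8
print(gini([1, 1, 1, 1]))   # 0.0 (perfect equality)
print(gini([0, 0, 0, 4]))   # 0.75 (one agent holds everything)
```

In a real sweep, each dictionary in `runs` would configure one simulation, and the Gini coefficient of the resulting wealth distribution would be the outcome analysed.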


Leveraging The Inductive Bias Of Large Language Models For Abstract Textual Reasoning, Christopher Michael Rytting Dec 2020

Theses and Dissertations

Large natural language models (such as GPT-2 or T5) demonstrate impressive abilities across a range of general NLP tasks. Here, we show that the knowledge embedded in such models provides a useful inductive bias, not just on traditional NLP tasks, but also in the nontraditional task of training a symbolic reasoning engine. We observe that these engines learn quickly and generalize in a natural way that reflects human intuition. For example, training such a system to model block-stacking might naturally generalize to stacking other types of objects because of structure in the real world that has been partially captured by …


Improving A Wireless Localization System Via Machine Learning Techniques And Security Protocols, Zachary Yorio Dec 2020

Masters Theses, 2020-current

The recent advancements made in Internet of Things (IoT) devices have brought forth new opportunities for technologies and systems to be integrated into our everyday life. In this work, we investigate how edge nodes can effectively utilize 802.11 wireless beacon frames being broadcast from pre-existing access points in a building to achieve room-level localization. We explain the needed hardware and software for this system and demonstrate a proof of concept with experimental data analysis. Improvements to localization accuracy are shown via machine learning by implementing the random forest algorithm. Using this algorithm, historical data can train the model and make …


Reasoning About User Feedback Under Identity Uncertainty In Knowledge Base Construction, Ariel Kobren Dec 2020

Doctoral Dissertations

Intelligent, automated systems that are intertwined with everyday life (such as Google Search and virtual assistants like Amazon’s Alexa or Apple’s Siri) are often powered in part by knowledge bases (KBs), i.e., structured data repositories of entities, their attributes, and the relationships among them. Despite a wealth of research focused on automated KB construction methods, KBs are inevitably imperfect, with errors stemming from various points in the construction pipeline. Making matters more challenging, new data is created daily and must be integrated with existing KBs so that they remain up-to-date. As the primary consumers of KBs, human users have tremendous potential to …


An Assessment Of The Hydrological Trends Using Synergistic Approaches Of Remote Sensing And Model Evaluations Over Global Arid And Semi-Arid Regions, Wenzhao Li, Hesham El-Askary, Rejoice Thomas, Surya Prakash Tiwari, Karuppasamy Manikandan, Thomas Piechota, Daniele Struppa Dec 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Drylands cover about 40% of the world’s land area and support two billion people, most of them living in developing countries that are at risk due to land degradation. Over the last few decades, there has been warming, with an escalation of drought and rapid population growth. This will further intensify the risk of desertification, which will seriously affect the local ecological environment, food security and people’s lives. The goal of this research is to analyze the hydrological and land cover characteristics and variability over global arid and semi-arid regions over the last decade (2010–2019) using an integrative approach of …


A Targeted Adversarial Attack On Support Vector Machine Using The Boundary Line, Yessenia Rodriguez Dec 2020

Theses and Dissertations

In this thesis, a targeted adversarial attack is explored on a Support Vector Machine (SVM). SVM is defined by creating a separating boundary between two classes. Using a target class, any input can be modified to cross the “boundary line,” making the model predict the target class. To limit the modification, a percentage of an image of the target class is used to get several random sections. Using these sections, the input will be moved in small steps closer to the boundary point. The section that took the least number of steps to cause the model to predict the target …


Detecting Hacker Threats: Performance Of Word And Sentence Embedding Models In Identifying Hacker Communications, Susan Mckeever, Brian Keegan, Andrei Quieroz Dec 2020

Conference papers

Cyber security is striving to find new forms of protection against hacker attacks. An emerging approach nowadays is the investigation of security-related messages exchanged on deep/dark web and even surface web channels. This approach can be supported by the use of supervised machine learning models and text mining techniques. In our work, we compare a variety of machine learning algorithms, text representations and dimension reduction approaches for the detection accuracies of software-vulnerability-related communications. Given the imbalanced nature of the three public datasets used, we investigate appropriate sampling approaches to boost detection accuracies of our models. In addition, we examine how …


Language-Driven Region Pointer Advancement For Controllable Image Captioning, Annika Lindh, Robert J. Ross, John D. Kelleher Dec 2020

Conference papers

Controllable Image Captioning is a recent sub-field in the multi-modal task of Image Captioning wherein constraints are placed on which regions in an image should be described in the generated natural language caption. This puts a stronger focus on producing more detailed descriptions, and opens the door for more end-user control over results. A vital component of the Controllable Image Captioning architecture is the mechanism that decides the timing of attending to each region through the advancement of a region pointer. In this paper, we propose a novel method for predicting the timing of region pointer advancement by treating the …


Hierarchical Aggregation Of Multidimensional Data For Efficient Data Mining, Safaa Khalil Alwajidi Dec 2020

Dissertations

Big data analysis is essential for many smart applications in areas such as connected healthcare, intelligent transportation, human activity recognition, environment, and climate change monitoring. Traditional data mining algorithms do not scale well to big data due to the enormous number of data points and the velocity of their generation. Mining and learning from big data need time- and memory-efficient techniques, albeit at the cost of possible loss in accuracy. This research focuses on the mining of big data using aggregated data as input. We developed a data structure that is to be used to aggregate data at multiple resolutions. …


Unifying Chemistry And Machine Learning For The Study Of Noncovalent Interactions, Jacob A. Townsend Dec 2020

Doctoral Dissertations

Gas separations are in great demand for carbon emission reduction, natural gas purification, oxygen isolation, and much more. Many of these separations rely on cost-prohibitive methods such as cryogenic distillation or strong-binding solvents. As a result, novel materials are being developed to subvert the energetic expense of gas separation processes. These studies focus on improving the performance of alternative materials, including (but not limited to) metal-organic frameworks, covalent organic frameworks, dense polymeric membranes, porous polymers, and ionic liquids.

In this work, the atomistic effects of functional units are explored for gas separations processes using electronic structure theory and machine learning. …


In The Margins: Reconsidering The Range And Contribution Of Diazotrophs In Nearshore Environments, Corday R. Selden Dec 2020

OES Theses and Dissertations

Dinitrogen (N2) fixation enables primary production and, consequently, carbon dioxide drawdown in nitrogen (N) limited marine systems, exerting a powerful influence over the coupled carbon and N cycles. Our understanding of the environmental factors regulating its distribution and magnitude is largely based on the range and sensitivity of one genus, Trichodesmium. However, recent work suggests that the niche preferences of distinct diazotrophic (N2 fixing) clades differ due to their metabolic and ecological diversity, hampering efforts to close the N budget and model N2 fixation accurately. Here, I explore the range of N2 fixation …


Enhanced Traffic Incident Analysis With Advanced Machine Learning Algorithms, Zhenyu Wang Dec 2020

Computational Modeling & Simulation Engineering Theses & Dissertations

Traffic incident analysis is a crucial task in traffic management centers (TMCs) that typically manage many highways with limited staff and resources. An effective automatic incident analysis approach that can report abnormal events promptly and accurately will benefit TMCs in optimizing the use of limited incident response and management resources. During the past decades, significant efforts have been made by researchers towards the development of data-driven approaches for incident analysis. Nevertheless, many developed approaches have shown limited success in the field. This is largely attributed to the long detection time (i.e., waiting for overwhelmed upstream detection stations; meanwhile, downstream stations …


Deep Q Learning Applied To Stock Trading, Agnibh Dasgupta Dec 2020

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Developing a strategy for stock trading is a vital task for investors. However, it is challenging to obtain an optimal strategy, given the complex and dynamic nature of the stock market. This thesis aims to explore the applications of Reinforcement Learning with the goal of maximizing returns from market investment, keeping in mind the human aspect of trading by utilizing stock prices represented as candlestick graphs. Furthermore, the algorithm studies public interest patterns in the form of graphs extracted from Google Trends to make predictions. Deep Q learning has been used to train an agent based on fused images of stock …


Acquisition, Processing, And Analysis Of Video, Audio And Meteorological Data In Multi-Sensor Electronic Beehive Monitoring, Sarbajit Mukherjee Dec 2020

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

In recent years, a widespread decline has been seen in the honey bee population, and this is widely attributed to colony collapse disorder. Hence, it is of utmost importance that a system is designed to gather relevant information. This will allow for a deeper understanding of the possible reasons behind the above phenomenon to aid in the design of suitable countermeasures.

Electronic Beehive Monitoring is one such way of gathering critical information regarding a colony’s health and behavior without invasive beehive inspections. In this dissertation, we have presented an electronic beehive monitoring system called BeePi that can be placed on top …


Unsupervised Structural Graph Node Representation Learning, Mikel Joaristi Dec 2020

Boise State University Theses and Dissertations

Unsupervised Graph Representation Learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as social sciences, biology or communication networks. These methods are particularly interesting because they facilitate the direct use of standard Machine Learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode: methods preserving the nodes' connectivity information, and methods preserving the nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, …


Walls Have Ears: Eavesdropping User Behaviors Via Graphics-Interrupt-Based Side Channel, Haoyu Ma, Jianwen Tian, Debin Gao, Jia Chunfu Dec 2020

Research Collection School Of Computing and Information Systems

Graphics Processing Units (GPUs) are now playing a vital role in many devices and systems including computing devices, data centers, and clouds, making them the next target of side-channel attacks. Unlike those targeting CPUs, existing side-channel attacks on GPUs exploited vulnerabilities exposed by application interfaces like OpenGL and CUDA, which can be easily mitigated with software patches. In this paper, we investigate the lower-level and native interface between GPUs and CPUs, i.e., the graphics interrupts, and evaluate the side channel they expose. Being an intrinsic profile in the communication between a GPU and a CPU, the pattern of graphics interrupts …


Nearest Centroid: A Bridge Between Statistics And Machine Learning, Manoj Thulasidas Dec 2020

Research Collection School Of Computing and Information Systems

In order to guide our students of machine learning in their statistical thinking, we need conceptually simple and mathematically defensible algorithms. In this paper, we present the Nearest Centroid (NC) algorithm as a pedagogical tool, combining the key concepts behind two foundational algorithms: K-Means clustering and K Nearest Neighbors (k-NN). In NC, we use the centroid (as defined in the K-Means algorithm) of the observations belonging to each class in our training data set and its distance from a new observation (similar to k-NN) for class prediction. Using this obvious extension, we will illustrate how the concepts of …
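The class-prediction rule described in this abstract can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: it assumes numeric feature vectors and Euclidean distance, and the function names and toy data are made up for the example.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: centroid}, where a centroid is the per-feature mean of that class."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Toy training data: two classes in two dimensions.
X = [[1.0, 1.0], [1.0, 3.0], [5.0, 5.0], [7.0, 5.0]]
y = ["a", "a", "b", "b"]

centroids = fit_centroids(X, y)
print(centroids)                        # {'a': [1.0, 2.0], 'b': [6.0, 5.0]}
print(predict(centroids, [2.0, 2.5]))   # a
```

Training stores one centroid per class, so prediction cost grows with the number of classes rather than the number of training points, unlike k-NN.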


Modified-Half-Normal Distribution And Different Methods To Estimate Average Treatment Effect., Jingchao Sun Dec 2020

Electronic Theses and Dissertations

This dissertation consists of three projects related to Modified-Half-Normal distribution and causal inference. In my first project, a new distribution called Modified-Half-Normal distribution was introduced. I explored a few of its distributional properties, the procedures for generating random samples based on Bayesian approaches, and the parameter estimation based on the method of moments. The second project deals with the problem of selection bias of average treatment effect (ATE) if we use the observational data. I combined the propensity score based inverse probability of treatment weighting (IPTW) method and the directed acyclic graph (DAG) to solve this problem. The third project …
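For readers unfamiliar with IPTW, the following sketch shows the basic weighting estimator of the ATE, assuming the propensity scores are already known. It is a generic textbook form, not the dissertation's method: the DAG-based step mentioned above is not represented, and the data are invented for illustration.

```python
def iptw_ate(treated, outcome, propensity):
    """IPTW estimate of the ATE: mean of T*Y/e minus mean of (1-T)*Y/(1-e)."""
    n = len(outcome)
    treated_term = sum(t * y / e for t, y, e in zip(treated, outcome, propensity))
    control_term = sum((1 - t) * y / (1 - e) for t, y, e in zip(treated, outcome, propensity))
    return (treated_term - control_term) / n


# Invented example: treatment indicator, observed outcome, and propensity score per unit.
treated = [1, 1, 0, 0]
outcome = [3.0, 5.0, 1.0, 2.0]
propensity = [0.5, 0.5, 0.5, 0.5]

ate = iptw_ate(treated, outcome, propensity)
print(ate)  # 2.5
```

With a constant propensity of 0.5 and equal group sizes, the estimate reduces to the plain difference in group means (4.0 minus 1.5); the weights only change the result when treatment probability varies across units.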


New Methods For Deep Learning Based Real-Valued Inter-Residue Distance Prediction, Jacob Barger Nov 2020

Theses

Background: Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction, a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions …


Multimodal Data Fusion And Attack Detection In Recommender Systems, Mehmet Aktukmak Nov 2020

USF Tampa Graduate Theses and Dissertations

The commercial platforms that use recommender systems can collect relevant information to produce useful recommendations to the platform users. However, these sources usually contain missing values, imbalanced and heterogeneous data, and noisy observations. Such characteristics render the process of exploiting the information nontrivial, as one should carefully address them during the data fusion process. In addition to the degenerative characteristics, some entries can be fake, i.e., they can be the outcomes of malicious intents to manipulate the system. These entries should be eliminated before incorporation to any recommendation task. Detecting such malicious attacks quickly and accurately and then mitigating them …


Groundwater Withdrawal Prediction Using Integrated Multitemporal Remote Sensing Data Sets And Machine Learning, S. Majumdar, Ryan G. Smith, J. J. Butler, V. Lakshmi Nov 2020

Geosciences and Geological and Petroleum Engineering Faculty Research & Creative Works

Effective monitoring of groundwater withdrawals is necessary to help mitigate the negative impacts of aquifer depletion. In this study, we develop a holistic approach that combines water balance components with a machine learning model to estimate groundwater withdrawals. We use both multitemporal satellite and modeled data from sensors that measure different components of the water balance and land use at varying spatial and temporal resolutions. These remote sensing products include evapotranspiration, precipitation, and land cover. Due to the inherent complexity of integrating these data sets and subsequently relating them to groundwater withdrawals using physical models, we apply random forests -- …


Machine Learning Integrated Design For Additive Manufacturing, Jingchao Jiang, Yi Xiong, Zhiyuan Zhang, David W. Rosen Nov 2020

Research Collection School Of Computing and Information Systems

Design for additive manufacturing (AM) has been proposed to improve manufacturing efficiency and minimize costs. Existing design-for-AM methods are mainly surrogate-model based. Due to the increasingly available data nowadays, machine learning (ML) has been applied to medical diagnosis, image processing, prediction, classification, learning association, etc. A variety of studies have also been carried out to use machine learning for optimizing the process parameters of AM with corresponding objectives. In this paper, an ML integrated design for AM framework is proposed, which takes advantage of ML that can learn the complex relationships between the design and …


Base-Package Recommendation Framework Based On Consumer Behaviours In IPTV Platform, Kuruparan Shanmugalingam, Ruwinda Ranganayanke, Chanka Gunawardhaha, Rajitha Navarathna Nov 2020

Research Collection School Of Computing and Information Systems

Internet Protocol TeleVision (IPTV) provides many services such as live television streaming, time-shifted media, and Video On Demand (VOD). However, many customers do not engage properly with their subscribed packages due to a lack of knowledge and poor guidance. Many customers fail to identify the proper IPTV service package based on their needs and to utilise their current package to the maximum. In this paper, we propose a base-package recommendation model with a novel customer scoring-meter based on customers' behaviour. Initially, our paper describes an algorithm to measure customers' engagement score, which illustrates a novel approach to track customer engagement …


Using Data Analytics To Predict Students Score, Nang Laik Ma, Gim Hong Chua Nov 2020

Research Collection School Of Computing and Information Systems

Education is very important to Singapore, and the government has continued to invest heavily in our education system to become one of the world-class systems today. A strong foundation of Science, Technology, Engineering, and Mathematics (STEM) was what underpinned Singapore's development over the past 50 years. PISA is a triennial international survey that evaluates education systems worldwide by testing the skills and knowledge of 15-year-old students who are nearing the end of compulsory education. In this paper, the authors used the PISA data from 2012 and 2015 and developed machine learning techniques to predict the students' scores and understand the …


A New Efficient Method To Detect Genetic Interactions For Lung Cancer GWAS, Jennifer Luyapan, Xuemei Ji, Siting Li, Xiangjun Xiao, Dakai Zhu, Eric J. Duell, David C. Christiani, Matthew B. Schabath, Susanne M. Arnold, Shanbeh Zienolddiny, Hans Brunnström, Olle Melander, Mark D. Thornquist, Todd A. Mackenzie, Christopher I. Amos, Jiang Gui Oct 2020

Markey Cancer Center Faculty Publications

BACKGROUND: Genome-wide association studies (GWAS) have proven successful in predicting genetic risk of disease using single-locus models; however, identifying single nucleotide polymorphism (SNP) interactions at the genome-wide scale is limited due to computational and statistical challenges. We addressed the computational burden encountered when detecting SNP interactions for survival analysis, such as age of disease-onset. To confront this problem, we developed a novel algorithm, called the Efficient Survival Multifactor Dimensionality Reduction (ES-MDR) method, which used Martingale Residuals as the outcome parameter to estimate survival outcomes, and implemented the Quantitative Multifactor Dimensionality Reduction method to identify significant interactions associated with age of …


Exploring The Potential Of Sparse Coding For Machine Learning, Sheng Yang Lundquist Oct 2020

Dissertations and Theses

While deep learning has proven to be successful for various tasks in the field of computer vision, there are several limitations of deep-learning models when compared to human performance. Specifically, human vision is largely robust to noise and distortions, whereas deep learning performance tends to be brittle to modifications of test images, including being susceptible to adversarial examples. Additionally, deep-learning methods typically require very large collections of training examples for good performance on a task, whereas humans can learn to perform the same task with a much smaller number of training examples.

In this dissertation, I investigate whether the use …


Applications Of AI In Business, Industry, Government, Healthcare, And Environment, University Of Maine Artificial Intelligence Initiative Oct 2020

General University of Maine Publications

UMaine AI draws top talent and leverages a distinctive set of capabilities from the University of Maine and other collaborating institutions from across Maine and beyond, while it also recruits world-class talent from across the nation and the world. It is centered at the University of Maine, leveraging the university’s strengths across disciplines, including computing and information sciences, engineering, health and life sciences, business, education, social sciences, and more.


Asymptotically-Optimal Topological Nearest-Neighbor Filtering, Read Sandström, Jory Denny, Nancy M. Amato Oct 2020

Department of Math & Statistics Faculty Publications

Nearest-neighbor finding is a major bottleneck for sampling-based motion planning algorithms. The cost of finding nearest neighbors grows with the size of the roadmap, leading to a significant computational bottleneck for problems which require many configurations to find a solution. In this work, we develop a method of mapping configurations of a jointed robot to neighborhoods in the workspace that supports fast search for configurations in nearby neighborhoods. This expedites nearest-neighbor search by locating a small set of the most likely candidates for connecting to the query with a local plan. We show that this filtering technique can preserve asymptotically-optimal …


The Future Of Work Now: AutoML At 84.51° And Kroger, Thomas H. Davenport, Steven M. Miller Oct 2020

Research Collection School Of Computing and Information Systems

One of the most frequently-used phrases at business events these days is “the future of work.” It’s increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they’re already present in many organizations for many different jobs. The job and incumbents described below are an example of this phenomenon.