Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

2019

Machine learning

Articles 1 - 30 of 96

Full-Text Articles in Physical Sciences and Mathematics

Early Detection Of Fake News On Social Media, Yang Liu Dec 2019

Dissertations

The ever-increasing popularity and convenience of social media enable the rapid spread of fake news, which can cause a series of negative impacts on both individuals and society. Early detection of fake news is essential to minimize its social harm. Existing machine learning approaches are incapable of detecting a fake news story soon after it starts to spread, because they require a certain amount of data to reach decent effectiveness, and that data takes time to accumulate. To solve this problem, this research first analyzes and finds that, on social media, the user characteristics of fake news spreaders are distributed significantly differently from those …


Cancer Risk Prediction With Whole Exome Sequencing And Machine Learning, Abdulrhman Fahad M Aljouie Dec 2019

Dissertations

Accurate cancer risk and survival time prediction are important problems in personalized medicine, where disease diagnosis and prognosis are tuned to individuals based on their genetic material. Cancer risk prediction informs decisions about regular screening, which helps to detect disease at an early stage and therefore increases the probability of successful treatment. Cancer risk prediction is a challenging problem. Lifestyle, environment, family history, and genetic predisposition are some factors that influence the disease onset. Cancer risk prediction based on predisposing genetic variants has been studied extensively. Most studies have examined the predictive ability of variants in known …


Detecting Myocardial Infarctions Using Machine Learning Methods, Aniruddh Mathur Dec 2019

Master's Projects

Myocardial Infarction (MI), commonly known as a heart attack, occurs when one of the three major blood vessels carrying blood to the heart gets blocked, causing the death of myocardial (heart) cells. If not treated immediately, MI may cause cardiac arrest, which can ultimately cause death. Risk factors for MI include diabetes, family history, and an unhealthy diet and lifestyle. Medical treatments include various types of drugs and surgeries, which can prove very expensive for patients due to high healthcare costs. Therefore, it is imperative that MI is diagnosed at the right time. Electrocardiography (ECG) is commonly used to detect MI. ECG …


Assessing Wildfire Damage From High Resolution Satellite Imagery Using Classification Algorithms, Ai-Linh Alten Dec 2019

Master's Projects

Wildfire damage assessments are important information for first responders, government agencies, and insurance companies to estimate the cost of damages and to help provide relief to those affected by a wildfire. With the help of Earth Observation satellite technology, determining the burn area extent of a fire can be done with traditional remote sensing methods like Normalized Burn Ratio. Using Very High Resolution satellites can help give even more accurate damage assessments but will come with some tradeoffs; these satellites can provide higher spatial and temporal resolution at the expense of better spectral resolution. As a wildfire burn area …


An Application Of Deep Learning Models To Automate Food Waste Classification, Alejandro Zachary Espinoza Dec 2019

Dissertations and Theses

Food wastage is a problem that affects all demographics and regions of the world. Each year, approximately one-third of food produced for human consumption is thrown away. In an effort to track and reduce food waste in the commercial sector, some companies utilize third party devices which collect data to analyze individual contributions to the global problem. These devices track the type of food wasted (such as vegetables, fruit, boneless chicken, pasta) along with the weight. Some devices also allow the user to leave the food in a kitchen container while it is weighed, so the container weight must also …


Characterizing Dryland Ecosystems Using Remote Sensing And Dynamic Global Vegetation Modeling, Abdolhamid Dashtiahangar Dec 2019

Boise State University Theses and Dissertations

Drylands include all terrestrial regions where the production of crops, forage, wood and other ecosystem services is limited by water. These ecosystems cover approximately 40% of the Earth's terrestrial surface and accommodate more than 2 billion people (Millennium Ecosystem Assessment, 2005). Moreover, the interannual variability of the global carbon budget is strongly regulated by vegetation dynamics in drylands. Understanding the dynamics of such ecosystems is significant for assessing the potential for and impacts of natural or anthropogenic disturbances and mitigation planning, and a necessary step toward enhancing the economic and social well-being of dryland communities in a sustainable manner (Global …


Materials Prediction Using High-Throughput And Machine Learning Techniques, Chandramouli Nyshadham Dec 2019

Theses and Dissertations

Predicting new materials by virtually screening a large number of hypothetical materials using supercomputers has enabled materials discovery at an accelerated pace. However, the vast number of possible hypothetical materials necessitates the development of faster computational methods for speedier screening of materials, reducing the time of discovery. In this thesis, I aim to understand and apply two computational methods for materials prediction. The first method deals with a computational high-throughput study of superalloys. Superalloys are materials which exhibit high-temperature strength. A combinatorial high-throughput search across 2224 ternary alloy systems revealed 102 potential superalloys of which 37 are brand new, all …


Deep Representation Learning For Clustering And Domain Adaptation, Mohsen Kheirandishfard Dec 2019

Computer Science and Engineering Dissertations

Representation learning is a fundamental task in the area of machine learning which can significantly influence the performance of the algorithms used in various applications. The main goal of this task is to capture the relationships between the input data and learn feature representations that contain the most useful information of the original data. Such representations can be further leveraged in many machine learning applications such as clustering, natural language analysis, recommender systems, etc. In this dissertation, we first present a theoretical framework for solving a broad class of non-convex optimization problems. The proposed method is applicable to various tasks …


Use Of Word Embedding To Generate Similar Words And Misspellings For Training Purpose In Chatbot Development, Sanjay Thapa Dec 2019

Computer Science and Engineering Theses

The advancement in the fields of Natural Language Processing and Machine Learning has played a significant role in the huge improvement of conversational Artificial Intelligence (AI). The use of text-based conversational AI such as chatbots has increased significantly for everyday communication with real people across a variety of tasks. Chatbots are deployed in almost all popular messaging platforms and channels. The rise of chatbot development frameworks based on machine learning is helping to deploy chatbots easily and promptly. These chatbot development frameworks use machine learning and natural language understanding (NLU) to understand users' messages and intents and …


Feature Extraction In Noise-Diverse Environments For Human Activities Recognition Using Wi-Fi, Sheheryar Arshad Dec 2019

Computer Science and Engineering Dissertations

With the rapid development of the 802.11 standard and Internet of Things (IoT) applications, Wi-Fi (IEEE 802.11) has emerged as the most widely used wireless communication technology. Wi-Fi based sensing has found widespread use cases involving activity recognition, indoor localization, design of smart spaces and healthcare applications. This dissertation presents the study of human activities’ sensing and recognition using channel state information (CSI) of Wi-Fi. We highlight the limitations of existing methods and consequently design the frameworks for collecting stable CSI and monitoring different indoor and outdoor environments for human activities. Specifically, this dissertation provides means to define and extract …


Social Media Text Analysis Using Multi-Kernel Convolutional Neural Network, Anna Philips Dec 2019

Computer Science and Engineering Theses

Transportation planners and ride hailing platforms such as Uber and Lyft use their riders' feedback to assess their services and monitor customer satisfaction. Social media websites such as Facebook, Instagram, LinkedIn and in particular Twitter provide a large dataset of micro-texts by users who regularly post to their social media accounts about their grievances with their ride experience. This data is often unorganized and intractable to process because of its extremely large size, which is continuously increasing. In this project, we collected ride hailing service relevant text data from Twitter around New York and developed a novel Convolutional Neural …


Think2act: Using Multimodal Data To Assess Human Cognitive And Physical Performance, Maher Abujelala Dec 2019

Computer Science and Engineering Dissertations

As computers become more advanced, affordable, and smaller in size, we start to use them in almost every aspect of our daily life. Nowadays, the use of computers is not limited to accomplishing work-related tasks. Instead, we use computers for education, entertainment, healthcare, and in many other areas to facilitate our daily life activities. From here, the Human-Computer Interaction (HCI) field emerged. HCI is a multidisciplinary field of study that focuses on utilizing computers and technology to interact with humans, improve their quality of life, and enhance their performance. The rapid advancements in other related research fields, such as …


ApproxML: Efficient Approximate Ad-Hoc ML Models Through Materialization And Reuse, Faezeh Ghaderi Dec 2019

Computer Science and Engineering Theses

Machine Learning (ML) has become an essential tool in answering complex predictive analytic queries. Model building for large scale datasets is one of the most time-consuming parts of the data science pipeline. Often data scientists are willing to sacrifice some accuracy in order to speed up this process during the exploratory phase. In this report, we aim to demonstrate ApproxML, a system that efficiently constructs approximate ML models for new queries from previously constructed ML models using the concepts of model materialization and reuse. ApproxML supports a wide variety of ML models such as generalized linear models for supervised learning …


Learning Robot Manipulation Tasks Via Observation, Michail Theofanidis Dec 2019

Computer Science and Engineering Dissertations

The coexistence of humans and robots has been the aspiration of many scientific endeavors in the past century. Most anthropomorphic or industrial robots are highly articulated and complex machines, which are designed to carry out tasks that often involve the manipulation of physical objects. Traditionally, robots learn how to perform such tasks with the aid of a human programmer or operator. In this regard, the human acts as a teacher who provides a demonstration of a task. From the data of the demonstration, the robot must learn a state-action mapping that accomplishes the task. This state-action mapping is often addressed …


Countering Cybersecurity Vulnerabilities In The Power System, Fengli Zhang Dec 2019

Graduate Theses and Dissertations

Security vulnerabilities in software pose an important threat to power grid security and can be exploited by attackers if not properly addressed. Every month, many vulnerabilities are discovered, and all of them must be remediated in a timely manner to reduce the chance of being exploited by attackers. In current practice, security operators have to manually analyze each vulnerability present in their assets and determine the remediation actions in a short time period, which involves a tremendous amount of human resources for electric utilities. To solve this problem, we propose a machine learning-based automation framework to automate vulnerability analysis and …


Comparison Of RL Algorithms For Learning To Learn Problems, Adolfo Gonzalez III Dec 2019

Theses and Dissertations

Machine learning has been applied successfully to many different problems thanks to the expressiveness of neural networks and the simplicity of first-order optimization algorithms, the latter being vital for training large neural networks efficiently. Many of these algorithms were designed around behavior observed in experiments and guided by intuition. An interesting question is whether, rather than observing and then designing algorithms with beneficial behaviors, these algorithms can be learned through reinforcement learning by modeling optimization as a game. This paper explores several reinforcement learning algorithms that are applied to learn policies suited for optimization.


Toward Self-Reconfigurable Parametric Systems: Reinforcement Learning Approach, Ting-Yu Mu Dec 2019

Dissertations

For the ongoing advancement of the fields of Information Technology (IT) and Computer Science, machine learning-based approaches are utilized in different ways to solve problems that belong to the Nondeterministic Polynomial time (NP)-hard complexity class, or to approximate them when there is no known efficient way to find a solution. Problems that determine the proper set of reconfigurable parameters of parametric systems to obtain near-optimal performance are typically classified as NP-hard problems with no efficient mathematical models to obtain the best solutions. This body of work aims to advance the knowledge of machine learning …


Habitat Associations And Reproduction Of Fishes On The Northwestern Gulf Of Mexico Shelf Edge, Elizabeth Marie Keller Nov 2019

LSU Doctoral Dissertations

Several of the northwestern Gulf of Mexico (GOM) shelf-edge banks provide critical hard bottom habitat for coral and fish communities, supporting a wide diversity of ecologically and economically important species. These sites may be fish aggregation and spawning sites and provide important habitat for fish growth and reproduction. Already designated as habitat areas of particular concern, many of these banks are also under consideration for inclusion in the expansion of the Flower Garden Banks National Marine Sanctuary. This project aimed to gain a more comprehensive understanding of the communities and fish species on shelf-edge banks by way of gonad histology, …


Detecting Digitally Forged Faces In Online Videos, Neilesh Sambhu Oct 2019

USF Tampa Graduate Theses and Dissertations

We use Rossler’s FaceForensics dataset of 1004 online videos and their corresponding forged counterparts [1] to investigate the ability to distinguish digitally forged facial images from original images automatically with deep learning. The proposed convolutional neural network is much smaller than the current state-of-the-art solutions. Nevertheless, the network maintains a high level of accuracy (99.6%), all while using the entire FaceForensics dataset and not including any temporal information. We implement majority voting and show the impact on accuracy (99.67%), where only 1 video of 300 is misclassified. We examine why the model misclassified this one video. In terms of tuning …


Neural Models For Information Retrieval Without Labeled Data, Hamed Zamani Oct 2019

Doctoral Dissertations

Recent developments of machine learning models, and in particular deep neural networks, have yielded significant improvements on several computer vision, natural language processing, and speech recognition tasks. Progress with information retrieval (IR) tasks has been slower, however, due to the lack of large-scale training data as well as neural network models specifically designed for effective information retrieval. In this dissertation, we address these two issues by introducing task-specific neural network architectures for a set of IR tasks and proposing novel unsupervised or weakly supervised solutions for training the models. The proposed learning solutions do not require labeled training data. Instead, …


Extracting And Representing Entities, Types, And Relations, Patrick Verga Oct 2019

Doctoral Dissertations

Complex decisions in areas like science, government policy, finance, and clinical treatments all require integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another. Knowledge graphs (KGs) provide a natural approach for addressing this type of problem: they can serve as long-term stores of abstracted knowledge organized around concepts and their relationships, and can be populated from heterogeneous sources including databases and text. KGs can facilitate higher level reasoning, influence the interpretation of new data, and …


Adaptive Feature Engineering Modeling For Ultrasound Image Classification For Decision Support, Hatwib Mugasa Oct 2019

Doctoral Dissertations

Ultrasonography is considered a relatively safe option for the diagnosis of benign and malignant cancer lesions due to the low-energy sound waves used. However, the visual interpretation of ultrasound images is time-consuming and usually produces many false alerts due to speckle noise. Improved methods of collecting image-based data have been proposed to reduce noise in the images; however, this has proved not to solve the problem due to the complex nature of images and the exponential growth of biomedical datasets. Secondly, the target class in real-world biomedical datasets, that is the focus of interest of a biopsy, is usually …


Machine Learning Based Ultra High Carbon Steel Image Segmentation, Sumith Kuttiyil Suresh Oct 2019

Theses and Dissertations

Mechanical and structural properties of ultra-high carbon steel are determined by its microstructure, composed of constituents such as pearlite and spheroidite. Locating micro constituents and quantitatively measuring their presence is key for material researchers to study the physical properties of carbon steel materials. This micrograph analysis is currently done manually and subjectively by material scientists, which is tedious and time-consuming. Here we propose to apply the image segmentation algorithm called U-Net to achieve automated labeling of steel microstructures on a subset of an ultra-high carbon steel image dataset containing pearlite and spheroidite as the primary micro constituents. Our work …


Demonstration Of Visible And Near Infrared Raman Spectrometers And Improved Matched Filter Model For Analysis Of Combined Raman Signals, Alexander Matthew Atkinson Oct 2019

Electrical & Computer Engineering Theses & Dissertations

Raman spectroscopy is a powerful analysis technique that has found applications in fields such as analytical chemistry, planetary sciences, and medical diagnostics. Recent studies have shown that analysis of Raman spectral profiles can be greatly assisted by use of computational models with achievements including high accuracy pure sample classification with imbalanced data sets and detection of ideal sample deviations for pharmaceutical quality control. The adoption of automated methods is a necessary step in streamlining the analysis process as Raman hardware becomes more advanced. Due to limits in the architectures of current machine learning based Raman classification models, transfer from pure …


Machine Learning-Based Models For Assessing Impacts Before, During And After Hurricane Events, Julie L. Harvey Sep 2019

Electronic Theses and Dissertations

Social media provides an abundant amount of real-time information that can be used before, during, and after extreme weather events. Government officials, emergency managers, and other decision makers can use social media data for decision-making, preparation, and assistance. Machine learning-based models can be used to analyze data collected from social media. Social media data, together with cloud cover temperature as physical sensor data, were analyzed in this study using machine learning techniques. Data was collected from Twitter regarding Hurricane Florence from September 11, 2018 through September 20, 2018 and Hurricane Michael from October 1, 2018 through October 18, 2018. Natural language …


Semi-Supervised Regression With Generative Adversarial Networks Using Minimal Labeled Data, Greg Olmschenk Sep 2019

Dissertations, Theses, and Capstone Projects

This work studies the generalization of semi-supervised generative adversarial networks (GANs) to regression tasks. A novel feature layer contrasting optimization function, in conjunction with a feature matching optimization, allows the adversarial network to learn from unannotated data and thereby reduce the number of labels required to train a predictive network. An analysis of simulated training conditions is performed to explore the capabilities and limitations of the method. In concert with the semi-supervised regression GANs, an improved label topology and upsampling technique for multi-target regression tasks are shown to reduce data requirements. Improvements are demonstrated on a wide variety of vision …


The Application Of Synthetic Signals For ECG Beat Classification, Elliot Morgan Brown Sep 2019

Theses and Dissertations

A brief overview of electrocardiogram (ECG) properties and the characteristics of various cardiac conditions is given. Two different models are used to generate synthetic ECG signals. Domain knowledge is used to create synthetic examples of 16 different heart beat types with these models. Other techniques for synthesizing ECG signals are explored. Various machine learning models with different combinations of real and synthetic data are used to classify individual heart beats. The performance of the different methods and models is compared, and synthetic data is shown to be useful in beat classification.


Classification With Measurement Error In Covariates Or Response, With Application To Prostate Cancer Imaging Study, Kexin Luo Aug 2019

Electronic Thesis and Dissertation Repository

The research is motivated by the prostate cancer imaging study conducted at the University of Western Ontario to classify cancer status using multiple in-vivo images. The prostate cancer histological image and the in-vivo images are subject to misalignment in the co-registration procedure, which can be viewed as measurement error in covariates or response. We investigate methods to correct this problem.

The first proposed method corrects the predicted class probability when the data has misclassified labels. The correction equation is derived from the relationship between the true response and the error-prone response. The probability for the observed class label is adjusted …


Sensory Relevance Models, Walt Woods Aug 2019

Dissertations and Theses

This dissertation concerns methods for improving the reliability and quality of explanations for decisions based on Neural Networks (NNs). NNs are increasingly part of state-of-the-art solutions for a broad range of fields, including biomedical, logistics, user-recommendation engines, defense, and self-driving vehicles. While NNs form the backbone of these solutions, they are often viewed as "black box" solutions, meaning the only output offered is a final decision, with no insight into how or why that particular decision was made. For high-stakes fields, such as biomedical, where lives are at risk, it is often more important to be able to explain a …


The Importance Of Landscape Position Information And Elevation Uncertainty For Barrier Island Habitat Mapping And Modeling, Nicholas Matthew Enwright Aug 2019

LSU Doctoral Dissertations

Barrier islands provide important ecosystem services, including storm protection and erosion control to the mainland, habitat for fish and wildlife, and tourism. As a result, natural resource managers are concerned with monitoring changes to these islands and modeling future states of these environments. Landscape position, such as elevation and distance from shore, influences habitat coverage on barrier islands by regulating exposure to abiotic factors, including waves, tides, and salt spray. Geographers commonly use aerial topographic lidar data for extracting landscape position information. However, researchers rarely consider lidar elevation uncertainty when using automated processes for extracting elevation-dependent habitats from lidar data. …