Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data Science

2021

Deep learning

Articles 1 - 25 of 25

Full-Text Articles in Physical Sciences and Mathematics

Private And Federated Deep Learning: System, Theory, And Applications For Social Good, Han Hu Dec 2021

Dissertations

Over the past decade, drug abuse has continued to accelerate toward becoming the most severe public health problem in the United States. The ability to detect drug-abuse risk behavior at a population scale, such as among the population of Twitter users, can help to monitor the trend of drug-abuse incidents. However, traditional methods do not effectively detect drug-abuse risk behavior in tweets, mainly due to the sparsity of such tweets and the noisy nature of tweets. In the first part of this dissertation work, the task of classifying tweets as containing drug-abuse risk behavior or not is studied. Millions of public …


A Novel Arabic Corpus For Text Classification Using Deep Learning And Word Embedding, Roua A. Abou Khachfeh, Islam El Kabani, Ziad Osman Dec 2021

BAU Journal - Science and Technology

In recent years, Natural Language Processing (NLP) for the Arabic language has gained increasing importance due to the massive amount of textual information available online in unstructured form, and its capability to make information retrieval easier. One of the most widely used NLP tasks is “Text Classification”. Its goal is to employ machine learning techniques to automatically classify text documents into one or more predefined categories. An important step in machine learning is to find suitable and large data for training and testing an algorithm. Moreover, Deep Learning (DL), the trending machine learning research, requires a lot of …


Aspect-Based Sentiment Analysis Of Movie Reviews, Samuel Onalaja, Eric Romero, Bosang Yun Dec 2021

SMU Data Science Review

This study compares classification models used to determine the sentiment of aspect-separated text and to predict binary sentiments of movie reviews with genre- and aspect-specific driving factors. To gain a broader classification analysis, five machine and deep learning algorithms were compared: Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), and Recurrent Neural Network Long Short-Term Memory (RNN LSTM). The various movie aspects that are utilized to separate the sentences are determined through aggregating aspect words from lexicon-based, supervised and unsupervised learning. The driving factors are randomly assigned to various movie aspects and their impact tied to …


PokéGAN: P2P (Pet To Pokémon) Stylizer, Michael B. Hedge, Morgan Nelson, Thomas Pengilly, Michael Weatherford Dec 2021

SMU Data Science Review

This paper covers the development, testing, and implementation of an automatic framework for converting common images of pets into a Pokémon cartoon with the style of a Pokémon trading card. The technique will first implement object detection for common animals to facilitate image segmentation and apply the appropriate style transfer model to ensure the most aesthetic stylization. It explores various methods to address artifacts in the results of common neural style transfer techniques using Generative Adversarial Networks (GANs). This research sets up a framework to create an app that converts user-submitted pet pictures to Pokémon styled images using the most …


Uncertainty-Aware Deep Learning For Prediction Of Remaining Useful Life Of Mechanical Systems, Samuel J. Cornelius Dec 2021

Theses and Dissertations

Remaining useful life (RUL) prediction is a problem that researchers in the prognostics and health management (PHM) community have been studying for decades. Both physics-based and data-driven methods have been investigated, and in recent years, deep learning has gained significant attention. When sufficiently large and diverse datasets are available, deep neural networks can achieve state-of-the-art performance in RUL prediction for a variety of systems. However, for end users to trust the results of these models, especially as they are integrated into safety-critical systems, RUL prediction uncertainty must be captured. This work explores an approach for estimating both epistemic and heteroscedastic …


The Detection Of Sexual Harassment And Chat Predators Using Artificial Neural Network, Noor Amer Hamzah, Ban N. Dhannoon Dec 2021

Karbala International Journal of Modern Science

The vast increase in the use of social media sites such as Twitter and Facebook has led to frequent sexual harassment on the Internet, which is considered a major societal problem. This paper aims to detect sexual harassment and cyber predators at an early phase. We used deep learning models such as bidirectional long short-term memory. Word representations for mapping text to real-valued vectors are carefully reviewed. For the chat sexual predator detection approach, the best result obtained with the proposed model was an F0.5-score of 0.927, with an accuracy of 97.27%. For the comment sexual harassment detection approach, the result was an F0.5-score of 0.925, with an accuracy of 99.12%.
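
The F0.5-score reported in this abstract is an instance of the general F-beta measure, which with beta = 0.5 weights precision more heavily than recall. The sketch below shows only the standard formula; the function name and the precision and recall values are hypothetical and are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Return the F-beta score, a weighted harmonic mean of precision and recall.

    beta < 1 favors precision; beta > 1 favors recall; beta = 1 gives F1.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical inputs chosen for illustration, not results from the paper.
score = f_beta(precision=0.9, recall=0.6, beta=0.5)
print(f"F0.5 = {score:.3f}")
```

With precision higher than recall, as in this hypothetical input, the F0.5 value comes out above the F1 value for the same pair, which is why the measure suits tasks where false alarms are costly.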


Pranayama Breathing Detection With Deep Learning, Bikash Shrestha Dec 2021

Theses

Yoga, a complementary health approach, is practiced by around 14.3% of adults in the US, according to a 2017 National Health Interview Survey by the Centers for Disease Control and Prevention (CDC). Kapalbhati pranayama, a yoga practice of alternating fast exhales and longer passive inhales, is understood to improve our health. Incorrect and irregular practices, however, can cause injuries and adverse effects. To avoid these undesired effects, it is essential to maintain a pace fit for the practitioner. In the absence of any tools to observe a pace of practice, this work develops a deep learning method that listens to …


Auto-Curation Of Large Evolving Image Datasets, Sara Mousavicheshmehkaboodi Dec 2021

Doctoral Dissertations

Large image collections are becoming common in many fields and offer tantalizing opportunities to transform how research, work, and education are conducted if the information and associated insights could be extracted from them. However, major obstacles to this vision exist. First, image datasets with associated metadata contain errors and need to be cleaned and organized to be easily explored and utilized. Second, such collections typically lack the necessary context or may have missing attributes that need to be recovered. Third, such datasets are domain-specific and require human expert involvement to make the right interpretation of the image content. Fourth, the …


Comparing Machine Learning Techniques With State-Of-The-Art Parametric Prediction Models For Predicting Soybean Traits, Susweta Ray Dec 2021

Department of Statistics: Dissertations, Theses, and Student Work

Soybean is a significant source of protein and oil, and also widely used as animal feed. Thus, developing lines that are superior in terms of yield, protein and oil content is important to feed the ever-growing population. As opposed to the high-cost phenotyping, genotyping is both cost and time efficient for breeders while evaluating new lines in different environments (location-year combinations) can be costly. Several Genomic prediction (GP) methods have been developed to use the marker and environment data effectively to predict the yield or other relevant phenotypic traits of crops. Our study compares a conventional GP method (GBLUP), a …


A Transformer-Based Classification System For Volcanic Seismic Signals, Cristian Bravo Roman, Cindy Mora Stock, Alexander James Hemming Aug 2021

Undergraduate Student Research Internships Conference

Volcanic seismic signals are a key element in volcano monitoring to assess the state of unrest and a possible eruption style and timing. Different sources such as brittle fracture (volcano-tectonic - VT) or fluid movement (long period - LP) generate signals with distinct characteristics in frequency content and shape, but site effects such as attenuation or background noise make their determination difficult for the untrained eye. In cases of unrest or an imminent eruption, the amount of data would require a fast and reliable source of pre-classification to classify and catalogue to aid in the job usually done by a …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …
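
For the classical case described in this abstract, where the demand distribution is known, the textbook solution orders the demand quantile at the critical ratio c_u / (c_u + c_o), with c_u the per-unit cost of ordering too little and c_o the per-unit cost of ordering too much. The sketch below assumes normally distributed demand purely for illustration; the function name, the cost figures, and the normal assumption are hypothetical, and it does not represent the thesis's deep learning method for the unknown-distribution case.

```python
from statistics import NormalDist


def newsvendor_quantity(mean: float, stdev: float,
                        underage_cost: float, overage_cost: float) -> float:
    """Return the classical newsvendor order quantity for normal demand.

    The quantity is the demand quantile at the critical ratio
    underage_cost / (underage_cost + overage_cost).
    """
    if underage_cost <= 0 or overage_cost <= 0:
        raise ValueError("both costs must be positive")
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # inv_cdf gives the demand level that is exceeded with
    # probability 1 - critical_ratio under the assumed distribution.
    return NormalDist(mean, stdev).inv_cdf(critical_ratio)


# Hypothetical numbers: running short costs three times as much as overstocking.
quantity = newsvendor_quantity(mean=100.0, stdev=20.0,
                               underage_cost=3.0, overage_cost=1.0)
print(f"order about {quantity:.1f} units")
```

When running short is the costlier error, the ratio exceeds 0.5 and the order quantity sits above mean demand; when overstocking is costlier, it sits below.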


Enhancing Microbiome Host Disease Prediction With Variational Autoencoders, Celeste Manughian-Peter Aug 2021

Computational and Data Sciences (MS) Theses

Advancements in genetic sequencing methods for microbiomes in recent decades have permitted the collection of taxonomic and functional profiles of microbial communities, accelerating the discovery of the functional aspects of the microbiome and generating an increased interest among clinicians in applying these techniques with patients. This advancement has coincided with software and hardware improvements in the field of machine learning and deep learning. Combined, these advancements imply further potential for progress in disease diagnosis and treatment in humans. The ability to classify a human microbiome profile into a disease category, and additionally identify the differentiating factors within the profile between …


Predicting Severity Of Traumatic Brain Injury: A Residual Learning Model From Magnetic Resonance Images, Dacosta Yeboah Aug 2021

MSU Graduate Theses

One of the most significant frontiers for computational scientists is the engineering of human healthcare delivery based on intelligent analysis of health data. In a variety of neurological disorders such as Traumatic Brain Injury (TBI), neuro-imaging information plays a crucial role in the decision-making regarding patient care and as a potential prognostic marker for outcome. TBI is a heterogeneous neurological disorder. Due to the economic burdens of the disorder, sorting out this heterogeneity could provide more insights and better understanding of TBI recovery trajectories, thus improving overall diagnosis and treatment options. Magnetic Resonance Imaging (MRI) is a non-invasive technique that …


A Deep Learning Approach For Forecasting Global Commodities Prices, Ahmed Saied Elberawi, Mohamed Belal Prof. Jul 2021

Future Computing and Informatics Journal

Forecasting future values of time-series data is a critical task in many disciplines including financial planning and decision-making. Researchers and practitioners in statistics have long applied traditional statistical methods (such as ARMA, ARIMA, ES, and GARCH) with varying accuracy. Deep learning provides more sophisticated, non-linear approximations that supersede traditional statistical methods in most cases. Deep learning methods require minimal feature engineering compared to other methods; they adopt an end-to-end learning methodology. In addition, they can handle a huge amount of data and variables. Financial time series forecasting poses a challenge due to its high volatility and non-stationarity …


Short Term Temperature Forecasting Using LSTMs, And CNN, Darshan Shah May 2021

Theses

Weather forecasting is a vital application in present times. We can use the predictions to minimize weather-related loss. The use of machine learning and deep learning algorithms for forecasting can eliminate or reduce the necessity of the big data and high computation dependent process of parameterization. Long Short-Term Memory (LSTM) is a widely used deep learning architecture for time series forecasting. In this paper, we aim to predict the one-day-ahead average temperature using a 2-layer neural network consisting of one layer of LSTM and one layer of 1D convolution. The input is pre-processed using a smoothing technique and output …


Federated Learning In Gaze Recognition (FLIGR), Arun Gopal Govindaswamy May 2021

College of Computing and Digital Media Dissertations

The efficiency and generalizability of a deep learning model depend on the amount and diversity of training data. Although huge amounts of data are being collected, these data are not stored in centralized servers for further data processing. It is often infeasible to collect and share data in centralized servers due to various medical data regulations. This need for diversely distributed data and infeasible storage solutions calls for Federated Learning (FL). FL is a clever way of utilizing privately stored data in model building without the need for data sharing. The idea is to train several different models locally …


Using Machine Learning Methods To Predict The Movement Trajectories Of The Louisiana Black Bear, Daniel Clark, David Shaw, Armando Vela, Shane Weinstock, John Santerre, Joseph D. Clark May 2021

SMU Data Science Review

In 1992, the Louisiana black bear (Ursus americanus luteolus) was placed on the U.S. Endangered Species List. This was because bear populations in Louisiana were so small and isolated that they could not intersect with other populations to grow. Interchange of individuals between subpopulations of bears in Louisiana is critical to maintain genetic diversity and avoid inbreeding effects. Utilizing GPS (Global Positioning System) data gathered from 31 radio-collared bears from 2010 through 2012, this research will investigate how bears traverse the landscape, which has implications for gene exchange. This paper will leverage machine learning tools to improve upon existing …


The Effects Of Individual Differences, Non‐Stationarity, And The Importance Of Data Partitioning Decisions For Training And Testing Of EEG Cross‐Participant Models, Alexander J. Kamrud [*], Brett J. Borghetti, Christine M. Schubert Kabban May 2021

Faculty Publications

EEG-based deep learning models have trended toward models that are designed to perform classification on any individual (cross-participant models). However, because EEG varies across participants due to non-stationarity and individual differences, certain guidelines must be followed for partitioning data into training, validation, and testing sets, in order for cross-participant models to avoid overestimation of model accuracy. Despite this necessity, the majority of EEG-based cross-participant models have not adopted such guidelines. Furthermore, some data repositories may unwittingly contribute to the problem by providing partitioned test and non-test datasets for reasons such as competition support. In this study, we demonstrate how improper …
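
The abstract is truncated before it states its guidelines, so the following is only a sketch of one widely used precaution, assumed here rather than taken from the paper: keep all recordings from a given participant in exactly one partition, so that a model is never tested on someone it was trained on. The function name, the participant labels, and the split fraction are hypothetical.

```python
import random


def split_by_participant(participant_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no participant appears in both sets.

    participant_ids[i] is the participant who produced sample i.
    Returns (train_indices, test_indices).
    """
    unique_participants = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_participants)
    # Hold out whole participants, at least one, instead of individual samples.
    n_test = max(1, round(len(unique_participants) * test_fraction))
    held_out = set(unique_participants[:n_test])
    train_indices = [i for i, p in enumerate(participant_ids) if p not in held_out]
    test_indices = [i for i, p in enumerate(participant_ids) if p in held_out]
    return train_indices, test_indices


# Hypothetical labels: eight EEG samples recorded from five participants.
ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p5"]
train_idx, test_idx = split_by_participant(ids, test_fraction=0.2, seed=0)
print("train:", train_idx, "test:", test_idx)
```

Splitting at the sample level instead would let recordings from the same person land on both sides of the split, which is one way accuracy estimates for cross-participant models can become inflated.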


A Fully-Automated, Deep Learning-Based Framework For CT-Based Localization, Segmentation, Verification And Planning Of Metastatic Vertebrae, Tucker Netherton, Tucker James Netherton May 2021

Dissertations & Theses (Open Access)

Palliative radiotherapy is an effective treatment for the palliation of symptoms caused by vertebral metastases. Visible evidence of disease is localized on medical images as part of the treatment planning process. However, complicating factors such as time pressures, anatomic variants in the spine, and similarities in adjacent vertebrae are associated with wrong level treatments of the spine. In addition, erroneous manual contouring of anatomic structures is a major failure mode in radiotherapy treatment planning.

The purpose of this study is to mitigate the challenges associated with treatment planning of the spine by automating the treatment planning process for three-dimensional conformal …


Improving Treatment Of Local Liver Ablation Therapy With Deep Learning And Biomechanical Modeling, Brian Anderson, Kristy Brock, Laurence Court, Carlos Eduardo Cardenas, Erik Cressman, Ankit Patel May 2021

Dissertations & Theses (Open Access)

In the United States, colorectal cancer is the third most diagnosed cancer, and 60-70% of patients will develop liver metastasis. While surgical liver resection of metastasis is the standard of care for treatment with curative intent, it is only available to about 20% of patients. For patients who are not surgical candidates, local percutaneous ablation therapy (PTA) has been shown to have a similar 5-year overall survival rate. However, PTA can be a challenging procedure, largely due to spatial uncertainties in the localization of the ablation probe, and in measuring the delivered ablation margin.

For this work, we hypothesized …


Using Deep Learning For Children Brain Image Analysis, Rafael Toche Pizano May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

Analyzing the correlation between brain volumetric/morphometry features and cognition/behavior in children is important in the field of pediatrics as identifying such relationships can help identify children who may be at risk for illnesses. Understanding these relationships can not only help identify children who may be at risk of illnesses, but it can also help evaluate strategies that promote brain development in children. Currently, one way to do this is to use traditional statistical methods such as a correlation analysis, but such an approach does not make it easy to generalize and predict how brain volumetric/morphometry will impact cognition/behavior. One of …


Deep Learning For Multi-Tissue Cancer Classification Of Gene Expressions, Tarek Khorshed Jan 2021

Theses and Dissertations

We contribute to saving the lives of cancer patients through early detection and diagnosis, since one of the major challenges in cancer treatment is that patients are diagnosed at very late stages when appropriate medical interventions become less effective and full curative treatment is no longer achievable. Cancer classification using gene expressions is extremely challenging given the complexity and high dimensionality of the data. Current classification methods typically rely on samples collected from a single tissue type and perform a prerequisite of gene feature selection to avoid processing the full set of genes. These methods fall short in taking advantage …


Multi-Modal Classification Using Images And Text, Stuart J. Miller, Justin Howard, Paul Adams, Mel Schwan, Robert Slater Jan 2021

SMU Data Science Review

This paper proposes a method for the integration of natural language understanding in image classification to improve classification accuracy by making use of associated metadata. Traditionally, only image features have been used in the classification process; however, metadata accompanies images from many sources. This study implemented a multi-modal image classification model that combines convolutional methods with natural language understanding of descriptions, titles, and tags to improve image classification. The novelty of this approach was to learn from additional external features associated with the images using natural language understanding with transfer learning. It was found that the combination of ResNet-50 image …


A Multi-Resolution Graph Convolution Network For Contiguous Epitope Prediction, Lisa Oh Jan 2021

Dartmouth College Master’s Theses

Computational methods for predicting binding interfaces between antigens and antibodies (epitopes and paratopes) are faster and cheaper than traditional experimental structure determination methods. A sufficiently reliable computational predictor that could scale to large sets of available antibody sequence data could thus inform and expedite many biomedical pursuits, such as better understanding immune responses to vaccination and natural infection and developing better drugs and vaccines. However, current state-of-the-art predictors produce discontiguous predictions, e.g., predicting the epitope in many different spots on an antigen, even though in reality they typically comprise a single localized region. We seek to produce contiguous predicted epitopes, …


Reliable And Interpretable Machine Learning For Modeling Physical And Cyber Systems, Daniel L. Marino Lizarazo Jan 2021

Theses and Dissertations

Over the past decade, Machine Learning (ML) research has predominantly focused on building extremely complex models in order to improve predictive performance. The idea was that performance can be improved by adding complexity to the models. This approach proved to be successful in creating models that can approximate highly complex relationships while taking advantage of large datasets. However, this approach led to extremely complex black-box models that lack reliability and are difficult to interpret. By lack of reliability, we specifically refer to the lack of consistent (unpredictable) behavior in situations outside the training data. Lack of interpretability refers to the …