Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 13 of 13

Full-Text Articles in Statistics and Probability

Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, Jacquelyn Cheun Phd, Luay Dajani, Quentin B. Thomas Dec 2019

Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, Jacquelyn Cheun Phd, Luay Dajani, Quentin B. Thomas

SMU Data Science Review

In the age of hyper-connectivity, 24/7 news cycles, and instant news alerts via social media, mental health researchers don't have a way to automatically detect news content which is associated with triggering anxiety or depression in mental health patients. Using the Associated Press news wire, a semantic network was built with 1,056 news articles containing over 500,000 connections across multiple topics to provide a personalized algorithm which detects problematic news content for a given reader. We make use of Semantic Network Analysis to surface the relationship between news article text and anxiety in readers who struggle with mental health disorders. …


Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan Aug 2019

Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan

SMU Data Science Review

In this paper, we present novel approaches to predicting as- set failure in the electric distribution system. Failures in overhead power lines and their associated equipment in particular, pose significant finan- cial and environmental threats to electric utilities. Electric device failure furthermore poses a burden on customers and can pose serious risk to life and livelihood. Working with asset data acquired from an electric utility in Southern California, and incorporating environmental and geospatial data from around the region, we applied a Random Forest methodology to predict which overhead distribution lines are most vulnerable to fail- ure. Our results provide evidence …


Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels Aug 2019

Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels

SMU Data Science Review

In this paper we find a pattern of aperiodic seismic signals that precede earthquakes at any time in a laboratory earthquake’s cycle using a small window of time. We use a data set that comes from a classic laboratory experiment having several stick-slip displacements (earthquakes), a type of experiment which has been studied as a simulation of seismologic faults for decades. This data exhibits similar behavior to natural earthquakes, so the same approach may work in predicting the timing of them. Here we show that by applying random forest machine learning technique to the acoustic signal emitted by a laboratory …


Longitudinal Analysis With Modes Of Operation For Aes, Dana Geislinger, Cory Thigpen, Daniel W. Engels Aug 2019

Longitudinal Analysis With Modes Of Operation For Aes, Dana Geislinger, Cory Thigpen, Daniel W. Engels

SMU Data Science Review

In this paper, we present an empirical evaluation of the randomness of the ciphertext blocks generated by the Advanced Encryption Standard (AES) cipher in Counter (CTR) mode and in Cipher Block Chaining (CBC) mode. Vulnerabilities have been found in the AES cipher that may lead to a reduction in the randomness of the generated ciphertext blocks that can result in a practical attack on the cipher. We evaluate the randomness of the AES ciphertext using the standard key length and NIST randomness tests. We evaluate the randomness through a longitudinal analysis on 200 billion ciphertext blocks using logistic regression and …


Self-Driving Cars: Evaluation Of Deep Learning Techniques For Object Detection In Different Driving Conditions, Ramesh Simhambhatla, Kevin Okiah, Shravan Kuchkula, Robert Slater May 2019

Self-Driving Cars: Evaluation Of Deep Learning Techniques For Object Detection In Different Driving Conditions, Ramesh Simhambhatla, Kevin Okiah, Shravan Kuchkula, Robert Slater

SMU Data Science Review

Deep Learning has revolutionized Computer Vision, and it is the core technology behind capabilities of a self-driving car. Convolutional Neural Networks (CNNs) are at the heart of this deep learning revolution for improving the task of object detection. A number of successful object detection systems have been proposed in recent years that are based on CNNs. In this paper, an empirical evaluation of three recent meta-architectures: SSD (Single Shot multi-box Detector), R-CNN (Region-based CNN) and R-FCN (Region-based Fully Convolutional Networks) was conducted to measure how fast and accurate they are in identifying objects on the road, such as vehicles, pedestrians, …


Visualization And Machine Learning Techniques For Nasa’S Em-1 Big Data Problem, Antonio P. Garza Iii, Jose Quinonez, Misael Santana, Nibhrat Lohia May 2019

Visualization And Machine Learning Techniques For Nasa’S Em-1 Big Data Problem, Antonio P. Garza Iii, Jose Quinonez, Misael Santana, Nibhrat Lohia

SMU Data Science Review

In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data to determine available launch opportunities (about 90TBs of data). We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory …


Machine Learning Pipeline For Exoplanet Classification, George Clayton Sturrock, Brychan Manry, Sohail Rafiqi May 2019

Machine Learning Pipeline For Exoplanet Classification, George Clayton Sturrock, Brychan Manry, Sohail Rafiqi

SMU Data Science Review

Planet identification has typically been a tasked performed exclusively by teams of astronomers and astrophysicists using methods and tools accessible only to those with years of academic education and training. NASA’s Exoplanet Exploration program has introduced modern satellites capable of capturing a vast array of data regarding celestial objects of interest to assist with researching these objects. The availability of satellite data has opened up the task of planet identification to individuals capable of writing and interpreting machine learning models. In this study, several classification models and datasets are utilized to assign a probability of an observation being an exoplanet. …


Leveraging Natural Language Processing Applications And Microblogging Platform For Increased Transparency In Crisis Areas, Ernesto Carrera-Ruvalcaba, Johnson Ekedum, Austin Hancock, Ben Brock May 2019

Leveraging Natural Language Processing Applications And Microblogging Platform For Increased Transparency In Crisis Areas, Ernesto Carrera-Ruvalcaba, Johnson Ekedum, Austin Hancock, Ben Brock

SMU Data Science Review

Through microblogging applications, such as Twitter, people actively document their lives even in times of natural disasters such as hurricanes and earthquakes. While first responders and crisis-teams are able to help people who call 911, or arrive at a designated shelter, there are vast amounts of information being exchanged online via Twitter that provide real-time, location-based alerts that are going unnoticed. To effectively use this information, the Tweets must be verified for authenticity and categorized to ensure that the proper authorities can be alerted. In this paper, we create a Crisis Message Corpus from geotagged Tweets occurring during 7 hurricanes …


An Evaluation Of Training Size Impact On Validation Accuracy For Optimized Convolutional Neural Networks, Jostein Barry-Straume, Adam Tschannen, Daniel W. Engels, Edward Fine Jan 2019

An Evaluation Of Training Size Impact On Validation Accuracy For Optimized Convolutional Neural Networks, Jostein Barry-Straume, Adam Tschannen, Daniel W. Engels, Edward Fine

SMU Data Science Review

In this paper, we present an evaluation of training size impact on validation accuracy for an optimized Convolutional Neural Network (CNN). CNNs are currently the state-of-the-art architecture for object classification tasks. We used Amazon’s machine learning ecosystem to train and test 648 models to find the optimal hyperparameters with which to apply a CNN towards the Fashion-MNIST (Mixed National Institute of Standards and Technology) dataset. We were able to realize a validation accuracy of 90% by using only 40% of the original data. We found that hidden layers appear to have had zero impact on validation accuracy, whereas the neural …


Comparisons Of Performance Between Quantum And Classical Machine Learning, Christopher Havenstein, Damarcus Thomas, Swami Chandrasekaran Jan 2019

Comparisons Of Performance Between Quantum And Classical Machine Learning, Christopher Havenstein, Damarcus Thomas, Swami Chandrasekaran

SMU Data Science Review

In this paper, we present a performance comparison of machine learning algorithms executed on traditional and quantum computers. Quantum computing has potential of achieving incredible results for certain types of problems, and we explore if it can be applied to machine learning. First, we identified quantum machine learning algorithms with reproducible code and had classical machine learning counterparts. Then, we found relevant data sets with which we tested the comparable quantum and classical machine learning algorithm's performance. We evaluated performance with algorithm execution time and accuracy. We found that quantum variational support vector machines in some cases had higher accuracy …


Improving Vix Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater Jan 2019

Improving Vix Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater

SMU Data Science Review

The problem of forecasting market volatility is a difficult task for most fund managers. Volatility forecasts are used for risk management, alpha (risk) trading, and the reduction of trading friction. Improving the forecasts of future market volatility assists fund managers in adding or reducing risk in their portfolios as well as in increasing hedges to protect their portfolios in anticipation of a market sell-off event. Our analysis compares three existing financial models that forecast future market volatility using the Chicago Board Options Exchange Volatility Index (VIX) to six machine/deep learning supervised regression methods. This analysis determines which models provide best …


Improving Gas Well Economics With Intelligent Plunger Lift Optimization Techniques, Atsu Atakpa, Emmanuel Farrugia, Ryan Tyree, Daniel W. Engels, Charles Sparks Jan 2019

Improving Gas Well Economics With Intelligent Plunger Lift Optimization Techniques, Atsu Atakpa, Emmanuel Farrugia, Ryan Tyree, Daniel W. Engels, Charles Sparks

SMU Data Science Review

In this paper, we present an approach to reducing bottom hole plunger dwell time for artificial lift systems. Lift systems are used in a process to remove contaminants from a natural gas well. A plunger is a mechanical device used to deliquefy natural gas wells by removing contaminants in the form of water, oil, wax, and sand from the wellbore. These contaminants decrease bottom-hole pressure which in turn hampers gas production by forming a physical barrier within the well tubing. As the plunger descends through the well it emits sounds which are recorded at the surface by an echo-meter that …


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane Jan 2019

Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …