Articles 1 - 14 of 14
Full-Text Articles in Statistical Models
Identifying Customer Churn In After-Market Operations Using Machine Learning Algorithms, Vitaly Briker, Richard Farrow, William Trevino, Brent Allen
SMU Data Science Review
This paper presents a comparative study on machine learning methods as they are applied to product associations, future purchase predictions, and predictions of customer churn in aftermarket operations. Association rules are used to help identify patterns across products and find correlations in customer purchase behaviour. Studying customer behaviour as it pertains to Recency, Frequency, and Monetary Value (RFM) helps inform customer segmentation and identifies customers with a propensity to churn. Lastly, Flowserve’s customer purchase history enables the establishment of churn thresholds for each customer group and assists in constructing a model to predict future churners. The aim of this model is …
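The RFM idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the `(customer_id, purchase_date, amount)` record layout, the function names `rfm_table` and `likely_churners`, and the single recency threshold (the abstract mentions per-group thresholds, which are not reproduced here).

```python
from collections import defaultdict
from datetime import date


def rfm_table(purchases, as_of):
    """Compute Recency, Frequency, and Monetary value per customer.

    purchases: iterable of (customer_id, purchase_date, amount) tuples
               (a hypothetical layout, not taken from the paper).
    as_of:     the date from which recency is measured.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in purchases:
        # Recency depends only on the most recent purchase date.
        if customer not in last_seen or day > last_seen[customer]:
            last_seen[customer] = day
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        customer: {
            "recency_days": (as_of - last_seen[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
        }
        for customer in last_seen
    }


def likely_churners(table, max_recency_days):
    """Flag customers whose last purchase is older than the threshold."""
    return sorted(
        customer
        for customer, row in table.items()
        if row["recency_days"] > max_recency_days
    )
```

In practice a threshold would be chosen per customer segment rather than globally; the single `max_recency_days` argument only keeps the sketch short.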
Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, Jacquelyn Cheun PhD, Luay Dajani, Quentin B. Thomas
SMU Data Science Review
In the age of hyper-connectivity, 24/7 news cycles, and instant news alerts via social media, mental health researchers don't have a way to automatically detect news content which is associated with triggering anxiety or depression in mental health patients. Using the Associated Press news wire, a semantic network was built with 1,056 news articles containing over 500,000 connections across multiple topics to provide a personalized algorithm which detects problematic news content for a given reader. We make use of Semantic Network Analysis to surface the relationship between news article text and anxiety in readers who struggle with mental health disorders. …
Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan
SMU Data Science Review
In this paper, we present novel approaches to predicting asset failure in the electric distribution system. Failures in overhead power lines and their associated equipment, in particular, pose significant financial and environmental threats to electric utilities. Electric device failure furthermore poses a burden on customers and can pose serious risk to life and livelihood. Working with asset data acquired from an electric utility in Southern California, and incorporating environmental and geospatial data from around the region, we applied a Random Forest methodology to predict which overhead distribution lines are most vulnerable to failure. Our results provide evidence …
Identifying Undervalued Players In Fantasy Football, Christopher D. Morgan, Caroll Rodriguez, Korey Macvittie, Robert Slater, Daniel W. Engels
SMU Data Science Review
In this paper we present a model to predict player performance in fantasy football. In particular, identifying high-performance players can prove to be a difficult problem, as there are on occasion players capable of high performance whose past metrics give no indication of this capacity. These "sleepers" are often undervalued, and the acquisition of such players can have notable impact on a fantasy football team's overall performance. We constructed a regression model that accounts for players' past performance and athletic metrics to predict their future performance. The model we built performs favorably in predicting athlete performance in relation to other …
Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels
SMU Data Science Review
In this paper we find a pattern of aperiodic seismic signals that precede earthquakes at any time in a laboratory earthquake’s cycle using a small window of time. We use a data set that comes from a classic laboratory experiment having several stick-slip displacements (earthquakes), a type of experiment which has been studied as a simulation of seismologic faults for decades. This data exhibits similar behavior to natural earthquakes, so the same approach may work in predicting their timing. Here we show that by applying a random forest machine learning technique to the acoustic signal emitted by a laboratory …
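As a rough illustration of the "small window of time" idea, the sketch below splits a signal into fixed-size windows and summarizes each one with a few statistics that could then be fed to a model such as a random forest. The abstract does not say which features the authors used, so the feature set (mean, variance, peak amplitude), the non-overlapping windowing, and the name `window_features` are all assumptions for illustration only.

```python
import statistics


def window_features(signal, window_size):
    """Summarize a signal with per-window statistics.

    signal:      sequence of numeric samples (e.g. acoustic amplitudes).
    window_size: number of samples per non-overlapping window; trailing
                 samples that do not fill a whole window are dropped.
    """
    features = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        window = signal[start:start + window_size]
        features.append({
            "mean": statistics.fmean(window),
            "variance": statistics.pvariance(window),
            "max_abs": max(abs(sample) for sample in window),
        })
    return features
```

Each returned dictionary corresponds to one training row; a target such as time remaining until the next slip would have to come from the experiment's own labels.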
Self-Driving Cars: Evaluation Of Deep Learning Techniques For Object Detection In Different Driving Conditions, Ramesh Simhambhatla, Kevin Okiah, Shravan Kuchkula, Robert Slater
SMU Data Science Review
Deep Learning has revolutionized Computer Vision, and it is the core technology behind capabilities of a self-driving car. Convolutional Neural Networks (CNNs) are at the heart of this deep learning revolution for improving the task of object detection. A number of successful object detection systems have been proposed in recent years that are based on CNNs. In this paper, an empirical evaluation of three recent meta-architectures: SSD (Single Shot multi-box Detector), R-CNN (Region-based CNN) and R-FCN (Region-based Fully Convolutional Networks) was conducted to measure how fast and accurate they are in identifying objects on the road, such as vehicles, pedestrians, …
Leveraging Reviews To Improve User Experience, Anthony Schams, Iram Bakhtiar, Cristina Stanley
SMU Data Science Review
In this paper, we will explore and present a method of finding characteristics of a restaurant using its reviews through machine learning algorithms. We begin by building models to predict the ratings of individual reviews using text and categorical features. This is to examine the efficacy of the algorithms to the task. Both XGBoost and logistic regression will be examined. With these models, our goal is then to identify key phrases in reviews that are correlated with positive and negative experience. Our analysis makes use of review data publicly made available by Yelp. Key bigrams extracted were non-specific to the …
Repairing Landsat Satellite Imagery Using Deep Machine Learning Techniques, Griffin J. Lane, Patricia Goresen, Robert Slater
SMU Data Science Review
Satellite Imagery is one of the most widely used sources to analyze geographic features and environments in the world. The data gathered from satellites are used to quantify many vital problems facing our society, such as the impact of natural disasters, shore erosion, rising water levels, and urban growth rates. In this paper, we construct machine learning and deep learning algorithms for repairing anomalies in the Landsat satellite imagery data which arise for various reasons ranging from cloud obstruction to satellite malfunctions. The accuracy of GIS data is crucial to ensuring the models produced from such data are as close …
Visualization And Machine Learning Techniques For NASA’s EM-1 Big Data Problem, Antonio P. Garza III, Jose Quinonez, Misael Santana, Nibhrat Lohia
SMU Data Science Review
In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data to determine available launch opportunities (about 90 TB of data). We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory …
Leveraging Natural Language Processing Applications And Microblogging Platform For Increased Transparency In Crisis Areas, Ernesto Carrera-Ruvalcaba, Johnson Ekedum, Austin Hancock, Ben Brock
SMU Data Science Review
Through microblogging applications, such as Twitter, people actively document their lives even in times of natural disasters such as hurricanes and earthquakes. While first responders and crisis-teams are able to help people who call 911, or arrive at a designated shelter, there are vast amounts of information being exchanged online via Twitter that provide real-time, location-based alerts that are going unnoticed. To effectively use this information, the Tweets must be verified for authenticity and categorized to ensure that the proper authorities can be alerted. In this paper, we create a Crisis Message Corpus from geotagged Tweets occurring during 7 hurricanes …
An Evaluation Of Training Size Impact On Validation Accuracy For Optimized Convolutional Neural Networks, Jostein Barry-Straume, Adam Tschannen, Daniel W. Engels, Edward Fine
SMU Data Science Review
In this paper, we present an evaluation of training size impact on validation accuracy for an optimized Convolutional Neural Network (CNN). CNNs are currently the state-of-the-art architecture for object classification tasks. We used Amazon’s machine learning ecosystem to train and test 648 models to find the optimal hyperparameters with which to apply a CNN towards the Fashion-MNIST (Modified National Institute of Standards and Technology) dataset. We were able to realize a validation accuracy of 90% by using only 40% of the original data. We found that hidden layers appear to have had zero impact on validation accuracy, whereas the neural …
Political Profiling Using Feature Engineering And NLP, Chiranjeevi Mallavarapu, Ramya Mandava, Sabitri Kc, Ginger M. Holt
SMU Data Science Review
Public surveys are predominantly used when forecasting election outcomes. While the approach has had significant successes, the surveys have had their failures as well, especially when it comes to accuracy and reliability. As a result, it becomes challenging for political parties to spend their campaign budgets in a manner that facilitates the growth of a favorable and verifiable public opinion. Consequently, it is critical that a more accurate methodology to predict election outcome is developed. In this paper, we present an evaluation of the impact of utilizing dynamic public data on predicting the outcome of elections. Our model yielded a …
Pedestrian Safety -- Fundamental To A Walkable City, Joshua Herrera, Patrick Mcdevitt, Preeti Swaminathan, Raghuram Srinivas
SMU Data Science Review
In this paper, we present a method to identify urban areas with a higher likelihood of pedestrian safety related events. Pedestrian safety related events are pedestrian-vehicle interactions that result in fatalities, injuries, accidents without injury, or near-misses between pedestrians and vehicles. To develop a solution to this problem of identifying likely event locations, we assemble data, primarily from the City of Cincinnati and Hamilton County, that include safety reports from a five-year period, geographic information for these events, a citizen survey of pedestrian-reported concerns, non-emergency requests for service for any cause in the city, property values and public transportation …
Improving VIX Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater
SMU Data Science Review
The problem of forecasting market volatility is a difficult task for most fund managers. Volatility forecasts are used for risk management, alpha (risk) trading, and the reduction of trading friction. Improving the forecasts of future market volatility assists fund managers in adding or reducing risk in their portfolios as well as in increasing hedges to protect their portfolios in anticipation of a market sell-off event. Our analysis compares three existing financial models that forecast future market volatility using the Chicago Board Options Exchange Volatility Index (VIX) to six machine/deep learning supervised regression methods. This analysis determines which models provide best …