Open Access. Powered by Scholars. Published by Universities.®

Probability Commons

Open Access. Powered by Scholars. Published by Universities.®

Statistical Models

2018

Institution
Keyword
Publication
Publication Type

Articles 1 - 9 of 9

Full-Text Articles in Probability

Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, Amirfarrokh Iranitalab Nov 2018

Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, Amirfarrokh Iranitalab

Department of Civil and Environmental Engineering: Dissertations, Theses, and Student Research

Transportation of hazardous materials (hazmat) in the United States (U.S.) constituted 22.8% of the total tonnage transported in 2012 with an estimated value of more than 2.3 billion dollars. As such, hazmat transportation is a significant economic activity in the U.S. However, hazmat transportation exposes people and environment to the infrequent but potentially severe consequences of incidents resulting in hazmat release. Trucks and trains carried 63.7% of the hazmat in the U.S. in 2012 and are the major foci of this dissertation. The main research objectives were 1) identification and quantification of the effects of different factors on occurrence and …


Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri Oct 2018

Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri

Publications and Research

Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years …


Yelp’S Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels Aug 2018

Yelp’S Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels

SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that reviews …


Generalizing Multistage Partition Procedures For Two-Parameter Exponential Populations, Rui Wang Aug 2018

Generalizing Multistage Partition Procedures For Two-Parameter Exponential Populations, Rui Wang

University of New Orleans Theses and Dissertations

ANOVA analysis is a classic tool for multiple comparisons and has been widely used in numerous disciplines due to its simplicity and convenience. The ANOVA procedure is designed to test if a number of different populations are all different. This is followed by usual multiple comparison tests to rank the populations. However, the probability of selecting the best population via ANOVA procedure does not guarantee the probability to be larger than some desired prespecified level. This lack of desirability of the ANOVA procedure was overcome by researchers in early 1950's by designing experiments with the goal of selecting the best …


Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister Aug 2018

Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister

Electronic Theses and Dissertations

A probability distribution is a statistical function that describes the probability of possible outcomes in an experiment or occurrence. There are many different probability distributions that give the probability of an event happening, given some sample size n. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size n is fixed. For example, it is known that the sum of n independent Bernoulli random variables with success probability p is a Binomial distribution with parameters n and p: However, this is not true when the sample size …


Deep Learning Analysis Of Limit Order Book, Xin Xu May 2018

Deep Learning Analysis Of Limit Order Book, Xin Xu

Arts & Sciences Electronic Theses and Dissertations

In this paper, we build a deep neural network for modeling spatial structure in limit order book and make prediction for future best ask or best bid price based on ideas of (Sirignano 2016). We propose an intuitive data processing method to approximate the data is non-available for us based only on level I data that is more widely available. The model is based on the idea that there is local dependence for best ask or best bid price and sizes of related orders. First we use logistic regression to prove that this approach is reasonable. To show the advantages …


Application Of Remote Sensing And Machine Learning Modeling To Post-Wildfire Debris Flow Risks, Priscilla Addison Jan 2018

Application Of Remote Sensing And Machine Learning Modeling To Post-Wildfire Debris Flow Risks, Priscilla Addison

Dissertations, Master's Theses and Master's Reports

Historically, post-fire debris flows (DFs) have been mostly more deadly than the fires that preceded them. Fires can transform a location that had no history of DFs to one that is primed for it. Studies have found that the higher the severity of the fire, the higher the probability of DF occurrence. Due to high fatalities associated with these events, several statistical models have been developed for use as emergency decision support tools. These previous models used linear modeling approaches that produced subpar results. Our study therefore investigated the application of nonlinear machine learning modeling as an alternative. Existing models …


Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett Jan 2018

Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett

Williams Honors College, Honors Research Projects

The purpose of this Senior Honors Project is to research, study, and demonstrate newfound knowledge of various machine learning statistical techniques that are not covered in the University of Akron’s statistics major curriculum. This report will be an overview of three machine-learning methods that were used to predict NCAA Basketball results, specifically, the March Madness tournament. The variables used for these methods, models, and tests will include numerous variables kept throughout the season for each team, along with a couple variables that are used by the selection committee when tournament teams are being picked. The end goal is to find …


Some New And Generalized Distributions Via Exponentiation, Gamma And Marshall-Olkin Generators With Applications, Hameed Abiodun Jimoh Jan 2018

Some New And Generalized Distributions Via Exponentiation, Gamma And Marshall-Olkin Generators With Applications, Hameed Abiodun Jimoh

Electronic Theses and Dissertations

Three new generalized distributions developed via completing risk, gamma generator, Marshall-Olkin generator and exponentiation techniques are proposed and studied. Structural properties including quantile functions, hazard rate functions, moment, conditional moments, mean deviations, R\'enyi entropy, distribution of order statistics and maximum likelihood estimates are presented. Monte Carlo simulation is employed to examine the performance of the proposed distributions. Applications of the generalized distributions to real lifetime data are presented to illustrate the usefulness of the models.