Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

2,937 Full-Text Articles | 3,860 Authors | 1,062,673 Downloads | 128 Institutions

All Articles in Applied Statistics

Using Cyclical Components To Improve The Forecasts Of The Stock Market And Macroeconomic Variables, Kenneth R. Szulczyk, Shibley Sadique 2018 Curtin University Malaysia

Journal of Modern Applied Statistical Methods

Economic variables such as stock market indices, interest rates, and national output measures contain cyclical components. Forecasting methods excluding these cyclical components yield inaccurate out-of-sample forecasts. Accordingly, a three-stage procedure is developed to estimate a vector autoregression (VAR) with cyclical components. A Monte Carlo simulation shows the procedure estimates the parameters accurately. Subsequently, a VAR with cyclical components improves the root-mean-square error of out-of-sample forecasts by 50% for a stock market model with macroeconomic variables.


Understanding Sexual Violence Against Women, Maria Martinez 2018 Illinois State University

Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.


Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri 2018 CUNY City College

Publications and Research

Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years ...


Statistical Design Of Experiment Techniques In Manufacturing, Caroline M. Kerfonta 2018 University of South Carolina

Senior Theses

There are many statistical techniques used to design experiments. These techniques are used in many different fields. This thesis will focus on the use of the three most common techniques used to design statistical experiments in manufacturing.

The three techniques that will be investigated are completely randomized design, randomized block design, and factorial design. These techniques will be compared, contrasted, and explained. Research examples will be presented along with sample R code for each technique. These examples will be accompanied by analysis of the techniques as well as an overview of the uses and history of experiments in manufacturing.
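As a rough illustration of how the three designs named in this abstract differ, the sketch below generates a treatment layout for each. It is not taken from the thesis (which reports using R); the function names, treatment labels, and factor levels are hypothetical and chosen only for the example.

```python
import itertools
import random


def completely_randomized(units, treatments, seed=0):
    """Completely randomized design: treatments are assigned to all
    units at random, with equal replication of each treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of treatments")
    rng = random.Random(seed)
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))


def randomized_block(blocks, treatments, seed=0):
    """Randomized block design: every treatment appears exactly once
    in each block, in an independently shuffled order."""
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)
        design[block] = dict(zip(units, order))
    return design


def full_factorial(factors):
    """Full factorial design: one run for every combination of levels."""
    names = list(factors)
    levels = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*levels)]
```

For instance, `full_factorial({"temp": [150, 200], "speed": [1, 2, 3]})` lists all six temperature and speed combinations, whereas the two randomized designs differ only in whether the shuffle is applied across all units or separately within each block.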


Statistical Modeling Of CO2 Flux Data, Fang He 2018 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Carbon dioxide (CO2) flux is important for agriculture and carbon cycle studies. Only a small proportion of the land is currently covered by proper equipment to directly collect CO2 flux data. The CO2 flux data has an obvious annual cycle with the phase changing from year to year. How to build a model to estimate the annual effect and seasonal dynamics is a challenging task. With the help of the Moderate Resolution Imaging Spectroradiometer (MODIS) which is carried by NASA satellites, corresponding data, such as normalized difference vegetation index (NDVI), is freely available from NASA. Our goals are modeling the ...


Comparison Of Multiple Imputation Methods For Categorical Survey Items With High Missing Rates: Application To The Family Life, Activity, Sun, Health And Eating (FLASHE) Study, Benmei Liu, Erin Hennessy, April Oh, Laura A. Dwyer, Linda Nebeling 2018 National Cancer Institute

Journal of Modern Applied Statistical Methods

Two multiple imputation methods, the Sequential Regression Multivariate Imputation Algorithm and the Cox-Iannacchione Weighted Sequential Hotdeck, were examined and compared for imputing categorical variables with high missing rates from the Family Life, Activity, Sun, Health and Eating (FLASHE) study. This paper describes the imputation approaches and results from the study.


Dealing With Sensitive Quantitative Variables: A Comparison Of Sampling Designs For The Procedure Of Gupta And Thornton, Carlos Narciso Bouza Herrera, Prayas Sharma 2018 University of Havana

Journal of Modern Applied Statistical Methods

The use of randomized response procedures reduces the number of non-responses and increases the accuracy of the responses. A new sampling strategy is developed where the reports are scrambled using the procedure of Gupta and Thornton. The estimator of the mean as well as the errors are developed for the Rao-Hartley-Cochran and Ranked Set Sampling designs. The proposals are compared with the original model based on the use of simple random sampling.


Bayesian And Semi-Bayesian Estimation Of The Parameters Of Generalized Inverse Weibull Distribution, Kamaljit Kaur, Kalpana K. Mahajan, Sangeeta Arora 2018 Panjab University, Chandigarh, India

Journal of Modern Applied Statistical Methods

Bayesian and semi-Bayesian estimators of parameters of the generalized inverse Weibull distribution are obtained using Jeffreys’ prior and informative prior under specific assumptions of loss function. Using simulation, the relative efficiency of the proposed estimators is obtained under different set-ups. A real life example is also given.


The Periglacial Landscape Of Mars: Insight Into The 'Decameter-Scale Rimmed Depressions' In Utopia Planitia, Arya Bina 2018 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Currently, Mars appears to be in a ‘frozen’ and ‘dry’ state, with the clear majority of the planet’s surface maintaining year-round sub-zero temperatures. However, the discovery of features consistent with landforms found in periglacial environments on Earth suggests a climate history for Mars that may have involved freeze and thaw cycles. Such landforms include hummocky, polygonised, scalloped, and pitted terrains, as well as ice-rich deposits and gullies, along the mid- to high-latitude bands, typically no lower than 20° N/S. The detection of near-surface and surface ice via the Phoenix lander, excavation of ice via recent impact cratering ...


Random Forest Vs Logistic Regression: Binary Classification For Heterogeneous Datasets, Kaitlin Kirasich, Trace Smith, Bivin Sadler 2018 Southern Methodist University

SMU Data Science Review

Selecting a learning algorithm to implement for a particular application on the basis of performance still remains an ad-hoc process using fundamental benchmarks such as evaluating a classifier’s overall loss function and misclassification metrics. In this paper we address the difficulty of model selection by evaluating the overall classification performance between random forest and logistic regression for datasets comprised of various underlying structures: (1) increasing the variance in the explanatory and noise variables, (2) increasing the number of noise variables, (3) increasing the number of explanatory variables, (4) increasing the number of observations. We developed a model evaluation tool ...


Predicting National Basketball Association Success: A Machine Learning Approach, Adarsh Kannan, Brian Kolovich, Brandon Lawrence, Sohail Rafiqi 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a machine learning based approach to projecting the success of National Basketball Association (NBA) draft prospects. With the proliferation of data, analytics have increasingly become a critical component in the assessment of professional and collegiate basketball players. We leverage player biometric data, college statistics, draft selection order, and positional breakdown as modelling features in our prediction algorithms. We found that a player's draft pick and their college statistics are the best predictors of their longevity in the National Basketball Association.


Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...


Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over $160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings on both daily and long-term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...


Of Typicality And Predictive Distributions In Discriminant Function Analysis, Lyle W. Konigsberg, Susan R. Frankenberg 2018 Department of Anthropology, University of Illinois at Urbana–Champaign

Human Biology Open Access Pre-Prints

While discriminant function analysis is an inherently Bayesian method, researchers attempting to estimate ancestry in human skeletal samples often follow discriminant function analysis with the calculation of frequentist-based typicalities for assigning group membership. Such an approach is problematic in that it fails to account for admixture and for variation in why individuals may be classified as outliers, or non-members of particular groups. This paper presents an argument and methodology for employing a fully Bayesian approach in discriminant function analysis applied to cases of ancestry estimation. The approach requires adding the calculation, or estimation, of predictive distributions as the final step ...


Error Estimates For Projection-Based Dynamic Augmented Lagrangian Boundary Condition Enforcement, With Application To Fluid–Structure Interaction, Yue Yu, David Kamensky, Ming-Chen Hsu, Xin Yang Lu, Yuri Bazilevs, Thomas J.R. Hughes 2018 Lehigh University

Mechanical Engineering Publications

In this work, we analyze the convergence of the recent numerical method for enforcing fluid–structure interaction (FSI) kinematic constraints in the immersogeometric framework for cardiovascular FSI. In the immersogeometric framework, the structure is modeled as a thin shell, and its influence on the fluid subproblem is imposed as a forcing term. This force has the interpretation of a Lagrange multiplier field supplemented by penalty forces, in an augmented Lagrangian formulation of the FSI kinematic constraints. Because of the non-matching fluid and structure discretizations used, no discrete inf-sup condition can be assumed. To avoid solving (potentially unstable) discrete saddle point ...


Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, Aisha Asiri 2018 Atlanta University Center

Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University

This thesis presents various models to the fashion industry to predict the profits for some products. To determine the expected performance of each product in 2016, we used tools of game theory to help us identify the expected value. We went further and performed a simple linear regression and used scatter plots to help us predict further the performance of the products of Prada. We used tools of game theory, analytics, and statistics to help us predict the performance of some of Prada's products. We also used the Tableau platform to visualize an overview of the products' performances. All ...


Creating A Better Technological Piano Practice Aid With Knowledge Tracing, Max Feldkamp 2018 University of Colorado, Boulder

Keyboard Graduate Theses & Dissertations

Modern music tutoring software and mobile instructional applications have great potential to help students practice at home effectively. They can offer extensive feedback on what the student is getting right and wrong and have adopted a gamified design with levels, badges, and other game-like elements to help gain wider appeal among students. Despite their advantages for motivating students and creating a safe practice environment, no current music instruction software demonstrates any knowledge about a student’s level of mastery. This can lead to awkward pedagogy and user frustration. Applying Bayesian Knowledge Tracing to tutoring systems provides an ideal way to ...


Bayesian Analytical Approaches For Metabolomics: A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data, Patrick J. Trainor 2018 University of Louisville

Electronic Theses and Dissertations

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics ...


Clustering Mixed Data: An Extension Of The Gower Coefficient With Weighted L2 Distance, Augustine Oppong 2018 East Tennessee State University

Electronic Theses and Dissertations

Sorting data into partitions is becoming increasingly complex as the constituents of data grow every day. Mixed data comprises continuous, categorical, directional, functional, and other types of variables. Clustering mixed data is based on special dissimilarities of the variables. Some data types may influence the clustering solution. Assigning appropriate weight to the functional data may improve the performance of the clustering algorithm. In this paper we use the extension of the Gower coefficient with judiciously chosen weight for the L2 to cluster mixed data. The benefits of weighting are demonstrated both in applications to the Buoy data ...
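For readers unfamiliar with the Gower coefficient mentioned in this abstract, the following is a minimal sketch of the classical weighted form for numeric and categorical variables only. It does not implement the thesis's extension (the weighted L2 distance for functional variables is omitted), and the function name, the `kinds` labels, and the sample values are assumptions made for illustration.

```python
def gower_dissimilarity(a, b, kinds, ranges, weights=None):
    """Weighted Gower dissimilarity between two mixed-type records.

    kinds[i]  is "num" or "cat" (hypothetical labels for this sketch).
    ranges[i] is the observed range of a numeric variable; it is
              ignored for categorical variables.
    """
    if weights is None:
        weights = [1.0] * len(kinds)
    total = 0.0
    for x, y, kind, value_range, w in zip(a, b, kinds, ranges, weights):
        if kind == "num":
            # Range-normalized absolute difference, in [0, 1].
            d = abs(x - y) / value_range if value_range > 0 else 0.0
        elif kind == "cat":
            # Simple matching: 0 if equal, 1 otherwise.
            d = 0.0 if x == y else 1.0
        else:
            raise ValueError(f"unknown variable kind: {kind}")
        total += w * d
    # Weighted average of the per-variable dissimilarities.
    return total / sum(weights)
```

Each variable contributes a value in [0, 1], so the weights directly control how strongly a given variable type influences the resulting clustering, which is the effect the abstract describes.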


Generalized Non-Inferential Approach To Modeling Restricted Discrete Choice For The Case Of The Spatial Random Utility, Elena Labzina 2018 Washington University in St Louis

Arts & Sciences Electronic Theses and Dissertations

The multinomial logistic regression model (MNL) is a powerful and easily tractable way of measuring the probabilistic impact of input variables on individual categorical choices. Crucially, the standard MNL assumes that all subjects of the study have the same choice sets. However, especially in political science and economics, this condition is frequently violated. Probably the most graphic example of varying choice sets (VCS) is partially contested elections. Furthermore, the MNL implicitly imposes the Independence of Irrelevant Alternatives (IIA) assumption by requiring i.i.d. errors, which contrasts the MNL with the multinomial probit (MNP) and mixed logit (MXL ...

