# Statistical Models Commons™

## All Articles in Statistical Models

988 full-text articles. Page 1 of 31.

Rfviz: An Interactive Visualization Package For Random Forests In R, 2018 Utah State University

#### Rfviz: An Interactive Visualization Package For Random Forests In R, Christopher Beckett

##### All Graduate Plan B and other Reports

Random forests are very popular tools for predictive analysis and data science. They work for both classification (where there is a categorical response variable) and regression (where the response is continuous). Random forests provide proximities, and both local and global measures of variable importance. However, these quantities require special tools to be effectively used to interpret the forest. Rfviz is a sophisticated interactive visualization package and toolkit in R, specially designed for interpreting the results of a random forest in a user-friendly way. Rfviz uses a recently developed R package (loon) from the Comprehensive R Archive Network (CRAN) to create ...

41 - Data Exploration And Analysis For The Hemingway Measure Of Adult Connectedness, 2018 University of North Georgia

#### 41 - Data Exploration And Analysis For The Hemingway Measure Of Adult Connectedness, Gildardo Bautista-Maya, Ping Ye, Diane Cook

##### Georgia Undergraduate Research Conference (GURC)

Abstract:

We analyze the dataset collected from students participating in the Boy With A Ball (BWAB) program, a faith-based community outreach group, through the Hemingway Measure of Adult Connectedness©, a questionnaire measuring the social connectedness of adolescents. First, we approach the data in the conventional method provided by the Hemingway website. We then identify which questions are strong determiners in deciding whether a student has completed the BWAB program or not. With the goal of utilizing the logistic regression, we reduce the set of questions to those only identified as significant in other methods. These methods include linear regression, decision ...

Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, 2018 University of Nebraska-Lincoln

#### Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, Amirfarrokh Iranitalab

##### Civil Engineering Theses, Dissertations, and Student Research

Transportation of hazardous materials (hazmat) in the United States (U.S.) constituted 22.8% of the total tonnage transported in 2012 with an estimated value of more than 2.3 billion dollars. As such, hazmat transportation is a significant economic activity in the U.S. However, hazmat transportation exposes people and the environment to the infrequent but potentially severe consequences of incidents resulting in hazmat release. Trucks and trains carried 63.7% of the hazmat in the U.S. in 2012 and are the major foci of this dissertation. The main research objectives were 1) identification and quantification of the effects ...

Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, 2018 University of Windsor

#### Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, Yujie Song

##### Major Papers

In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model which is capable of detecting changes in factor loadings, determining the number of factors and detecting the break date. We consider both the case where the break date is known and the case where it is unknown, and identify the type of instability. For the unknown break date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date and the existence of instability of factor loadings when the number of factors is constant. We ...

Estimation In High-Dimensional Factor Models With Structural Instabilities, 2018 University of Windsor

#### Estimation In High-Dimensional Factor Models With Structural Instabilities, Wen Gao

##### Major Papers

In this major paper, we use high-dimensional models to analyze macroeconomic data which is influenced by the break point. In particular, we aim to detect the break point and study the changes in the number of factors and the factor loadings under structural instability.

Concretely, we propose two factor models which explain the processes of the pre- and post-break periods. Then, we consider the break point as known or unknown. In both situations, we derive the shrinkage estimators by minimizing the penalized least squares function and calculate the estimators of the numbers of pre- and post-break factors ...

Snakebite Dynamics Of Colombia: Effects Of Precipitation Seasonality Of Incidence, 2018 Illinois State University

#### Snakebite Dynamics Of Colombia: Effects Of Precipitation Seasonality Of Incidence, Carlos Cruz

##### Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.

Use Of Structural Equation Models To Predict Dengue Illness Phenotype, 2018 Brown University

#### Use Of Structural Equation Models To Predict Dengue Illness Phenotype, Sangshin Park, Anon Srikiatkhachorn, Siripen Kalayanarooj, Louis Macareo, Sharone Green, Jennifer F. Friedman, Alan L. Rothman

##### Open Access Articles

BACKGROUND: Early recognition of dengue, particularly patients at risk for plasma leakage, is important to clinical management. The objective of this study was to build predictive models for dengue, dengue hemorrhagic fever (DHF), and dengue shock syndrome (DSS) using structural equation modelling (SEM), a statistical method that evaluates mechanistic pathways.

METHODS/FINDINGS: We performed SEM using data from 257 Thai children enrolled within 72 h of febrile illness onset, 156 with dengue and 101 with non-dengue febrile illnesses. Models for dengue, DHF, and DSS were developed based on data obtained three and one day(s) prior to fever resolution (fever ...

Identifying Treatment Effects In The Presence Of Confounded Types, 2018 Iowa State University

#### Identifying Treatment Effects In The Presence Of Confounded Types, Desire Kedagni

##### Economics Working Papers

In this paper, I consider identification of treatment effects when the treatment is endogenous. The use of instrumental variables is a popular solution to deal with endogeneity, but this may give misleading answers when the instrument is invalid. I show that when the instrument is invalid due to correlation with the first-stage unobserved heterogeneity, a second (also possibly invalid) instrument allows one to partially identify not only the local average treatment effect but also the entire potential outcomes distributions for compliers. I exploit the fact that the distribution of the observed outcome in each group defined by the treatment and ...

Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, 2018 Southern Methodist University

#### Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, Alfeo Sabay, Laurie Harris, Vivek Bejugama, Karen Jaceldo-Siegl

##### SMU Data Science Review

In this paper, we present a heart disease prediction use case showing how synthetic data can be used to address privacy concerns and overcome constraints inherent in small medical research data sets. While advanced machine learning algorithms, such as neural network models, can be implemented to improve prediction accuracy, these require very large data sets which are often not available in medical or clinical research. We examine the use of surrogate data sets composed of synthetic observations for modeling heart disease prediction. We generate surrogate data, based on the characteristics of original observations, and compare prediction accuracy results achieved from ...

Minimizing The Perceived Financial Burden Due To Cancer, 2018 Southern Methodist University

#### Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John

##### SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age ...

Yelp’s Review Filtering Algorithm, 2018 Southern Methodist University

#### Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels

##### SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...

Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, 2018 Southern Methodist University

#### Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra

In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over \$160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings on both daily and long-term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...

Coastal Wetland Dynamics Under Sea-Level Rise And Wetland Restoration In The Northern Gulf Of Mexico Using Bayesian Multilevel Models And A Web Tool, 2018 The University of Southern Mississippi

#### Coastal Wetland Dynamics Under Sea-Level Rise And Wetland Restoration In The Northern Gulf Of Mexico Using Bayesian Multilevel Models And A Web Tool, Tyler Hardy

##### Master's Theses

There is currently a lack of modeling frameworks to predict how relative sea-level rise (SLR), combined with restoration activities, affects landscapes of coastal wetlands across the entire northern Gulf of Mexico (NGOM) with uncertainties accounted for. I developed such a modeling framework, Bayesian multi-level models, to study the spatial pattern of wetland loss in the NGOM, driven by relative SLR, vegetation productivity, tidal range, coastal slope, and wave height, all interacting with river-borne sediment availability, indicated by hydrological regimes. These interactions have not been comprehensively investigated before. I further modified this model to assess the efficacy of restoration projects from ...

A Bayesian State-Space Model Using Age-At-Harvest Data For Estimating The Population Of Black Bears (Ursus Americanus) In Wisconsin, 2018 University of Illinois at Urbana-Champaign

#### A Bayesian State-Space Model Using Age-At-Harvest Data For Estimating The Population Of Black Bears (Ursus Americanus) In Wisconsin, Maximilian L. Allen, Andrew S. Norton, Glenn Stauffer, Nathan M. Roberts, Yanshi Luo, Qing Li, David Macfarland, Timothy R. Van Deelen

##### Industrial and Manufacturing Systems Engineering Publications

Population estimation is essential for the conservation and management of fish and wildlife, but accurate estimates are often difficult or expensive to obtain for cryptic species across large geographical scales. Accurate statistical models with manageable financial costs and field efforts are needed for hunted populations, and using age-at-harvest data may be the most practical foundation for these models. Several rigorous statistical approaches that use age-at-harvest and other data to accurately estimate populations have recently been developed, but these are often dependent on (a) accurate prior knowledge about demographic parameters of the population, (b) auxiliary data, and (c) initial population size ...

Testing Hypotheses Of Covariance Structure In Multivariate Data, 2018 NOVA University of Lisbon

#### Testing Hypotheses Of Covariance Structure In Multivariate Data, Miguel Fonseca, Arkadiusz Koziol, Roman Zmyslony

##### Electronic Journal of Linear Algebra

This paper gives a new approach for testing hypotheses on the structure of covariance matrices in double multivariate data. It is proved that the ratio of the positive and negative parts of best unbiased estimators (BUE) provides an F-test for independence of block variables in double multivariate models.

Application Of Jordan Algebra For Testing Hypotheses About Structure Of Mean Vector In Model With Block Compound Symmetric Covariance Structure, 2018 Faculty of Mathematics, Computer Science and Econometrics, University of Zielona Góra

#### Application Of Jordan Algebra For Testing Hypotheses About Structure Of Mean Vector In Model With Block Compound Symmetric Covariance Structure, Roman Zmyślony, Ivan Zezula, Arkadiusz Kozioł

##### Electronic Journal of Linear Algebra

In this article, the authors derive a test for the structure of the mean vector in a model with block compound symmetric covariance structure for two-level multivariate observations. One possible structure is the so-called structured mean vector, whose components remain constant over sites or over time points, so that the mean vector is of the form $\boldsymbol{1}_{u}\otimes\boldsymbol{\mu}$ with $\boldsymbol{\mu}=(\mu_1,\mu_2,\ldots,\mu_m)'\in\mathbb{R}^m$. This hypothesis is tested against the alternative of an unstructured mean vector, which can change over sites or over time points.

Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, 2018 Atlanta University Center

#### Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, Aisha Asiri

##### Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University

This thesis presents various models for the fashion industry to predict the profits for some products. To determine the expected performance of each product in 2016, we used tools of game theory to help us identify the expected value. We went further and performed a simple linear regression and used scatter plots to help us further predict the performance of the products of Prada. We used tools of game theory, analytics, and statistics to help us predict the performance of some of Prada's products. We also used the Tableau platform to visualize an overview of the products' performances. All ...

Deep Machine Learning For Mechanical Performance And Failure Prediction, 2018 Penn State University

#### Deep Machine Learning For Mechanical Performance And Failure Prediction, Elijah Reber, Nickolas D. Winovich, Guang Lin

##### The Summer Undergraduate Research Fellowship (SURF) Symposium

Deep learning has provided opportunities for advancement in many fields. One such opportunity is being able to accurately predict real world events. Ensuring proper motor function and being able to predict energy output is a valuable asset for owners of wind turbines. In this paper, we look at how effective a deep neural network is at predicting the failure or energy output of a wind turbine. A data set was obtained that contained sensor data from 17 wind turbines over 13 months, measuring numerous variables, such as spindle speed and blade position and whether or not the wind turbine experienced ...

EFVS Effects On Pilot Performance, 2018 Purdue University

#### EFVS Effects On Pilot Performance, Michael Campbell, Nsikak Udo-Imeh, Steven J. Landry

##### The Summer Undergraduate Research Fellowship (SURF) Symposium

Flight tests have been conducted at Purdue University using a computer-based flying simulator in an attempt to determine and measure the effects of Enhanced Flight Vision Systems (EFVS) on the performance of pilots during landing. Knowledge of these effects could help guide future design and implementation of EFVS in modern commercial aircraft, and further increase pilots’ ability to control the aircraft in low-visibility conditions. The problem that has faced researchers in the past has revolved around the difficulty in interpreting the data which is generated by these tests. The difficulty in making a generalized conclusion based on the large amount ...

Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, 2018 Stephen F Austin State University

#### Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, Nishantha Janith Chandrasena Poddiwala Hewage

##### Electronic Theses and Dissertations

This thesis is based on a Poisson model that uses both error-free data and error-prone data subject to misclassification in the form of false-negative and false-positive counts. We present maximum likelihood estimators (MLEs), Fisher's Information, and Wald statistics for the Poisson rate parameter and the two misclassification parameters. Next, we invert the Wald statistics to get asymptotic confidence intervals for the Poisson rate parameter and the false-negative rate parameter. The coverage and width properties for various sample sizes and parameter configurations are studied via a simulation study. Finally, we apply the MLEs and confidence intervals to one real data set and another ...