Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

1,398 Full-Text Articles 2,461 Authors 556,935 Downloads 133 Institutions

All Articles in Statistical Models


Sample Size Requirements And Considerations For Models To Assess Human-Machine System Performance, Jennifer S. G. Lopez 2019 Air Force Institute of Technology

Theses and Dissertations

Hierarchical Linear Models (HLMs), also known as multi-level models, are an extension of multiple regression analysis and can aid in the understanding of human and machine workloads of a system. These models allow for prediction and testing in systems with hierarchies of two or more levels. The complex interrelated variability of these multi-level models exists in operational settings, such as the Air Force Distributed Common Ground System Full Motion Video (AF DCGS FMV) community which is composed of individuals (Level-1), groups (Level-2), units (Level-3), and organizations (Level-4). Through the development of sample size requirements and considerations for multi-level models, this ...


Joint Estimation Of Growth And Survival From Mark‐Recapture Data To Improve Estimates Of Senescence In Wild Populations, Beth A. Reinke, Luke Hoekstra, Anne M. Bronikowski, Fredric J. Janzen, David Miller 2019 Pennsylvania State University

Ecology, Evolution and Organismal Biology Publications

Understanding age‐dependent patterns of survival is fundamental to predicting population dynamics, understanding selective pressures, and estimating rates of senescence. However, quantifying age‐specific survival in wild populations poses significant logistical and statistical challenges. Recent work has helped to alleviate these constraints by demonstrating that age‐specific survival can be estimated using mark‐recapture data even when age is unknown for all or some individuals. However, previous approaches do not incorporate auxiliary information that can improve age estimates of individuals. We introduce a survival estimator that combines a von Bertalanffy growth model, age‐specific hazard functions, and a Cormack‐Jolly ...


Fully Bayesian Analysis Of Allele-Specific RNA-Seq Data, Ignacio Alvarez-Castro, Jarad Niemi 2019 Universidad de la Republica

Statistics Publications

Diploid organisms have two copies of each gene, called alleles, that can be separately transcribed. The RNA abundance associated with any particular allele is known as allele-specific expression (ASE). When two alleles have polymorphisms in transcribed regions, ASE can be studied using RNA-seq read count data. ASE has characteristics different from regular RNA-seq expression: ASE cannot be assessed for every gene, measures of ASE can be biased towards one of the alleles (the reference allele), and ASE provides two measures of expression for a single gene for each biological sample, which leads to additional complications for single-gene models. We present ...


Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan 2019 Southern Methodist University

SMU Data Science Review

In this paper, we present novel approaches to predicting asset failure in the electric distribution system. Failures in overhead power lines and their associated equipment, in particular, pose significant financial and environmental threats to electric utilities. Electric device failure furthermore poses a burden on customers and can pose serious risk to life and livelihood. Working with asset data acquired from an electric utility in Southern California, and incorporating environmental and geospatial data from around the region, we applied a Random Forest methodology to predict which overhead distribution lines are most vulnerable to failure. Our results provide evidence ...


Identifying Undervalued Players In Fantasy Football, Christopher D. Morgan, Caroll Rodriguez, Korey MacVittie, Robert Slater, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we present a model to predict player performance in fantasy football. In particular, identifying high-performance players can prove to be a difficult problem, as there are on occasion players capable of high performance whose past metrics give no indication of this capacity. These "sleepers" are often undervalued, and the acquisition of such players can have a notable impact on a fantasy football team's overall performance. We constructed a regression model that accounts for players' past performance and athletic metrics to predict their future performance. The model we built performs favorably in predicting athlete performance in relation to ...


Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we find a pattern of aperiodic seismic signals that precede earthquakes at any time in a laboratory earthquake’s cycle using a small window of time. We use a data set that comes from a classic laboratory experiment having several stick-slip displacements (earthquakes), a type of experiment which has been studied as a simulation of seismologic faults for decades. This data exhibits similar behavior to natural earthquakes, so the same approach may work in predicting their timing. Here we show that by applying a random forest machine learning technique to the acoustic signal emitted by a ...


Texture-Based Deep Neural Network For Histopathology Cancer Whole Slide Image (WSI) Classification, Nelson Zange Tsaku 2019 Kennesaw State University

Master of Science in Computer Science Theses

Automatic histopathological Whole Slide Image (WSI) analysis for cancer classification has been highlighted along with the advancements in microscopic imaging techniques. However, manual examination and diagnosis with WSIs is time-consuming and tiresome. Recently, deep convolutional neural networks have succeeded in histopathological image analysis. In this paper, we propose a novel cancer texture-based deep neural network (CAT-Net) that learns scalable texture features from histopathological WSIs. The innovation of CAT-Net is twofold: (1) capturing invariant spatial patterns by dilated convolutional layers and (2) reducing model complexity while improving performance. Moreover, CAT-Net can provide discriminative texture patterns formed on cancerous regions of histopathological ...


Identifying Risk Factors Related To Premature Birth Through Binary Logistic And Proportional Odds Ordinal Logistic Regression, Clayton Elwood 2019 Duquesne University

Electronic Theses and Dissertations

Premature birth has been identified as the single greatest cause of death worldwide in children under the age of five. This thesis will implement binary logistic regression and proportional odds ordinal logistic regression to predict different levels of premature birth and identify associated risk factors. The models will be built from the Centers for Disease Control and Prevention's 2014 Vital Statistics Natality Birth Data containing nearly 4 million live births within the United States. Odds ratios and confidence intervals on risk factors were produced utilizing binary logistic regression.
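
As an illustration of the proportional odds idea named in this abstract, the sketch below computes category probabilities from cumulative logits. It assumes one common parameterization, P(Y ≤ j | x) = expit(θ_j − η), where η is the linear predictor; the threshold values and the predictor value are hypothetical and are not taken from the thesis.

```python
import math

def expit(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(thresholds, linear_predictor):
    """Category probabilities under a proportional odds model.

    thresholds: increasing cut points theta_1 < ... < theta_{K-1}
    linear_predictor: eta = x . beta, shared by every cut point
    """
    # Cumulative probabilities P(Y <= j), with P(Y <= K) = 1 appended.
    cumulative = [expit(t - linear_predictor) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs

# Hypothetical cut points for four ordered outcome levels.
probs = category_probabilities([-1.0, 0.5, 2.0], 0.3)
```

Because the same η enters every cumulative logit, a covariate shifts the odds of all cumulative splits by the same factor, which is what "proportional odds" refers to.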


GARCH Modeling Of Value At Risk And Expected Shortfall Using Bayesian Model Averaging, Ismail Kheir 2019 CUNY Hunter College

Theses and Dissertations

This thesis conducts Value at Risk (VaR) and Expected Shortfall (ES) estimation using GARCH modeling and Bayesian Model Averaging (BMA). BMA considers multiple models weighted by some information criterion. Through BMA, this thesis finds that VaR and ES estimates can be improved through enhanced modeling of the data generation process.
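
The weighting step described above can be sketched as follows. This is a minimal illustration, assuming the widely used approximation in which each model's weight is proportional to exp(−ΔIC/2); the abstract does not say which information criterion the thesis uses, and the scores and VaR figures below are hypothetical.

```python
import math

def ic_weights(ic_values):
    """Convert information-criterion scores (lower is better) into weights summing to 1."""
    best = min(ic_values)
    # Subtracting the best score keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_estimate(estimates, weights):
    """Weighted average of per-model estimates."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical scores and VaR estimates for three candidate GARCH-type models.
bic = [1002.0, 1000.0, 1006.0]
var_estimates = [0.031, 0.028, 0.035]

weights = ic_weights(bic)
bma_var = averaged_estimate(var_estimates, weights)
```

The averaged estimate always lies between the smallest and largest single-model estimates, and it is pulled towards the models with the lowest criterion values.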


Pass-Through Of The Policy-Induced E85 Subsidy: Insights From Hotelling's Model, Jinjing Luo, Giancarlo Moschini 2019 Iowa State University

Economics Publications

We build a structural model of imperfect competition for a retail market that supplies both low-ethanol (E10) and high-ethanol (E85) gasoline blends. The model permits us to study some impacts of the E85 subsidy induced by the U.S. Renewable Fuel Standard, specifically how the pass-through of this subsidy to retail prices is affected by market power. The model is rooted in Hotelling's horizontal differentiation framework, which is extended to also represent the imperfect substitutability between E10 and E85 (a vertical product differentiation attribute). The model naturally captures two sources of imperfect competition in the fuel market—refueling stations ...


Statistical And Machine Learning Methods Evaluated For Incorporating Soil And Weather Into Corn Nitrogen Recommendations, Curtis J. Ransom, Newell R. Kitchen, James J. Camberato, Paul R. Carter, Richard B. Ferguson, Fabián G. Fernández, David W. Franzen, Carrie A. M. Laboski, D. Brenton Myers, Emerson D. Nafziger, John E. Sawyer, John F. Shanahan 2019 University of Missouri

John E. Sawyer

Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine the success of incorporating this information. The objectives of this research were to evaluate statistical and machine learning (ML) algorithms for utilizing soil and weather information for improving corn N recommendation tools. Eight algorithms [stepwise, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLSR), decision tree, and random forest] were evaluated using a dataset containing ...


Exploring The Estimability Of Mark-Recapture Models With Individual, Time-Varying Covariates Using The Scaled Logit Link Function, Jiaqi Mu 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Mark-recapture studies are often used to estimate the survival of individuals in a population and identify factors that affect survival in order to understand how the population might be affected by changing conditions. Factors that vary between individuals and over time, like body mass, present a challenge because they can only be observed when an individual is captured. Several models have been proposed to deal with the missing-covariate problem and commonly impose a logit link function which implies that the survival probability varies between 0 and 1. In this thesis I explore the estimability of four possible models when survival ...


Split Credibility: A Two-Dimensional Semi-Linear Credibility Model, Jingbing Qiu 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

In this thesis, we introduce a two-dimensional semi-linear credibility model, which is an extension of the classical credibility or split credibility models used by practicing actuaries. Our model predicts the future expected losses of a policyholder by considering its historical primary and excess losses. The optimal split point is derived based on the mean squared error criterion. We show analytically when and why splitting a policyholder’s historical losses into primary and excess parts works. In addition, we derive formulas for estimating our model parameters nonparametrically. Finally, we show the application of our model through three examples.
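
To make the primary/excess split concrete, the sketch below uses the conventional definition from experience rating, where the primary part of a loss is capped at the split point and the excess part is whatever lies above it. The abstract does not spell out the thesis's exact definition, so this convention and the figures used are assumptions.

```python
def split_loss(loss, split_point):
    """Split one loss into (primary, excess) at a given split point.

    Assumed convention: primary = min(loss, split_point),
    excess = max(loss - split_point, 0).
    """
    primary = min(loss, split_point)
    excess = max(loss - split_point, 0.0)
    return primary, excess

# Hypothetical loss history for one policyholder and a hypothetical split point.
losses = [1200.0, 12000.0, 5000.0]
parts = [split_loss(x, 5000.0) for x in losses]
total_primary = sum(p for p, _ in parts)
total_excess = sum(e for _, e in parts)
```

Under this convention the two parts always add back to the original loss, so a model that weights them separately loses no information relative to using the total.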


Successful Shot Locations And Shot Types Used In NCAA Men’s Division I Basketball, Olivia D. Perrin 2019 Northern Michigan University

All NMU Master's Theses

The primary purpose of the current study was to investigate the effect of court location (distance and angle from basket) and shot types used on shot success in NCAA Men’s DI basketball during the 2017-18 season. A secondary purpose was to further expand the analysis based on two additional factors: player position (guard, forward, or center) and team ranking. All statistical analyses were completed in RStudio, and three binomial logistic regression analyses were performed to evaluate factors that influence shot success: one for all two- and three-point shot attempts, one for only two-point attempts, and one for ...


Development Of A Statistical Shape-Function Model Of The Implanted Knee For Real-Time Prediction Of Joint Mechanics, Kalin Gibbons 2019 Boise State University

Boise State University Theses and Dissertations

Outcomes of total knee arthroplasty (TKA) are dependent on surgical technique, patient variability, and implant design. Non-optimal design or alignment choices may result in undesirable contact mechanics and joint kinematics, including poor joint alignment, instability, and reduced range of motion. Implant design and surgical alignment are modifiable factors with potential to improve patient outcomes, and there is a need for robust implant designs that can accommodate patient variability. Our objective was to develop a statistical shape-function model (SFM) of a posterior stabilized implant knee to instantaneously predict output mechanics in an efficient manner. Finite element methods were combined with Latin ...


Variability In The Northern North Atlantic And Arctic Oceans Across The Last Two Millennia: A Review, P. Moffa‐Sánchez, E. Moreno‐Chamarro, D. J. Reynolds, P. Ortega, L. Cunningham, D. Swingedouw, D. E. Amrhein, J. Halfar, L. Jonkers, J. H. Jungclaus, K. Perner, A. Wanamaker, S. Yeager 2019 Cardiff University

Geological and Atmospheric Sciences Publications

The climate of the last two millennia was characterized by decadal to multicentennial variations, which were recorded in terrestrial records and had important societal impacts. The cause of these climatic events is still under debate, but changes in the North Atlantic circulation have often been proposed to play an important role. In this review we compile available high‐resolution paleoceanographic data sets from the northern North Atlantic and Nordic Seas. The records are grouped into regions related to modern ocean conditions, and their variability is discussed. We additionally discuss our current knowledge from modeling studies, with a specific focus on ...


Spatio-Temporal Prediction Of Arkansas Gubernatorial Election, Michael Harris 2019 University of Arkansas, Fayetteville

Graduate Theses and Dissertations

Our goal is to create spatio-temporal models for predicting future gubernatorial elections. For a concrete example of how well our models work, we use past data to predict the 2018 Arkansas gubernatorial election and use the existing 2018 election data to check our models' predictive accuracy. Gubernatorial election data was collected from the Arkansas Secretary of State website while related covariate data was collected from the website for the Federal Reserve Bank of St. Louis. The data we collect is on the county level. For predictive purposes we fit multiple models to the data using Markov chain Monte Carlo and ...


Best Management Practices And Nutrient Reduction: An Integrated Economic-Hydrological Model Of The Western Lake Erie Basin, Hongxing Liu, Wendong Zhang, Elena Irwin, Jeffrey Kast, Noel Aloysius, Jay Martin, Margaret Kalcic 2019 Lafayette College

Economics Working Papers

We develop the first spatially integrated economic-hydrological model of the western Lake Erie basin that explicitly links economic models of farmers' field-level Best Management Practice (BMP) adoption choices with the Soil and Water Assessment Tool (SWAT) model to evaluate the cost-effectiveness of nutrient management policies. We quantify the tradeoffs between phosphorus reduction and policy costs and find that a hybrid policy that couples a fertilizer tax with cost-share payments for subsurface placement is the most cost-effective. We also find that economic adoption models can overstate the potential for nutrient reduction by ignoring biophysical complexities and thus demonstrate the importance of ...


Robustness Of Semi-Parametric Survival Model: Simulation Studies And Application To Clinical Data, Isaac Nwi-Mozu 2019 East Tennessee State University

Electronic Theses and Dissertations

An efficient way of analyzing survival clinical data such as cancer data is a great concern to health experts. In this study, we investigate and propose an efficient way of handling survival clinical data. Simulation studies were conducted to compare the performance of various survival modeling techniques using the R package "survsim". Model performance was assessed with varying sample sizes as small ($n5000$). For small and mild samples, the semi-parametric model outperforms or approximates the performance of the parametric model. However, for large samples, the parametric model outperforms the semi-parametric model. We compared the effectiveness and reliability ...


Adjusting For Spatial Effects In Genomic Prediction, Xiaojun Mao, Somak Dutta, Raymond K. W. Wong, Dan Nettleton 2019 Fudan University

Statistics Publications

This paper investigates the problem of adjusting for spatial effects in genomic prediction. Despite being seldom considered in genome-wide association studies (GWAS), spatial effects often affect phenotypic measurements of plants. We consider a Gaussian random field (GRF) model with an additive covariance structure that incorporates genotype effects, spatial effects and subpopulation effects. An empirical study shows the existence of spatial effects and heterogeneity across different subpopulation families while simulations illustrate the improvement in selecting genotypically superior plants by adjusting for spatial effects in genomic prediction.

