Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

11,524 Full-Text Articles; 17,091 Authors; 3,366,482 Downloads; 235 Institutions

All Articles in Statistics and Probability

What Have Long-Term Field Studies Taught Us About Population Dynamics?, Beth A. Reinke, David A. W. Miller, Fredric J. Janzen 2019 Pennsylvania State University

Ecology, Evolution and Organismal Biology Publications

Long-term studies have been crucial to the advancement of population biology, especially our understanding of population dynamics. We argue that this progress arises from three key characteristics of long-term research. First, long-term data are necessary to observe the heterogeneity that drives most population processes. Second, long-term studies often inherently lead to novel insights. Finally, long-term field studies can serve as model systems for population biology, allowing for theory and methods to be tested under well-characterized conditions. We illustrate these ideas in three long-term field systems that have made outsized contributions to our understanding of population ecology, evolution, and conservation biology ...


An Optimal Edg Method For Distributed Control Of Convection Diffusion Pdes, X. Zhang, Y. Zhang, John R. Singler 2019 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

We propose an embedded discontinuous Galerkin (EDG) method to approximate the solution of a distributed control problem governed by convection diffusion PDEs, and obtain optimal a priori error estimates for the state, dual state, their fluxes, and the control. Moreover, we prove the optimize-then-discretize (OD) and discretize-then-optimize (DO) approaches coincide. Numerical results confirm our theoretical results.


Statistical And Machine Learning Methods Evaluated For Incorporating Soil And Weather Into Corn Nitrogen Recommendations, Curtis J. Ransom, Newell R. Kitchen, James J. Camberato, Paul R. Carter, Richard B. Ferguson, Fabián G. Fernández, David W. Franzen, Carrie A. M. Laboski, D. Brenton Myers, Emerson D. Nafziger, John E. Sawyer, John F. Shanahan 2019 University of Missouri

Agronomy Publications

Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine the success of incorporating this information. The objectives of this research were to evaluate statistical and machine learning (ML) algorithms for utilizing soil and weather information for improving corn N recommendation tools. Eight algorithms [stepwise, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLSR), decision tree, and random forest] were evaluated using a dataset containing ...


Principal Component Neural Networks For Modeling, Prediction, And Optimization Of Hot Mix Asphalt Dynamics Modulus, Parnian Ghasemi, Mohamad Aslani, Derrick K. Rollins, R. Christopher Williams 2019 Iowa State University

Derrick K Rollins, Sr.

The dynamic modulus of hot mix asphalt (HMA) is a fundamental material property that defines the stress-strain relationship based on viscoelastic principles and is a function of HMA properties, loading rate, and temperature. Because of the large number of efficacious predictors (factors) and their nonlinear interrelationships, developing predictive models for dynamic modulus can be a challenging task. In this research, results obtained from a series of laboratory tests including mixture dynamic modulus, aggregate gradation, dynamic shear rheometer (on asphalt binder), and mixture volumetric are used to create a database. The created database is used to develop a model for estimating ...


Optimizing Ensemble Weights And Hyperparameters Of Machine Learning Models For Regression Problems, Mohsen Shahhosseini, Guiping Hu, Hieu Pham 2019 Iowa State University

Guiping Hu

Aggregating multiple learners through an ensemble of models aims to make better predictions by capturing the underlying distribution more accurately. Different ensembling methods, such as bagging, boosting and stacking/blending, have been studied and adopted extensively in research and practice. While bagging and boosting intend to reduce variance and bias, respectively, blending approaches target both by finding the optimal way to combine base learners to find the best trade-off between bias and variance. In blending, ensembles are created from weighted averages of multiple base learners. In this study, a systematic approach is proposed to find the optimal weights to create ...
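
The blending idea in this abstract, a weighted average of base learners with weights chosen to minimize error, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy data, the function names, and the simple grid search over one weight are hypothetical stand-ins and are not the optimization procedure proposed in the study.

```python
# Minimal sketch of blending two base learners by weighted averaging.
# Toy data and the grid search are illustrative assumptions, not the study's method.

def blend(predictions, weights):
    """Weighted average of base-learner predictions (one list per learner)."""
    return [sum(w * p for w, p in zip(weights, column))
            for column in zip(*predictions)]

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def best_weights_two_learners(y_true, pred_a, pred_b, steps=100):
    """Grid search over w in [0, 1] for the blend w * a + (1 - w) * b."""
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        err = mse(y_true, blend([pred_a, pred_b], [w, 1.0 - w]))
        if err < best_err:
            best_w, best_err = w, err
    return (best_w, 1.0 - best_w), best_err

# Hypothetical targets and two base learners with opposite biases.
y = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.5, 2.5, 3.5, 4.5]  # predicts 0.5 too high
pred_b = [0.5, 1.5, 2.5, 3.5]  # predicts 0.5 too low

weights, err = best_weights_two_learners(y, pred_a, pred_b)
print(weights, err)
```

In this toy case the two biases cancel at equal weights, so the blend beats either learner alone. With more than two learners, a grid search becomes impractical and a constrained optimizer would be the more natural choice.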


Reporting And Analysis Of Repeated Measurements In Preclinical Animals Experiments, Jing Zhao, Chong Wang, Sarah C. Totton, Jonah N. Cullen, Annette M. O'Connor 2019 Iowa State University

Annette O'Connor

A common feature of preclinical animal experiments is repeated measurement of the outcome, e.g., body weight measured in mice pups weekly for 20 weeks. Separate time point analysis or repeated measures analysis approaches can be used to analyze such data. Each approach requires assumptions about the underlying data and violations of these assumptions have implications for estimation of precision, and type I and type II error rates. Given the ethical responsibilities to maximize valid results obtained from animals used in research, our objective was to evaluate approaches to reporting repeated measures design used by investigators and to assess how ...


Identifying Undervalued Players In Fantasy Football, Christopher D. Morgan, Caroll Rodriguez, Korey MacVittie, Robert Slater, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we present a model to predict player performance in fantasy football. In particular, identifying high-performance players can prove to be a difficult problem, as there are on occasion players capable of high performance whose past metrics give no indication of this capacity. These "sleepers" are often undervalued, and the acquisition of such players can have notable impact on a fantasy football team's overall performance. We constructed a regression model that accounts for players' past performance and athletic metrics to predict their future performance. The model we built performs favorably in predicting athlete performance in relation to ...


Declining Liquidity In Iowa Farms: 2014–2017, Alejandro Plastina 2019 Iowa State University

Alejandro Plastina

The goal of the present study is to describe the evolution of financial liquidity in Iowa farms for 2014–2017, using a unique panel of 220 mid-scale commercial farms. Farms with vulnerable liquidity ratings increased from 33.2 percent in December 2014 to 45.0 percent in December 2017. On average, farms lost $244 of working capital per acre over that period, but farms with vulnerable liquidity ratings in December 2017 lost almost 60 percent more than that, or $388. Average farm size, machinery investment per acre, farm net worth per acre, debt-to-asset ratio, and age of operator were not ...


Semiparametric Fractional Imputation Using Gaussian Mixture Models For Handling Multivariate Missing Data, Hejian Sang, Jae Kwang Kim 2019 Google Inc

Jae Kwang Kim

Item nonresponse is frequently encountered in practice. Ignoring missing data can lose efficiency and lead to misleading inference. Fractional imputation is a frequentist approach of imputation for handling missing data. However, the parametric fractional imputation of Kim (2011) may be subject to bias under model misspecification. In this paper, we propose a novel semiparametric fractional imputation method using Gaussian mixture models. The proposed method is computationally efficient and leads to robust estimation. The proposed method is further extended to incorporate the categorical auxiliary information. The asymptotic model consistency and √n-consistency of the semiparametric fractional imputation estimator are also established ...


Integration Of Survey Data And Big Observational Data For Finite Population Inference Using Mass Imputation, Shu Yang, Jae Kwang Kim 2019 North Carolina State University

Jae Kwang Kim

Multiple data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we consider an imputation approach to combining a probability sample with big observational data. Unlike the usual imputation for missing data analysis, we create imputed values for the whole elements in the probability sample. Such mass imputation is attractive in the context of survey data integration (Kim and Rao, 2012). We extend mass imputation as a tool for data integration of survey data and big non-survey data. The mass imputation methods and their statistical properties are presented ...


Bootstrap Inference For The Finite Population Total Under Complex Sampling Designs, Zhonglei Wang, Jae Kwang Kim, Liuhua Peng 2019 Xiamen University

Jae Kwang Kim

Bootstrap is a useful tool for making statistical inference, but it may provide erroneous results under complex survey sampling. Most studies about bootstrap-based inference are developed under simple random sampling and stratified random sampling. In this paper, we propose a unified bootstrap method applicable to some complex sampling designs, including Poisson sampling and probability-proportional-to-size sampling. Two main features of the proposed bootstrap method are that studentization is used to make inference, and the finite population is bootstrapped based on a multinomial distribution by incorporating the sampling information. We show that the proposed bootstrap method is second-order accurate using the Edgeworth ...


Bottom-Up Estimation And Top-Down Prediction: Solar Energy Prediction Combining Information From Multiple Sources, Youngdeok Hwang, Siyuan Lu, Jae Kwang Kim 2019 Sungkyunkwan University

Jae Kwang Kim

Accurately forecasting solar power using the data from multiple sources is an important but challenging problem. Our goal is to combine two different physics model forecasting outputs with real measurements from an automated monitoring network so as to better predict solar power in a timely manner. To this end, we propose a new approach of analyzing large-scale multilevel models with great computational efficiency requiring minimum monitoring and intervention. This approach features a division of the large scale data set into smaller ones with manageable sizes, based on their physical locations, and fit a local model in each area. The local ...


Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we find a pattern of aperiodic seismic signals that precede earthquakes at any time in a laboratory earthquake’s cycle using a small window of time. We use a data set that comes from a classic laboratory experiment having several stick-slip displacements (earthquakes), a type of experiment which has been studied as a simulation of seismologic faults for decades. This data exhibits similar behavior to natural earthquakes, so the same approach may work in predicting the timing of them. Here we show that by applying random forest machine learning technique to the acoustic signal emitted by a ...


Longitudinal Analysis With Modes Of Operation For Aes, Dana Geislinger, Cory Thigpen, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper, we present an empirical evaluation of the randomness of the ciphertext blocks generated by the Advanced Encryption Standard (AES) cipher in Counter (CTR) mode and in Cipher Block Chaining (CBC) mode. Vulnerabilities have been found in the AES cipher that may lead to a reduction in the randomness of the generated ciphertext blocks that can result in a practical attack on the cipher. We evaluate the randomness of the AES ciphertext using the standard key length and NIST randomness tests. We evaluate the randomness through a longitudinal analysis on 200 billion ciphertext blocks using logistic regression and ...


Age Structure Of Horn Fly (Diptera: Muscidae) Populations Estimated By Pterin Concentrations, E. S. Krafsur, A. L. Rosales, J. F. Robison-Cox, J. P. Turner 2019 Iowa State University

Elliot Krafsur

Pterins accumulate in the head capsules of horn flies, Haematobia irritans irritans (L.), as a linear function of time and temperature. Pterin concentrations were used to estimate chronological ages and to establish correlations between chronological age and ovarian development and reproductive success in 12 horn fly populations in 1988 and 1989. Male ages were estimated spectrofluorometrically. There were statistically significant differences between years in population age structure measured by pterins. Survival rates estimated from pterin concentration distributions were consistent with a one-parameter exponential model with constant survival rate. Mean daily survival rates were 0.81 for females and 0.84 ...
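
As a rough illustration of what a one-parameter exponential model with a constant survival rate implies, the sketch below takes the female daily survival rate of 0.81 quoted in the abstract and computes the fraction surviving a given number of days, plus the mean lifetime of the matching continuous-time exponential model. The function names are hypothetical, and this is a worked example under the constant-rate assumption, not the authors' estimation procedure.

```python
import math

def survival_fraction(daily_rate, days):
    """Fraction still alive after `days`, assuming a constant daily survival rate."""
    return daily_rate ** days

def mean_lifetime_days(daily_rate):
    """Mean lifetime of the continuous exponential model S(t) = exp(-lam * t),
    where lam = -ln(daily_rate) so that S(1) equals the daily survival rate."""
    return -1.0 / math.log(daily_rate)

s_female = 0.81  # mean daily survival rate for females, as quoted in the abstract

frac_week = survival_fraction(s_female, 7)   # share surviving one week
mean_life = mean_lifetime_days(s_female)     # expected lifetime in days

print(f"Surviving 7 days: {frac_week:.3f}")
print(f"Mean lifetime: {mean_life:.2f} days")
```

Under these assumptions, a little under a quarter of females survive a full week, and the implied mean lifetime is a bit under five days.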


Classification With The Matrix-Variate-T Distribution, Geoffrey Z. Thompson, Ranjan Maitra, William Q. Meeker, Ashraf Bastawros 2019 Iowa State University

William Q Meeker

Matrix-variate distributions can intuitively model the dependence structure of matrix-valued observations that arise in applications with multivariate time series, spatio-temporal or repeated measures. This paper develops an Expectation-Maximization algorithm for discriminant analysis and classification with matrix-variate t-distributions. The methodology shows promise on simulated datasets or when applied to the forensic matching of fractured surfaces or the classification of functional Magnetic Resonance, satellite or hand gestures images.


A Latent Spatial Piecewise Exponential Model For Interval-Censored Disease Surveillance Data With Time-Varying Covariates And Misclassification, Yaxuan Sun, Chong Wang, William Q. Meeker, Max Morris, Marisa L. Rotolo, Jeffery Zimmerman 2019 Iowa State University

William Q Meeker

Understanding the dynamics of disease spread is critical to achieving effective animal disease surveillance. A major challenge in modeling disease spread is the fact that the true disease status cannot be known with certainty due to the imperfect diagnostic sensitivity and specificity of the tests used to generate the disease surveillance data. Other challenges in modeling such data include interval censoring, relating disease spread to distance between units, and incorporating time-varying covariates, which are the unobserved disease statuses. We propose a latent spatial piecewise exponential model (PEX) with misclassification of events to address the challenges in modeling such disease surveillance ...


Seasonal Warranty Prediction Based On Recurrent Event Data, Qianqian Shan, Yili Hong, William Q. Meeker Jr. 2019 Iowa State University

William Q Meeker

Warranty return data from repairable systems, such as vehicles, usually result in recurrent event data. The non-homogeneous Poisson process (NHPP) model is used widely to describe such data. Seasonality in the repair frequencies and other variabilities, however, complicate the modeling of recurrent event data. Not much work has been done to address the seasonality, and this paper provides a general approach for the application of NHPP models with dynamic covariates to predict seasonal warranty returns. A hierarchical clustering method is used to stratify the population into groups that are more homogeneous than the overall population. The stratification facilitates ...

