Open Access. Powered by Scholars. Published by Universities.®
- Institution
- Kennesaw State University (4)
- University of Kentucky (4)
- Virginia Commonwealth University (3)
- Misericordia University (2)
- University of Nebraska - Lincoln (2)
- Western University (2)
- California Polytechnic State University, San Luis Obispo (1)
- Claremont Colleges (1)
- Illinois State University (1)
- Missouri State University (1)
- Murray State University (1)
- Otterbein University (1)
- SUNY Geneseo (1)
- Southern Methodist University (1)
- Syracuse University (1)
- Technological University Dublin (1)
- University of Arkansas, Fayetteville (1)
- University of Louisville (1)
- University of Massachusetts Amherst (1)
- University of Texas Rio Grande Valley (1)
- University of Washington Tacoma (1)
- Wilfrid Laurier University (1)
- Keyword
- Statistics (3)
- Risk modeling (2)
- Time series (2)
- AUC (1)
- Aggregate loss (1)
- Alpha (1)
- Analytics (1)
- Antimicrobial Resistance (1)
- Bankcard response modeling (1)
- Bayesian Linear Model (1)
- Bayesian inference (1)
- Bayesian modeling (1)
- Bayesian tree-structured Parzen estimator (1)
- Binomial thinning (1)
- Biomarker (1)
- Biometrical genetics (1)
- Bootstrap Calibration (1)
- CART (1)
- CHAID (1)
- CV (1)
- Caloric intake (1)
- Child labor (1)
- Child welfare (1)
- Classification and regression trees (1)
- Classification trees (1)
- Compositional data (1)
- Computational modeling (1)
- Computational neuroscience (1)
- Confidence Interval (1)
- Control systems (1)
- Publication
- Published and Grey Literature from PhD Candidates (4)
- Theses and Dissertations--Statistics (4)
- Theses and Dissertations (3)
- Electronic Thesis and Dissertation Repository (2)
- Student Research Poster Presentations 2020 (2)
- Access*: Interdisciplinary Journal of Student Research and Scholarship (1)
- Annual Symposium on Biomathematics and Ecology Education and Research (1)
- Articles (1)
- CMC Senior Theses (1)
- Department of Statistics: Dissertations, Theses, and Student Work (1)
- Doctoral Dissertations (1)
- Electronic Theses and Dissertations (1)
- English Language Institute (1)
- GREAT Day Posters (1)
- Graduate Theses and Dissertations (1)
- MSU Graduate Theses (1)
- Master's Theses (1)
- Murray State Theses and Dissertations (1)
- SMU Data Science Review (1)
- School of Mathematical and Statistical Sciences Faculty Publications and Presentations (1)
- The Nebraska Educator: A Student-Led Journal (1)
- Theses and Dissertations (Comprehensive) (1)
- Undergraduate Honors Thesis Projects (1)
Articles 1 - 30 of 33
Full-Text Articles in Statistical Models
Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das
Electronic Theses and Dissertations
Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …
Quantifying The Simultaneous Effect Of Socio-Economic Predictors And Build Environment On Spatial Crime Trends, Alfieri Daniel Ek
Graduate Theses and Dissertations
Proper allocation of law enforcement agencies falls under the umbrella of risk terrain modeling (Caplan et al., 2011, 2015; Drawve, 2016), which primarily focuses on crime prediction and prevention by spatially aggregating response and predictor variables of interest. Although mental health incidents demand resource allocation from law enforcement agencies and the city, relatively little emphasis has been placed on building spatial models for mental health incidents. Analyzing spatial mental health events in Little Rock, AR from 2015 to 2018, we found evidence of spatial heterogeneity via Moran’s I statistic. A spatial modeling framework is then built using generalized linear models, …
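As a rough illustration of the Moran's I statistic mentioned in this abstract, the sketch below computes global Moran's I from a list of area-level values and a spatial weight matrix. The function name `morans_i`, the toy counts, and the binary adjacency weights are assumptions made for illustration; they are not taken from the thesis.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over every pair of areas.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical example: four areas along a line, each adjacent to its neighbours.
counts = [1, 2, 3, 4]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat = morans_i(counts, adjacency)
```

Values well above the null expectation of -1/(n-1) suggest that similar values cluster in space; a significance test (for example by permutation) would normally accompany the statistic.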
Incorporating Shear Resistance Into Debris Flow Triggering Model Statistics, Noah J. Lyman
Master's Theses
Several regions of the Western United States utilize statistical binary classification models to predict and manage debris flow initiation probability after wildfires. As the occurrence of wildfires and high-intensity rainfall events increases, so does the frequency with which development occurs in the steep and mountainous terrain where these events arise. The resulting intersection brings with it an increasing need to derive improved results from existing models, or develop new models, to reduce the economic and human impacts that debris flows may bring. Any development or change to these models could also theoretically increase the ease of collection, processing, and …
Statistical Methods With A Focus On Joint Outcome Modeling And On Methods For Fire Science, Da Zhong Xi
Electronic Thesis and Dissertation Repository
Understanding the dynamics of wildfires contributes significantly to the development of fire science. Challenges in the analysis of historical fire data include defining fire dynamics within existing statistical frameworks, modeling the duration and size of fires as joint outcomes, identifying how fires are grouped into clusters of subpopulations, and assessing the effect of environmental variables in different modeling frameworks. We develop novel statistical methods to consider outcomes related to fire science jointly. These methods address these challenges by linking univariate models for separate outcomes through shared random effects, an approach referred to as joint modeling. Comparisons with existing …
Stochastic Analysis And Statistical Inference For SEIR Models Of Infectious Diseases, Andrés Ríos-Gutiérrez, Viswanathan Arunachalam, Anuj Mubayi
Annual Symposium on Biomathematics and Ecology Education and Research
No abstract provided.
Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman
Access*: Interdisciplinary Journal of Student Research and Scholarship
The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …
Interval Estimation Of Proportion Of Second-Level Variance In Multi-Level Modeling, Steven Svoboda
The Nebraska Educator: A Student-Led Journal
Physical, behavioral and psychological research questions often relate to hierarchical data systems. Examples of hierarchical data systems include repeated measures of students nested within classrooms, nested within schools, and employees nested within supervisors, nested within organizations. Applied researchers studying hierarchical data structures should have an estimate of the intraclass correlation coefficient (ICC) for every nested level in their analyses because ignoring even relatively small amounts of interdependence is known to inflate the Type I error rate in single-level models. Traditionally, researchers rely upon the ICC as a point estimate of the amount of interdependency in their data. Recent methods utilizing an …
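For readers unfamiliar with the ICC as a point estimate, the following minimal sketch computes the classical one-way ANOVA estimator for a balanced two-level design. The function name `icc_oneway` and the toy scores are hypothetical, and this is not the interval-estimation method the article goes on to study.

```python
def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups."""
    k = len(groups[0])  # observations per group (balanced design assumed)
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    n_groups = len(groups)
    grand_mean = sum(sum(g) for g in groups) / (n_groups * k)
    group_means = [sum(g) / k for g in groups]
    # Between-group and within-group mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical scores for three classrooms of three students each.
classrooms = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
icc = icc_oneway(classrooms)
```

An ICC near 1 means most of the variance lies between groups, which is the situation in which ignoring the nesting is most damaging.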
A Geochemical And Statistical Investigation Of The Big Four Springs Region In Southern Missouri, Jordan Jasso Vega
MSU Graduate Theses
The Big Four Springs region hosts four major first-order magnitude springs in southern Missouri and northern Arkansas. These springs are Big Spring (Carter County, MO), Greer Spring (Oregon County, MO), Mammoth Spring (Fulton County, AR), and Hodgson Mill Spring (Ozark County, MO). Based on historic dye traces and hydrogeological investigations, these springs drain an area of approximately 1500 square miles and collectively discharge an average of 780 million gallons of water per day. The rocks, from youngest to oldest, that are found in the Big Four Springs region are the Cotter and Jefferson City Dolomite (Ordovician), Roubidoux Formation (Ordovician), Gasconade Dolomite …
Statistical Methodology To Establish A Benchmark For Evaluating Antimicrobial Resistance Genes Through Real Time PCR Assay, Enakshy Dutta
Department of Statistics: Dissertations, Theses, and Student Work
Novel diagnostic tests are usually compared with gold standard tests for evaluating diagnostic accuracy. For assessing antimicrobial resistance (AMR) to bovine respiratory disease (BRD) pathogens, the phenotypic broth microdilution method is used as the gold standard (GS). The objective of the thesis is to evaluate the optimal cycle threshold (Ct) generated by real-time polymerase chain reaction (rtPCR) to genes that confer resistance that will translate to the phenotypic classification of AMR. Data from two different methodologies are assessed to identify the Ct that will discriminate between resistance (R) and susceptibility (S). First, the receiver operating characteristic (ROC) curve was used to determine the …
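One common way to pick a cutoff from a ROC curve is to maximize Youden's J (sensitivity + specificity - 1). The abstract is truncated before it names its criterion, so the sketch below is only an assumed illustration: the function `youden_cutoff`, the rule that a Ct at or below the cutoff is called resistant, and the toy data are all hypothetical.

```python
def youden_cutoff(values, is_positive):
    """Return (cutoff, J), calling a sample positive when value <= cutoff."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, p in zip(values, is_positive) if p and v <= cutoff)
        fp = sum(1 for v, p in zip(values, is_positive) if not p and v <= cutoff)
        # sensitivity = tp / n_pos, specificity = 1 - fp / n_neg
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical Ct values with gold-standard labels (True = resistant).
ct_values = [18, 20, 22, 25, 30, 33, 35]
resistant = [True, True, True, False, True, False, False]
cutoff, j_stat = youden_cutoff(ct_values, resistant)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a cost-weighted rule, can select a different cutoff from the same data.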
Latent Class Models For At-Risk Populations, Shuaimin Kang
Doctoral Dissertations
Clustering Network Tree Data From Respondent-Driven Sampling With Application to Opioid Users in New York City: There is great interest in finding meaningful subgroups of attributed network data. There are many available methods for clustering complete networks. Unfortunately, much network data is collected through sampling and is therefore incomplete. Respondent-driven sampling (RDS) is a widely used method for sampling hard-to-reach human populations based on tracing links in the underlying unobserved social network. The resulting data therefore have a tree structure representing a sub-sample of the network, along with many nodal attributes. In this paper, we introduce an approach to adjust mixture models …
Working Children On Java Island 2017, Yuniarti
English Language Institute
Children's wellbeing has become a global concern, as many children are engaged in the labor force. A small area estimation (SAE) technique, EBLUP under the Fay-Herriot model, is employed to estimate their number in the regencies of Java Island. Statistics have been disaggregated by geographical location (urban/rural) and gender. These statistics are required by the government as the basis for policy making.
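The core of the Fay-Herriot predictor is a shrinkage step: each area's direct survey estimate is pulled toward a regression-based (synthetic) estimate, with less shrinkage where the direct estimate is precise. The sketch below shows only that step and assumes the model variance and the synthetic estimate are already available; in a real EBLUP both are estimated from the data. All names and numbers are hypothetical.

```python
def fay_herriot_predictor(direct, sampling_var, synthetic, model_var):
    """Shrink a direct area estimate toward a regression-synthetic estimate."""
    # gamma is the weight placed on the direct estimate (between 0 and 1).
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1 - gamma) * synthetic


# Hypothetical area: direct estimate 10 with sampling variance 1,
# synthetic estimate 5, model variance 4, so gamma = 0.8.
estimate = fay_herriot_predictor(
    direct=10.0, sampling_var=1.0, synthetic=5.0, model_var=4.0
)
```

Areas with small samples have large sampling variances, so their predictions lean more heavily on the regression component.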
Point Process Modelling Of Objects In The Star Formation Complexes Of The M33 Galaxy, Dayi Li
Electronic Thesis and Dissertation Repository
In this thesis, Gibbs point process (GPP) models are constructed to study the spatial distribution of objects in the star formation complexes of the M33 galaxy. The GPP models circumvent the limitations of the two-point correlation function employed in the current astronomy literature by naturally accounting for the inhomogeneous distribution of these objects. The spatial distribution of these objects serves as a sensitive probe in understanding the star formation process, which is crucial in understanding the formation of galaxies and the Universe. The objects under study include the CO filament structure, giant molecular clouds (GMCs) and young stellar cluster candidates …
484— Modeling Social Distancing Methods And Their Effectiveness In Combating The Spread Of Ebola, Rachel Fair
GREAT Day Posters
Ebola Virus Disease (EVD) is a rare but severe disease that is transmitted among humans through direct contact with, and close proximity to, infected bodily fluids. From 2014 to 2016, West Africa experienced the largest Ebola outbreak ever recorded, infecting over 28,000 people and killing over 11,000. Although the symptoms of EVD are treatable, the disease can be extremely deadly, with an average of 50% of EVD cases resulting in fatality. In areas where healthcare is scarce and vaccinations are not readily available, the practices of social distancing and self-quarantining have been shown to be highly effective in combating the spread of EVD. To …
An Automatic Interaction Detection Hybrid Model For Bankcard Response Classification, Yan Wang, Sherry Ni, Brian Stone
Published and Grey Literature from PhD Candidates
Data mining techniques have numerous applications in bankcard response modeling. Logistic regression has been used as the standard modeling tool in the financial industry because of its almost always satisfactory performance and its interpretability. In this paper, we propose a hybrid bankcard response model, which integrates decision tree-based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect potential variable interactions. Then, in the second stage, these potential interactions serve as additional input variables in logistic regression. The motivation of the proposed hybrid model …
A Two-Stage Hybrid Model By Using Artificial Neural Networks As Feature Construction Algorithms, Yan Wang, Sherry Ni, Brian Stone
Published and Grey Literature from PhD Candidates
We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The hybrid model uses a very simple neural network structure as the new feature construction tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared with the traditional one-stage model in credit customer response classification. It is observed that the proposed two-stage model outperforms the one-stage model in terms of accuracy, the area under the ROC curve, and KS statistic. By creating new …
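Two of the comparison metrics named in this abstract, the area under the ROC curve and the KS statistic, can be computed directly from model scores. The sketch below is a plain-Python illustration with made-up scores; the function name `auc_and_ks` is hypothetical and the code is not the authors' evaluation pipeline.

```python
def auc_and_ks(scores, labels):
    """Return (AUC, KS) for binary labels (1 = responder, 0 = non-responder)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # AUC: chance that a random positive outscores a random negative
    # (ties count as half a win).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    # KS: largest gap between the two classes' cumulative score distributions.
    ks = max(
        abs(
            sum(p >= t for p in pos) / len(pos)
            - sum(n >= t for n in neg) / len(neg)
        )
        for t in set(scores)
    )
    return auc, ks


# Hypothetical scores: three responders followed by three non-responders.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc, ks = auc_and_ks(scores, labels)
```

Both metrics depend only on how the scores rank the two classes, so they are unaffected by any monotone rescaling of the model output.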
Predicting Class-Imbalanced Business Risk Using Resampling, Regularization, And Model Ensembling Algorithms, Yan Wang, Sherry Ni
Published and Grey Literature from PhD Candidates
We aim to develop and improve imbalanced business risk modeling by jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross-validation. Two undersampling strategies including random undersampling (RUS) and cluster centroid undersampling (CCUS), as well as two oversampling methods including random oversampling (ROS) and Synthetic Minority Oversampling Technique (SMOTE), are applied. Three highly interpretable classifiers, including logistic regression without regularization (LR), L1-regularized LR (L1LR), and decision tree (DT) are implemented. Two ensembling techniques, including Bagging and Boosting, are …
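The two simplest resampling strategies named here, RUS and ROS, can be sketched in a few lines. The function names and toy data below are hypothetical; CCUS and SMOTE involve clustering and interpolation and are not shown.

```python
import random


def random_undersample(majority, minority, rng):
    """RUS: keep a random subset of the majority class, matched in size."""
    return rng.sample(majority, len(minority)), list(minority)


def random_oversample(majority, minority, rng):
    """ROS: duplicate random minority rows until the classes are balanced."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


# Hypothetical data: 90 majority-class rows and 10 minority-class rows.
rng = random.Random(0)
majority_rows = list(range(90))
minority_rows = list(range(100, 110))
rus_major, rus_minor = random_undersample(majority_rows, minority_rows, rng)
ros_major, ros_minor = random_oversample(majority_rows, minority_rows, rng)
```

Resampling is normally applied to the training folds only, so that cross-validated AUC is still measured on data with the original class balance.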
A XGBoost Risk Model Via Feature Selection And Bayesian Hyper-Parameter Optimization, Yan Wang, Sherry Ni
Published and Grey Literature from PhD Candidates
This paper aims to explore models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimizations are simultaneously considered during model training. The five most commonly used FS methods including weight by Gini, weight by Chi-square, hierarchical variable clustering, weight by correlation, and weight by information are applied to alleviate the effect of redundant features. Two hyper-parameter optimization approaches, random search (RS) and Bayesian tree-structured Parzen Estimator (TPE), are applied in XGBoost. The effects of different FS and hyper-parameter optimization methods on the model performance are investigated by the Wilcoxon Signed Rank …
Quantitative Model For Setting Manufacturer's Suggested Retail Price, Peter Byrd, Jonathan Knowles, Dmitry Andreev, Jacob Turner, Brian Mente, Laroux Wallace
SMU Data Science Review
In this paper, we present a quantitative approach to model the manufacturer’s suggested retail price (MSRP) for children’s dollhouses and establish relationships among key features that contribute most to establishing MSRP. Determination of the MSRP is a critical step in how consumers respond with their wallets when purchasing an item. KidKraft, a global leader in toys and juvenile products, sets MSRP subjectively using product experts. The process is arduous and time-consuming, requiring the focus of specialized resources and knowledge of the interaction between key attributes and their impact on consumer value. An accurate prediction of MSRP during the …
Evaluating An Ordinal Output Using Data Modeling, Algorithmic Modeling, And Numerical Analysis, Martin Keagan Wynne Brown
Murray State Theses and Dissertations
Data and algorithmic modeling are two different approaches used in predictive analytics. The models discussed from these two approaches include the proportional odds logit model (POLR), the vector generalized linear model (VGLM), the classification and regression tree model (CART), and the random forests model (RF). Patterns in the data were analyzed using trigonometric polynomial approximations and Fast Fourier Transforms. Predictive modeling is used frequently in statistics and data science to find the relationship between the explanatory (input) variables and a response (output) variable. Both approaches prove advantageous in different cases depending on the data set. In our case, the data …
Power Analysis On A Pilot Study Of The Caloric Intake Of Children Helping Prepare Meals Versus Children Not, Danielle Clifford
Student Research Poster Presentations 2020
The purpose of this analysis is to determine the sample size needed for a study that will be used to discover if there is a difference in the caloric intake of children who help with meal preparation and children who do not help with meal preparation.
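A sample-size calculation of this kind is often done with the normal-approximation formula for comparing two group means. The sketch below assumes a two-sided test, equal group sizes, and a standardized effect size (Cohen's d); the poster does not state its method, so the function name, defaults, and effect size are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical medium effect (d = 0.5) at 5% significance and 80% power.
needed = n_per_group(0.5)
```

An exact calculation based on the t distribution gives a slightly larger n, so this approximation is best treated as a lower bound when planning.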
Predicting Diabetes Diagnoses, Sarah Netchert
Student Research Poster Presentations 2020
This study explored the traits and health state of African Americans in central Virginia in order to determine what traits put people at a higher probability of being diagnosed with diabetes. We also want to know which traits will generate the highest probability that a person will be diagnosed with diabetes. Traits that were included and used in this study were cholesterol, stabilized glucose, high density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). There were 403 individuals included in the study since they were the only ones screened for diabetes out of 1,046 …
The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling
Theses and Dissertations
Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.
Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …
Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero
Theses and Dissertations
Compositional data (CD) is mostly analyzed as relative data, using ratios of components, and log-ratio transformations to be able to use known multivariable statistical methods. Therefore, CD where some components equal zero represent a problem. Furthermore, when the data is measured longitudinally, observations are spatially related and appear to come from a mixture population, the analysis becomes highly complex. For this matter, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might a two component mixture …
Modelling Interactions Among Offenders: A Latent Space Approach For Interdependent Ego-Networks, Isabella Gollini, Alberto Caimo, Paolo Campana
Articles
Illegal markets are notoriously difficult to study. Police data offer an increasingly exploited source of evidence. However, their secondary nature poses challenges for researchers. A key issue is that researchers often have to deal with two sets of actors: targeted and non-targeted. This work develops a latent space model for interdependent ego-networks purposely created to deal with the targeted nature of police evidence. By treating targeted offenders as egos and their contacts as alters, the model (a) leverages the full information available and (b) mirrors the specificity of the data collection strategy. The paper then applies this approach to …
Statistical Analysis Of Demographic Effects On Insurance Coverage Of Perinatal And Neonatal Morbidity, Madeline Durbin
Undergraduate Honors Thesis Projects
In the United States of America, Ohio has one of the worst neonatal and perinatal death rates. Within Ohio, Montgomery County has an above average neonatal and perinatal death rate. This statistic can be lowered if more women in Montgomery County have health insurance. They would be more likely to seek out prenatal health care, since they would no longer have to pay as much money out-of-pocket. This would allow medical professionals to be able to diagnose and treat any potential issues in the mother or child earlier. Having health insurance would also prevent mothers-to-be from seeking out other potentially …
Semiparametric And Nonparametric Methods For Comparing Biomarker Levels Between Groups, Yuntong Li
Theses and Dissertations--Statistics
Comparing the distribution of biomarker measurements between two groups under either an unpaired or paired design is a common goal in many biomarker studies. However, analyzing biomarker data is sometimes challenging because the data may not be normally distributed and may contain a large fraction of zero values or missing values. Although several statistical methods have been proposed, they either require a data normality assumption or are inefficient. We proposed a novel two-part semiparametric method for data under an unpaired setting and a nonparametric method for data under a paired setting. The semiparametric method considers a two-part model, a logistic regression for …
Phenotype Extraction: Estimation And Biometrical Genetic Analysis Of Individual Dynamics, Kevin L. Mckee
Theses and Dissertations
Within-person data can exhibit a virtually limitless variety of statistical patterns, but it can be difficult to distinguish meaningful features from statistical artifacts. Studies of complex traits have previously used genetic signals like twin-based heritability to distinguish between the two. This dissertation is a collection of studies applying state-space modeling to conceptualize and estimate novel phenotypic constructs for use in psychiatric research and further biometrical genetic analysis. The aims are to: (1) relate control theoretic concepts to health-related phenotypes; (2) design statistical models that formally define those phenotypes; (3) estimate individual phenotypic values from time series data; (4) consider hierarchical …
Aggregate Loss Model With Poisson-Tweedie Loss Frequency, Si Chen
Theses and Dissertations (Comprehensive)
The aggregate loss model has applications in various areas such as financial risk management and actuarial science. The aggregate loss is the sum of all random losses occurring in a period, and it is governed by both the loss severity and the loss frequency. While the impact of the loss severity on aggregate loss is well studied, less attention has been paid to the influence of loss frequency on aggregate loss, which motivates our study. In this thesis, we enrich the aggregate loss framework by introducing the Poisson-Tweedie distribution as a candidate for modelling loss frequency, prove the closedness of Poisson-Tweedie …
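The frequency-severity structure described here can be illustrated by simulation. The sketch below swaps in a plain Poisson frequency and exponential severities for simplicity; it does not implement the Poisson-Tweedie frequency proposed in the thesis, and every name and parameter value is a hypothetical choice.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw a Poisson count with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_aggregate_loss(lam, mean_severity, n_periods, rng):
    """Simulate S = X_1 + ... + X_N for each period, with N ~ Poisson(lam)."""
    totals = []
    for _ in range(n_periods):
        n_losses = poisson_draw(lam, rng)
        # Severities are exponential with the given mean (an assumption).
        totals.append(
            sum(rng.expovariate(1 / mean_severity) for _ in range(n_losses))
        )
    return totals


# Hypothetical portfolio: 3 losses per period on average, mean severity 100.
rng = random.Random(1)
totals = simulate_aggregate_loss(3.0, 100.0, 20000, rng)
mean_total = sum(totals) / len(totals)
```

With independent frequency and severity, the expected aggregate loss is the product of the two means (here 3 × 100 = 300), which gives a quick sanity check on the simulated average.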
Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu
Theses and Dissertations--Statistics
A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem into a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …
Statistical Intervals For Various Distributions Based On Different Inference Methods, Yixuan Zou
Theses and Dissertations--Statistics
Statistical intervals (e.g., confidence, prediction, or tolerance) are widely used to quantify uncertainty, but complex settings can create challenges to obtain such intervals that possess the desired properties. My thesis will address diverse data settings and approaches that are shown empirically to have good performance. We first introduce a focused treatment on using a single-layer bootstrap calibration to improve the coverage probabilities of two-sided parametric tolerance intervals for non-normal distributions. We then turn to zero-inflated data, which are commonly found in, among other areas, pharmaceutical and quality control applications. However, the inference problem often becomes difficult in the presence of …