Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 31

Full-Text Articles in Entire DC Network

The Strong Law Of Large Numbers For U-Statistics Under Random Censorship, Jan Höft Dec 2018

Theses and Dissertations

We introduce a semi-parametric U-statistics estimator for randomly right censored data. We will study the strong law of large numbers for this estimator under proper assumptions about the conditional expectation of the censoring indicator with respect to the observed life times. Moreover, we will conduct simulation studies, where the semi-parametric estimator is compared to a U-statistic based on the Kaplan-Meier product limit estimator in terms of bias, variance and mean squared error, under different censoring models.
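
For readers unfamiliar with the Kaplan-Meier product limit estimator named in the abstract above, a minimal sketch in Python may help. It computes the survival estimate at each distinct event time for right-censored data. The function name `kaplan_meier` and the toy data are illustrative assumptions and are not taken from the thesis.

```python
def kaplan_meier(times, events):
    """Product limit estimate of the survival function.

    times  : observed times (event or censoring time, whichever came first)
    events : 1 if the event was observed, 0 if the observation was right-censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical toy data: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(times, events)
```

Censored observations never trigger a drop in the curve, but they do shrink the risk set for later event times.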


Network Analysis Of Scientific Collaboration And Co-Authorship Of The Trifecta Of Malaria, Tuberculosis And Hiv/Aids In Benin., Gbedegnon Roseric Azondekon Aug 2018

Theses and Dissertations

Despite international mobilization and increased research funding, Malaria, Tuberculosis and HIV/AIDS are three infectious diseases that have claimed more lives in sub-Saharan Africa than in any other place in the world. Consortia, research networks and research centers both in Africa and around the world team up in a multidisciplinary and transdisciplinary approach to boost efforts to curb these diseases. Despite the progress in research, very little is known about the dynamics of research collaboration in the fight against these infectious diseases in Africa, resulting in a lack of information on the relationship between African research collaborators. This dissertation …


Applying Conditional Distributions To Individuals: Using Latent Variable Models, Feng Ji Jun 2018

Theses and Dissertations

This study proposes a new method to interpret individual results of psychological test batteries. The Mahalanobis distance is a commonly-used measure of how unusual an individual’s profile of scores is compared to a population of score profiles. In models in which there is a set of predictors and a set of dependent variables (e.g., cognitive abilities predicting academic abilities), it is useful to distinguish between a profile of dependent scores that is unusual because its profile of predictor scores is unusual and a profile of dependent scores that is unusual even after controlling for the predictors. The conditional Mahalanobis distance …
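
As a concrete illustration of the (unconditional) Mahalanobis distance described above, the following minimal sketch computes it for two score profiles. The two-test battery, its standard-score scale (mean 100, SD 15) and the correlation of 0.6 are assumptions made up for this example. The conditional variant proposed in the study is not reproduced here.

```python
import numpy as np


def mahalanobis_distance(x, mean, cov):
    """Mahalanobis distance of profile x from a population with the given mean and covariance."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # Solve cov @ z = diff instead of forming the explicit inverse of cov.
    z = np.linalg.solve(np.asarray(cov, dtype=float), diff)
    return float(np.sqrt(diff @ z))


# Hypothetical population: two tests, SD 15 each, correlation 0.6.
mean = np.array([100.0, 100.0])
cov = np.array([[225.0, 135.0],
                [135.0, 225.0]])

# Both scores one SD above the mean (consistent with the positive correlation).
d_consistent = mahalanobis_distance([115.0, 115.0], mean, cov)
# One score one SD above, the other one SD below (runs against the correlation).
d_discrepant = mahalanobis_distance([115.0, 85.0], mean, cov)
```

Both profiles are equally far from the mean in Euclidean terms, yet the discrepant profile has the larger Mahalanobis distance because it runs against the correlation between the two tests.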


Robust Estimation Of Parametric Models For Insurance Loss Data, Chudamani Poudyal Jun 2018

Theses and Dissertations

Parametric statistical models for insurance claims severity are continuous, right-skewed, and frequently heavy-tailed. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. Moreover, due to commonly used actuarial “loss control strategies,” the random variables we observe and wish to model are affected by truncation (due to deductibles), censoring (due to policy limits), scaling (due to coinsurance proportions) and other transformations. In the current practice, statistical inference for loss models is almost exclusively likelihood (MLE) based, which typically results in non-robust parameter estimators, pricing models, and risk measures. …


Determining The Relationship Between Aging And Oxidative Stress In A Drosophila Melanogaster P38 Kb Framework, Aleksandra J. Majewski May 2018

Theses and Dissertations

Aging is inevitable for all organisms and can be characterized by degeneration of tissue, a decrease in motor function, and impaired stress response. In humans, it is often accompanied by an increased propensity for age-related diseases. While all adults experience biological aging (senescence), not all adults experience age-associated disease. Thus, we claim these are not normal prospects of aging. Although the implications of aging are well understood, the molecular underpinnings for these processes remain elusive. As advances in medical science have been successful at prolonging lifespan, they concurrently extend the amount of time spent in diseased states. If we wish …


Fitting A Complex Markov Chain Model For Firm And Market Productivity, Julia Ruth Valder May 2018

Theses and Dissertations

This thesis develops a methodology for estimating parameters of a complex Markov chain model for firm productivity. The model consists of two Markov chains, one describing firm-level productivity and the other modeling the productivity of the whole market. If applicable, the model can be used to help with optimal decision making problems for labor demand. The need for such a model is motivated and the economic background of this research is shown. A brief introduction to the concept of Markov chains and their application in this context is given. The simulated data that is being used for the estimation is …


Calibration Of A Stochastic Price Model For American Electricity Markets, Oliver G. Meister May 2018

Theses and Dissertations

This thesis discusses models for electricity spot prices from the Midwestern American and Manitoba markets. The models are based on experiences in European markets and rely on a superposition model with several jump components. The methodology of Bayesian Inference solved with a Markov chain Monte Carlo algorithm has been applied to find estimators for the processes of the model. The specific Markov chain Monte Carlo algorithm applied is a Random Walk Metropolis combined with a Gibbs sampler. The different estimators of the models are evaluated with the posterior predictive value and simulations of the electricity spot prices.

We have modified this …
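
To illustrate the Random Walk Metropolis ingredient mentioned in this abstract, the following is a minimal sketch that samples from a one-dimensional target given only its log density. It is not the thesis's algorithm: the Gibbs step and the superposition price model are omitted, and the standard normal target, the step size and the function name `random_walk_metropolis` are assumptions chosen for the example.

```python
import math
import random


def random_walk_metropolis(log_density, start, step, n_samples, seed=0):
    """Draw n_samples from the density proportional to exp(log_density(x))."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian random-walk proposal centred at the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


# Hypothetical target: standard normal, known only up to a constant.
samples = random_walk_metropolis(lambda x: -0.5 * x * x, start=0.0,
                                 step=1.0, n_samples=20000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

Because the proposal is symmetric, the acceptance ratio needs only the ratio of target densities, which is why an unnormalized log density suffices.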


Clustering Biological Data With Self-Adjusting High-Dimensional Sieve, Josselyn Gonzalez Apr 2018

Theses and Dissertations

Data classification as a preprocessing technique is a crucial step in the analysis and understanding of numerical data. Cluster analysis, in particular, provides insight into the inherent patterns found in data which makes the interpretation of any follow-up analyses more meaningful. A clustering algorithm groups together data points according to a predefined similarity criterion. This allows the data set to be broken up into segments which, in turn, gives way for a more targeted statistical analysis. Cluster analysis has applications in numerous fields of study and, as a result, countless algorithms have been developed. However, the quantity of options makes …


The Impact Of Changing Requirements, James C. Ellis Mar 2018

Theses and Dissertations

The fundamental purpose of an Engineering Change Proposal (ECP) is to change the requirements of a contract. To build in flexibility, the acquisition practice is to estimate a dollar value to hold in reserve after the contract is awarded. There appears to be no empirical-based method for estimating this ECP withhold in the literature. Using the Cost Assessment Data Enterprise (CADE) database, 533 contracts were randomly selected to build two regression models: one to predict the likelihood of a contract experiencing an ECP, and the other to determine the expected median percent increase in baseline contract cost if an ECP …


Satellite Communications In The V And W Band: Tropospheric Effects, Bertus A. Shelters Mar 2018

Theses and Dissertations

An investigation into the use of Weather Cubes compiled by the atmospheric characterization package, Laser Environmental Effects Definition and Reference (LEEDR), to develop accurate, long-term attenuation statistics for link-budget analysis is presented. A Weather Cube is a three-dimensional mesh of numerical weather prediction (NWP) data plus LEEDR calculations that allows for the quantification of rain, cloud, aerosol, and molecular effects at any UV to RF wavelength on any path contained within the cube. The development of this methodology is motivated by the potential use of V (40-75 GHz) and W (75-110 GHz) band frequencies for the satellite communication application, as …


Analysis Of A Voting Method For Ranking Network Centrality Measures On A Node-Aligned Multiplex Network, Kyle S. Wilkinson Mar 2018

Theses and Dissertations

Identifying relevant actors using information gleaned from multiple networks is a key goal within the context of human aspects of military operations. The application of a voting theory methodology for determining nodes of critical importance—in ranked order of importance—for a node-aligned multiplex network is demonstrated. Both statistical and qualitative analyses of the differences in ranking outcomes under this methodology are provided. As a corollary, a multilayer network reduction algorithm is investigated within the context of the proposed ranking methodology. The application of the methodology detailed in this thesis will allow meaningful rankings of relevant actors to be produced on a …


Modeling Multimodal Failure Effects Of Complex Systems Using Polyweibull Distribution, Daniel A. Timme Mar 2018

Theses and Dissertations

The Department of Defense (DoD) enlists multiple complex systems across each of their departments. Between the aging systems going through an overhaul and emerging new systems, quality assurance to complete the mission and secure the nation’s objectives is an absolute necessity. The U.S. Air Force’s increased interest in Remotely Piloted Aircraft (RPA) and the Space Warfighting domain are current examples of complex systems that must maintain high reliability and sustainability in order to complete missions moving forward. DoD systems continue to grow in complexity with an increasing number of components and parts in more complex arrangements. Bathtub-shaped hazard functions arise …


Looking Past The Spark To Find The Fuel Of The Arab Spring Fire, Luke M. Brantley Mar 2018

Theses and Dissertations

The field of statistical conflict prediction addresses region-wide analysis in eras of stable conflict and peace. This study improves upon those prediction rates in times of volatile conflict and peace seen during the Arab Spring of 2011 to 2015. During this time, higher rates of conflict transition in certain Middle Eastern and North African countries occurred than normally observed in previous studies. Due to the fact that previous prediction models decrease in accuracy during times of volatile conflict transition and since the proper strategy for handling the Arab Spring has been highly debated, this study considers alterations to previous studies …


Characterization Of Ambient Noise, Rachel C. Ramirez Mar 2018

Theses and Dissertations

An Air Force sponsor is interested in improving an acoustic detection model by providing better estimates on how to characterize the background noise of various environments. This would inform decision makers on the probability of acoustic detection of different systems of interest given different levels of noise. Data mining and statistical learning techniques are applied to a National Park Service acoustic summary data set to find overall trends over varying environments. Linear regression, conditional inference trees, and random forest techniques are discussed. Findings indicate only sixteen geospatial variables at different resolutions are necessary to characterize the first ten ⅓ octave …


Forecasting Country Conflict Within Modified Combatant Command Regions Using Statistical Learning Methods, Sarah Neumann Mar 2018

Theses and Dissertations

Conflict forecasts are crucial to Combatant Commanders’ understanding of the dynamic environment encompassing countries within their area of responsibility. The current structure of the Combatant Commands (COCOMs) is rooted in geography by grouping nations in geographic proximity to the same regional command. However, leaders today question the effectiveness of the current structure. A novel modified k-means clustering algorithm is developed and implemented that groups countries based on data similarities and geographic proximity resulting in new COCOM groupings that improve conflict forecasts. The data spans various political, military, economic, and social characteristics of countries, and is used to develop conditional logistic …


They're Only Nuclear Weapons: An Exploratory Analysis Of Safety Climate Within The Nuclear Enterprise, Brandon M. Clements Mar 2018

Theses and Dissertations

By possessing nuclear weapons, the United States Air Force is inherently exposed to extreme safety concerns. With multiple setbacks in recent years (e.g., unauthorized transport of nuclear weapons, cheating scandals, and career dissatisfaction), some have begun to wonder how safe the nuclear enterprise truly is. Building upon the concept of safety climate, this study explores safety climate constructs and trends associated with current nuclear maintenance safety climate survey data.


Adjusting For Mis-Reporting In Count Data, Gelareh Rahimighazikalayeh Jan 2018

Theses and Dissertations

Any counting system is prone to recording errors including underreporting and overreporting. Ignoring the misreporting pattern in count data can give rise to bias in the estimation of model parameters. Accordingly, Poisson, negative binomial and generalized Poisson regression have been expanded in some instances to capture reporting biases. However, to our knowledge, no program has been developed to allow users to apply all of these models when needed. In the first part of the dissertation, we review the available models for underreported counts and develop a Stata command to estimate Poisson, negative binomial and generalized Poisson regression models for underreported …


Goodness Of Fit Via Residual Plots In Item Response Theory, Bryonna Bowen Jan 2018

Theses and Dissertations

Goodness-of-fit criteria developed for the evaluation of item response functions have been examined by many scholars using different theories and criteria. A number of potential graphical analysis approaches, such as residual plots, have been described in the literature, but have received little attention from researchers. While many tests of goodness-of-fit are available, those that incorporate the analysis of residuals may be most useful. The unmistakable presence of a pattern in the residual plot for the logistic model item response functions, even when we know the model fits, raises a red flag and calls for greater analysis. This study explores different …


Discovery Of Community Structures In Static And Dynamic Networks, Shiwen Shen Jan 2018

Theses and Dissertations

With the development of computer technology, researchers are able to observe and collect enormous amounts of data, where the independent and identically distributed assumption is violated. For example, in sociology, individuals in an organization interact with each other to change the underlying social structure; in biology, understanding the gene-gene interaction helps researchers to detect potential diseases; in politics, voters are mutually influenced before the election via private/public speeches and parades, which might ultimately change the election results. It is crucial to study how individuals interact with each other from the data, which would lead to tremendous contributions to society. …


Semiparametric Regression In The Presence Of Measurement Error, Xiang Li Jan 2018

Theses and Dissertations

The error-in-covariates problem has received great attention among researchers who study semiparametric and nonparametric inference for regression models over the past two decades. Without correcting for the measurement error in covariates, estimators for covariate effect usually contain bias. To account for measurement error, much research has been done in mean regression (Liang et al., 1999; Fuller, 2009; Carroll et al., 2006) and quantile regression (He and Liang, 2000; Hardle et al., 2000; Wei and Carroll, 2009). In contrast, there is little research in mode regression and this motivates us to propose semiparametric methods to address this error-in-covariates problem in Chapters …


A Rotatable Asymmetric Variable Compensation Mirt Model, Xinchu Zhao Jan 2018

Theses and Dissertations

The purpose of this study is to develop, estimate, and interpret a new variable compensation multidimensional item response theory (MIRT) model, named the Rotatable Asymmetric Variable Compensation Model (RAVCM), that allows for transformation between different correlation structures. Since the model is rotatable like the common compensatory models (CM), it is not necessary to specify or estimate the correlation of abilities to recover the model. Also, it can approximate the existing MIRT models well. In simulation, the RAVCM is shown to estimate the parameters with small error, especially when the non-compensatory model (NCM) is the true model and the correlation of …


Classification Of High-Dimensional Data Based On Multiple Testing Methods, Chong Ma Jan 2018

Theses and Dissertations

Supervised and unsupervised classification are common topics in machine learning in both scientific and industrial fields, which usually involve three tasks: prediction, exploration, and explanation. False discovery rate (FDR) theory has a close connection to classical classification theory, which must be employed in a sophisticated way to achieve good performance in various contexts. The study aims to explore novel supervised classifiers and unsupervised classification approaches for functional data and high-dimensional data in genome study by using FDR, respectively. One work develops a novel classifier for functional data by casting the classification problem into a multiple testing task, which involves using …


Comparison Of The Performance Of Simple Linear Regression And Quantile Regression With Non-Normal Data: A Simulation Study, Marjorie Howard Jan 2018

Theses and Dissertations

Linear regression is a widely used method for analysis that is well understood across a wide variety of disciplines. In order to use linear regression, a number of assumptions must be met. These assumptions, specifically normality and homoscedasticity of the error distribution, can at best be met only approximately with real data. Quantile regression requires fewer assumptions, which offers a potential advantage over linear regression. In this simulation study, we compare the performance of linear (least squares) regression to quantile regression when these assumptions are violated, in order to investigate under what conditions quantile regression becomes the more advantageous method …


Estimation Procedures For Complex Survival Models And Their Applications In Epidemiology Studies, Jie Zhou Jan 2018

Theses and Dissertations

In this dissertation, we aim to address three important questions in practice, which can be solved through complex survival models. The first project focuses on studying the longitudinal fitness effect on cardiovascular disease (CVD) mortality. In the second project, we study the disease-death relation between CVD and all-cause mortality and evaluate important covariate effects on the disease or death transitions. In the third project, we compare antiretroviral treatment (ART) for HIV patients and consider both treatment effect and side effect of the drugs. The first two projects are motivated by the Aerobics Center Longitudinal Study (ACLS) datasets and the third …


The South Carolina Safety Belt Study: Large-Scale Location Sampling, Stephanie Jones Jan 2018

Theses and Dissertations

The South Carolina Safety Belt Study is a statewide survey completed yearly to assess the prevalence of safety belt usage on South Carolina roads through observations from different locations across the state. Every five years the sites for observation are resampled. This thesis breaks down the most recent sampling done for the years of 2018 through 2022. Both the methodology of large scale location sampling and the mathematical idea behind the strategy employed are covered. Further, three different software packages were utilized: R, SAS, and ArcGIS. The steps that were taken and the written function code run for each …


Semiparametric Statistical Estimation And Inference With Latent Information, Qianqian Wang Jan 2018

Theses and Dissertations

In Chapter 1, we predicted disease risk by transformation models in the presence of missing subgroup identifiers. When a discrete covariate defining subgroup membership is missing for some of the subjects in a study, the distribution of the outcome follows a mixture distribution of the subgroup-specific distributions. Taking into account the uncertain distribution of the group membership and the covariates, we model the relation between the disease onset time and the covariates through transformation models in each sub-population, and develop a nonparametric maximum likelihood based estimation implemented through the EM algorithm along with its inference procedure. We further propose methods to …


Bayesian Semiparametric Methods For Analyzing Panel Count Data, Jianhong Wang Jan 2018

Theses and Dissertations

Panel count data commonly arise in epidemiological, social science, and medical studies, in which subjects have repeated measurements on the recurrent events of interest at different observation times. Since the subjects are not under continuous monitoring, the exact times of those recurrent events are not observed but the counts of such events within the adjacent observation times are known. Panel count data can be considered as a special type of longitudinal data with a count response variable in the literature. Compared to the frequentist literature, very limited Bayesian approaches have been developed to analyze panel count data. In this dissertation, several …


Dimension Reduction For Classification With Many Covariates And Pathway Activity Level Estimation, Seungchul Baek Jan 2018

Theses and Dissertations

The development of science and technology has enabled the use of more covariates. As a result, it has become more difficult to identify dependencies among many covariates. Dimension reduction provides an efficient way to handle this issue by summarizing the effect of covariates via a few linear combinations of covariates. In this dissertation, two methodologies for real-life problems are provided by using dimension reduction equipped with semiparametric theory. The use of semiparametrics allows maximal flexibility of the model by leaving some features of the model completely unspecified, while we still enjoy the interpretability of the model through estimating the parameters …


Examining The Confirmatory Tetrad Analysis (Cta) As A Solution Of The Inadequacy Of Traditional Structural Equation Modeling (Sem) Fit Indices, Hangcheng Liu Jan 2018

Theses and Dissertations

Structural Equation Modeling (SEM) is a framework of statistical methods that allows us to represent complex relationships between variables. SEM is widely used in economics, genetics and the behavioral sciences (e.g. psychology, psychobiology, sociology and medicine). Model complexity is defined as a model’s ability to fit different data patterns and it plays an important role in model selection when applying SEM. As in linear regression, the number of free model parameters is typically used in traditional SEM model fit indices as a measure of the model complexity. However, using only the number of free model parameters to indicate SEM model complexity …


Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry Jan 2018

Theses and Dissertations

The Brisbane Longitudinal Twin Study (BLTS) was being conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and …