Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Graduate Theses and Dissertations

Pure sciences

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

A Bayesian Variable Selection Method With Applications To Spatial Data, Xiahan Tang May 2017

Graduate Theses and Dissertations

This thesis first describes the general idea behind Bayesian inference, various sampling methods based on Bayes' theorem, and many examples. Then a Bayesian approach to model selection, called Stochastic Search Variable Selection (SSVS), is discussed. It was originally proposed by George and McCulloch (1993). In a normal regression model where the number of covariates is large, only a small subset tends to be significant most of the time. This Bayesian procedure specifies a mixture prior for each of the unknown regression coefficients; the mixture prior was originally proposed by Geweke (1996). This mixture prior will be updated as data becomes …
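For orientation, a mixture prior of the kind this abstract mentions is often written in the George and McCulloch form below. This is a sketch only: the symbols $\gamma_i$, $\tau_i$, $c_i$, and $p_i$ are not taken from the excerpt, and the truncated abstract does not say which variant the thesis adopts (Geweke's version replaces the narrow component with a point mass at zero).

```latex
% Latent indicator \gamma_i = 1 means covariate i is "in" the model.
% \tau_i is small (coefficient effectively zero); c_i > 1 widens the slab.
\beta_i \mid \gamma_i \;\sim\; (1-\gamma_i)\, N\!\left(0, \tau_i^{2}\right)
        \;+\; \gamma_i\, N\!\left(0, c_i^{2}\tau_i^{2}\right),
\qquad
P(\gamma_i = 1) \;=\; 1 - P(\gamma_i = 0) \;=\; p_i .
```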


Analysis Of Break-Points In Financial Time Series, Jean Remy Habimana Dec 2016

Graduate Theses and Dissertations

A time series is a set of random values collected at equal time intervals; this randomness makes such series hard to predict because their structure may change at any time. As discussed in previous research, the structure of a time series may change at any time due to a change in the mean and/or variance of the series. Consequently, it is wise not to assume that these series are stationary. This paper discusses a method of analyzing time series by treating the entire series as non-stationary, assuming there is random change in unconditional …


Monte Carlo Methods In Bayesian Inference: Theory, Methods And Applications, Huarui Zhang Dec 2016

Graduate Theses and Dissertations

Monte Carlo methods are becoming more and more popular in statistics due to the fast development of efficient computing technologies. One of the major beneficiaries of this advent is the field of Bayesian inference. The aim of this thesis is two-fold: (i) to explain the theory justifying the validity of the simulation-based schemes in a Bayesian setting (why they should work) and (ii) to apply them in several different types of data analysis that a statistician has to routinely encounter. In Chapter 1, I introduce key concepts in Bayesian statistics. Then we discuss Monte Carlo Simulation methods in detail. Our …


Risk Estimation Toward A Natural History Model For Low Grade Glioma Patients, Anh Thi Hoang Pham May 2016

Graduate Theses and Dissertations

Glioma is a common type of primary brain tumor that represents 28% of all brain tumors and 80% of malignant tumors. According to a recent study by the Centers for Disease Control and Prevention (CDC), gliomas account for 53%, 35% and 29% of all brain tumors (68%, 74% and 81% of malignant brain tumors) among children (aged 0-14), teenagers (aged 15-19) and young adults, respectively. Gliomas are often diagnosed through radiological imaging and histopathology. There are two main groups of gliomas following World Health Organization’s classification: Low grade gliomas (LGG), or grade I and II gliomas; and high grade gliomas …


Spread Trading In Corn Futures Market, Ryan D. Napier May 2016

Graduate Theses and Dissertations

The non-linear relationship between old crop – new crop year spreads in the corn futures market and stock-to-use (S-U) ratios published by the United States Department of Agriculture is analyzed. Using a non-linear logarithmic smooth transition regression (LSTR) model, we capture asymmetric market behaviors in high and low S-U regimes. Capturing this relationship and understanding its non-linear aspects is of interest to grain merchandizers and speculators in the market. A spread trading strategy is simulated for the sample period, January 1985 through April 2015, to determine whether the non-linear relationship offers a profitable arbitrage opportunity in the market.


Statistical Modeling Of The Temporal Dynamics In A Large Scale-Citation Network, Luis Javier Ek Jr. May 2016

Graduate Theses and Dissertations

Citation networks of papers are vast networks that grow over time. The way a citation network grows is not entirely random but follows a preferential attachment relationship: highly cited papers are more likely to be cited by newly published papers. The result is a network whose degree distribution follows a power law. This growth of a citation network of papers will be modeled with a negative binomial regression coupled with a logistic growth and/or Cauchy distribution curve. Then a Barabasi-Albert model, based on the negative binomial models, and a combination of the Dirichlet distribution and multinomial will be …
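The preferential attachment idea in this abstract can be illustrated with a small simulation. This is a generic Barabasi-Albert-style sketch, not the thesis's model: the function name `simulate_citations`, the `refs_per_paper` parameter, and the "citations + 1" weighting are all assumptions made for illustration.

```python
import random


def simulate_citations(n_papers, refs_per_paper=3, seed=0):
    """Grow a toy citation network by preferential attachment.

    Each new paper cites `refs_per_paper` distinct earlier papers, chosen
    with probability proportional to (citations received + 1).
    Returns a list of citation counts, indexed by paper.
    """
    rng = random.Random(seed)
    # Start with a few uncited seed papers so the first new paper has targets.
    citations = [0] * refs_per_paper
    # The urn holds one entry per paper plus one entry per citation received,
    # so drawing uniformly from it is a draw proportional to (citations + 1).
    urn = list(range(refs_per_paper))
    for new_paper in range(refs_per_paper, n_papers):
        cited = set()
        while len(cited) < refs_per_paper:
            cited.add(rng.choice(urn))
        for paper in sorted(cited):
            citations[paper] += 1
            urn.append(paper)
        citations.append(0)
        urn.append(new_paper)
    return citations


counts = simulate_citations(2000)
print("most cited paper has", max(counts), "citations")
print("papers never cited:", sum(1 for c in counts if c == 0))
```

In runs of this sketch a few early papers typically accumulate far more citations than the rest, which is the heavy-tailed pattern the abstract describes.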


Identification Of Biomarkers For The Overall Survival Of Ovarian Cancer Patients, Kristi Mai May 2016

Graduate Theses and Dissertations

Rapid advances in sequencing technology have led to simultaneous genome-wide analysis of genetic and epigenetic features, making it possible to understand the biological mechanisms underlying cancer initiation and progression. However, identifying important prognostic features poses a great challenge for both statistical modeling and computing. In this thesis, a network-based approach is applied to The Cancer Genome Atlas (TCGA) ovarian cancer data to identify important genes related to the overall survival of ovarian cancer patients. In the first step, a stepwise correlation-based selector is used to reduce the dimensionality of the TCGA data by filtering out a large number of …


Probabilistic Graphical Modeling On Big Data, Ming-Hua Chung Dec 2015

Graduate Theses and Dissertations

The rise of Big Data in recent years brings many challenges to modern statistical analysis and modeling. In toxicogenomics, the advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done at a rather small scale. As a result, the rich information in a database has not been fully utilized, particularly the information embedded in the interactions between data points, which is largely ignored and buried. For the past …


Calorimetry And Body Composition Research In Broilers And Broiler Breeders, Justina Victoria Caldas Cueva Dec 2015

Graduate Theses and Dissertations

Indirect calorimetry to study heat production (HP) and dual energy X-ray absorptiometry (DEXA) for body composition (BC) are powerful techniques to study the dynamics of energy and protein utilization in poultry. The first two chapters present the BC (dry matter, lean, protein, fat, bone mineral, calcium, and phosphorus) of modern broilers from 1 – 60 d of age analyzed by chemical analysis and DEXA. DEXA has been validated for precision and standardized for position, and equations and validations have been developed for chickens under two different feeding levels. These equations are unique to the machine and software in use. Research in broilers …


Analytical Comparison Of Contrasting Approaches To Estimating Competing Risks Models, Brian Stephen Rickard May 2015

Graduate Theses and Dissertations

Survival analysis is a commonly used tool in many fields but has seen little use in education research despite the many research questions for which it is well suited. Researchers often use logistic regression instead; however, this omits useful information. In research on retention and graduation, for example, the timing of the event is an important piece of information omitted when using logistic regression. A simulation study was conducted to evaluate four methods of analyzing competing risks survival data: Cox proportional hazards regression, Weibull regression, Fine and Gray's method, and Cox proportional hazards regression with frailty. College student …


Online Detection Of Outliers And Structural Breaks Using Sequential Monte Carlo Methods, Richard Wanjohi Dec 2014

Graduate Theses and Dissertations

Outliers and structural breaks occur quite frequently in time series data. Whereas outliers often contain valuable information about the process under study, they are known to have a serious negative impact on statistical data analysis. The most obvious effects are model misspecification and biased parameter estimation, which result in wrong conclusions and inaccurate predictions. Structural time series consist of underlying features such as level, slope, cycles, or seasonal components. Structural breaks are permanent disruptions of one or more of these components and might signal serious changes in the observed process.

Detecting outliers and estimating the location of structural breaks …


Poisson Distributed Individuals Control Charts With Optimal Limits, Negin Enayaty Ahangar May 2014

Graduate Theses and Dissertations

The conventional method used in attribute control charts is the Shewhart three-sigma limits. The implicit assumption of the Normal distribution in this approach is not appropriate for skewed distributions such as the Poisson, Geometric, and Negative Binomial. Normal approximations perform poorly in the tail areas of these distributions. In this research, a type of attribute control chart is introduced to monitor processes that produce count data. The economic objective of this chart is to minimize the cost of its errors, which is determined by the designer. This objective is a linear function of type I and II errors. …


An Economic Alternative To The C Chart, Ryan William Black Dec 2012

Graduate Theses and Dissertations

Because the probability of Type I error is not evenly distributed beyond the upper and lower three-sigma limits, the c chart is theoretically inappropriate as a monitor of Poisson-distributed phenomena. Furthermore, the normal approximation to the Poisson is of little use when c is small. These practical and theoretical concerns should motivate the computation of true error rates associated with individuals control assuming the Poisson distribution. An economic alternative to the c chart is described as a statistical model of upward shift from c0 to c1, and the two charts are compared in theory. For a range of c chart …
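The uneven split of Type I error beyond three-sigma limits can be checked with a short calculation. The sketch below is illustrative and is not taken from the thesis: the function names are hypothetical, and it assumes that a point signals only when it falls strictly outside the limits c ± 3√c, with the lower limit floored at zero.

```python
import math


def poisson_pmf(k, c):
    """P(X = k) for X ~ Poisson(c), computed on the log scale for stability."""
    return math.exp(-c + k * math.log(c) - math.lgamma(k + 1))


def poisson_cdf(k, c):
    """P(X <= k) for X ~ Poisson(c); zero when k is negative."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, c) for i in range(k + 1))


def c_chart_tail_probs(c):
    """Exact in-control signal probabilities for three-sigma c chart limits.

    Assumes a signal occurs only for counts strictly below the lower limit
    or strictly above the upper limit.
    Returns (lcl, ucl, lower_tail, upper_tail).
    """
    sigma = math.sqrt(c)
    lcl = max(0.0, c - 3.0 * sigma)
    ucl = c + 3.0 * sigma
    lower_tail = poisson_cdf(math.ceil(lcl) - 1, c)       # P(X < lcl)
    upper_tail = 1.0 - poisson_cdf(math.floor(ucl), c)    # P(X > ucl)
    return lcl, ucl, lower_tail, upper_tail


# The normal model would put about 0.00135 in each tail.
for c in (1.0, 4.0, 9.0, 25.0):
    lcl, ucl, lo, hi = c_chart_tail_probs(c)
    print(f"c={c:5.1f}  LCL={lcl:6.2f}  UCL={ucl:6.2f}  "
          f"P(below)={lo:.5f}  P(above)={hi:.5f}")
```

For small c the lower limit collapses to zero, so under this convention the whole false-alarm probability sits in the upper tail.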


Investigating The Sensitivity Of Goodness-Of-Fit Indices To Detect Measurement Invariance In The Bifactor Model, Jam Khojasteh Dec 2012

Graduate Theses and Dissertations

A Monte Carlo simulation study was conducted to evaluate the sensitivity of five commonly used goodness-of-fit (GOF) indices for detecting metric invariance in the bifactor model. The fit indices that performed best in terms of power were Gamma and Mc. In addition, Gamma, Mc, CFI, and RMSEA all held Type I error to a minimum. However, only Gamma and CFI are recommended for use with the bifactor model because the other GOF indices have cutoff values that are too large. For Gamma and CFI, values of -.026 to -.045 and -.004 to -.009, respectively, indicate a lack of metric …