Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 31 - 60 of 70
Full-Text Articles in Physical Sciences and Mathematics
Assessing Differential Item Functioning In The Perceived Stress Scale, Nana Amma Berko Asamoah
Graduate Theses and Dissertations
When an item on a test functions differently for subgroups of respondents with respect to an exogenous variable (or covariate) after conditioning on the latent variable of interest, the item is said to exhibit Differential Item Functioning (DIF). The 10-item Perceived Stress Scale (PSS10) is administered to respondents via MTurk to quantify “perceived stress” and identify if items on the scale function differently for specific subgroups defined by age, sex, race, marital status, number of children, employment status and social media usage.
The purpose of this study was to compare traditional DIF detection approaches (Mantel-Haenszel, logistic regression, likelihood ratio test …
Learning Networks With Categorical Data Using Distance Correlation, And A Novel Graph-Based Multivariate Test, Jian Tinker
Graduate Theses and Dissertations
We study the use of distance correlation for statistical inference on categorical data, especially the induction of probability networks. Szekely et al. first defined distance correlation for continuous variables in [42], and Zhang translated the concept into the categorical setting in [57] by defining dCor(X, Y) for categorical variables X = (x_1, ..., x_I) and Y = (y_1, ..., y_J), where P(X = x_i) = π_i and P(Y = y_j) = π_j, with the formula [Please open the document]
Part I of the dissertation covers the background we need to understand this formula, and prepares us to analyze the properties and performance of its applications.
Part II then presents the main results of …
Structural Analysis Of The Multifunctional Spoiie Regulatory Protein Of Clostridioides Difficile., Blythe Emily Bunkers
Graduate Theses and Dissertations
Clostridioides (formerly Clostridium) difficile is a medically relevant pathogen pertinent to infectious disease research. C. difficile is distinctly known for its ability to produce two toxins, enterotoxin A and cytotoxin B, and its propensity to colonize the mammalian gastrointestinal tract. It is known that metabolism is tightly correlated with sporulation in endospore producers such as C. difficile, but an interesting and novel regulatory relationship found by the Ivey lab has yet to be understood. The relationship explored in this study is between the sporulation factor SpoIIE and an ABC peptide transporter, app, whose expression SpoIIE represses. In this study, two …
Measuring Sexual Excitation And Sexual Inhibition In A Dutch-Speaking Sample, Malachi Willis
Graduate Theses and Dissertations
Background: Individual differences in sexual excitation and sexual inhibition are important predictors of sexual functioning. Psychometric instruments for these aspects of sexual response were originally developed separately for men (Sexual Inhibition/Sexual Excitation Scales [SIS/SES]) and women (Sexual Excitation/Sexual Inhibition Inventory for Women [SESII-W]). These measures were then adapted to function similarly in samples comprising both men and women (Sexual Inhibition/Sexual Excitation Scales-Short Form [SIS/SES-SF] and Sexual Excitation/Sexual Inhibition Inventory for Women and Men [SESII-W/M], respectively). No published study to our knowledge has administered the SIS/SES and SESII-W/M questionnaires to a sample of both women and men. In the present …
Detecting Differentially Co-Expressed Gene Modules Via The Edge-Count Test, Anne Gratius Lin
Graduate Theses and Dissertations
Background
Gene expression profiling by microarray has been used to uncover molecular variations in many different diseases. Complementary to conventional differential expression analysis, differential co-expression analysis can identify gene markers at both the systematic and the granular level. There are three aspects of differential co-expression network analysis: global topological comparison of networks, identification of differentially co-expressed clusters, and identification of differentially co-expressed genes and gene pairs. To date, most of the available methods still rely on Pearson’s correlation coefficient despite its insensitivity to nonlinear relationships.
Results
Here we present an approach that is robust to nonlinearity by using the edge-count test for differential co-expression analysis. …
Effect Of Cross-Validation On The Output Of Multiple Testing Procedures, Josh Dallas Price
Graduate Theses and Dissertations
High dimensional data with sparsity is routinely observed in many scientific disciplines. Filtering out the signals embedded in noise is a canonical problem in such situations requiring multiple testing. The Benjamini-Hochberg procedure using False Discovery Rate control is the gold standard in large scale multiple testing. In Majumder et al. (2009) an internally cross-validated form of the procedure is used to avoid a costly replicate study and the complications that arise from population selection in such studies (i.e. extraneous variables). I implement this procedure and run extensive simulation studies under increasing levels of dependence among parameters and different data generating …
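As a point of reference for the abstract above, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. It is not the internally cross-validated variant of Majumder et al. (2009) that the thesis implements; the function name `benjamini_hochberg` and the default level `alpha=0.05` are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up rule (illustrative sketch, not the thesis's variant).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m (0 if none).
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its own threshold, provided a larger-ranked p-value falls under its threshold.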
Spatio-Temporal Analysis Of Tree Ring Chronology And Precipitation, Ruizhe Yin
Graduate Theses and Dissertations
Tree ring chronology data is known to reflect regional climate due to the strong impact of rainfall and temperature. Therefore, tree ring data can be used to reconstruct historical climate in order to understand how climate changed in the past and to make predictions about the future behavior of the climate. For simplicity, this research only considers the influence of precipitation on tree ring growth within the New England area. A total of 94 measurement sites are used to record tree ring width over 881 years and corresponding precipitation data are given at some locations for 121 years. We developed a …
Spatio-Temporal Prediction Of Arkansas Gubernatorial Election, Michael Harris
Graduate Theses and Dissertations
Our goal is to create spatio-temporal models for predicting future gubernatorial elections. As a concrete example of how well our models work, we use past data to predict the 2018 Arkansas gubernatorial election and use the existing 2018 election data to check our models' predictive accuracy. Gubernatorial election data was collected from the Arkansas Secretary of State website, while related covariate data was collected from the website for the Federal Reserve Bank of St. Louis. The data we collect is at the county level. For predictive purposes we fit multiple models to the data using Markov chain Monte Carlo and …
Probabilistic Models For Order-Picking Operations With Multiple In-The-Aisle Pick Positions, Jingming Liu
Graduate Theses and Dissertations
The development of probability density functions (pdfs) for travel time of a narrow aisle lift truck (NALT) and an automated storage and retrieval (AS/R) machine is the focus of the dissertation. The multiple in-the-aisle pick positions (MIAPP) order picking system can be modeled as an M/G/1 queueing problem in which storage and retrieval requests are the customers and the vehicle (NALT or AS/R machine) is the server. Service time is the sum of travel time and the deterministic time to pick up and deposit a pallet (TPD).
Our first contribution is the development of travel time pdfs for retrieval operations …
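To make the M/G/1 framing in the abstract above concrete, the following sketch computes the mean waiting time in queue from the Pollaczek-Khinchine formula, taking service time as travel time plus the deterministic pick-up and deposit time TPD. The function and parameter names are hypothetical, and the mean and variance of travel time are assumed to be supplied by the caller; the dissertation derives full travel time pdfs, which this sketch does not attempt.

```python
def mg1_mean_wait(arrival_rate, mean_travel, var_travel, t_pd):
    """Mean time a request waits in queue for an M/G/1 server (illustrative).

    Service time S = travel time + t_pd, where t_pd is deterministic,
    so E[S] = mean_travel + t_pd and Var(S) = var_travel.
    """
    mean_service = mean_travel + t_pd
    # E[S^2] = Var(S) + E[S]^2; the constant t_pd adds no variance.
    second_moment = var_travel + mean_service ** 2
    # Server utilization; the queue is stable only when rho < 1.
    rho = arrival_rate * mean_service
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    # Pollaczek-Khinchine: W_q = lambda * E[S^2] / (2 * (1 - rho)).
    return arrival_rate * second_moment / (2 * (1 - rho))
```

All inputs must share one time unit, for example requests per minute for `arrival_rate` and minutes for the travel and handling times.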
A Hidden Markov Factor Analysis Framework For Seizure Detection In Epilepsy Patients, Mahboubeh Madadi
Graduate Theses and Dissertations
Approximately 1% of the world population suffers from epilepsy. Continuous long-term electroencephalographic (EEG) monitoring is the gold standard for recording epileptic seizures and assisting in the diagnosis and treatment of patients with epilepsy. Detection of seizures from the recorded EEG is a laborious, time-consuming and expensive task. In this study, we propose an automated seizure detection framework to assist electroencephalographers and physicians with identification of seizures in recorded EEG signals. In addition, an automated seizure detection algorithm can be used for treatment through automatic intervention during the seizure activity and on-time triggering of the injection of a radiotracer to …
Advanced Statistics In Arkansas Sports Reporting, Andrew Lee Epperson
Graduate Theses and Dissertations
This study seeks to analyze how Arkansas’ sports journalists are adapting to the recent surge in available advanced statistics that are being used by certain national news organizations. Using qualitative research that includes in-depth interviews with a number of individuals in the print, broadcast, and athletics sides of sports coverage, we discover how journalists and coaches use these next-generation analytics, what they fundamentally mean for the evolution of each respective path, and why so few Arkansas reporters and writers use them at the time of this paper’s defense. We see how budgets and deadlines restrict the use of these …
Comparing Elo, Glicko, Irt, And Bayesian Irt Statistical Models For Educational And Gaming Data, Breanna Morrison
Graduate Theses and Dissertations
Statistical models used for estimating skill or ability levels often vary by field; however, their underlying mathematical models can be very similar. Differences in the underlying models can be due to the need to accommodate data with different underlying formats and structure. As the models from varying fields increase in complexity, their applicability to different types of data may increase. Models that are applied to educational or psychological data have advanced to accommodate a wide range of data formats, including increased estimation accuracy with sparsely populated data matrices. Conversely, the field of online …
A Bayesian Framework For Estimating Seismic Wave Arrival Time, Hua Zhong
Graduate Theses and Dissertations
Because earthquakes have a large impact on human society, statistical methods for better studying earthquakes are required. One characteristic of earthquakes is the arrival time of seismic waves at a seismic signal sensor. Once we can estimate the earthquake arrival time accurately, the earthquake location can be triangulated, and assistance can be sent to that area correctly. This study presents a Bayesian framework to predict the arrival time of seismic waves with associated uncertainty. We use a change point framework to model the different conditions before and after the seismic wave arrives. To evaluate the performance of the model, we …
Sequential Inference For Hidden Markov Models, Michael Ellis
Graduate Theses and Dissertations
In many applications data are collected sequentially in time with very short time intervals between observations. If one is interested in using new observations as they arrive in time then non-sequential Bayesian inference methods, such as Markov Chain Monte Carlo (MCMC) sampling, can be too slow. Increasingly, state space models are being used to model nonlinear and non-Gaussian systems. The structure of state space models allows for sequential Bayesian inference so that an approximation to the posterior distribution of interest can be updated as new observations arrive. In special cases, the exact posterior distribution can be updated through conjugate Bayesian …
Budget-Constrained Regression Model Selection Using Mixed Integer Nonlinear Programming, Jingying Zhang
Graduate Theses and Dissertations
Regression analysis fits predictive models to data on a response variable and corresponding values for a set of explanatory variables. Often data on the explanatory variables come at a cost from commercial databases, so the available budget may limit which ones are used in the final model.
In this dissertation, two budget-constrained regression models are proposed for continuous and categorical variables respectively using Mixed Integer Nonlinear Programming (MINLP) to choose the explanatory variables to be included in solutions. First, we propose a budget-constrained linear regression model for continuous response variables. Properties such as solvability and global optimality of the proposed …
Quantitative Microbial Risk Assessment For Parts, Ground, And Msc Poultry Product Including Intervention Analysis And Exploration Of Enterobacteriaceae As An Indicator Organism In Poultry Processing, Leigh Ann Parette
Graduate Theses and Dissertations
Samples collected at five different large bird poultry processing facilities over a period of 7 months from prescald to post debone locations were enumerated for Enterobacteriaceae, Salmonella spp., and Campylobacter spp. and the results were used to create Quantitative Microbial Risk Analyses (QMRA) models for parts, ground, and mechanically separated chicken (MSC) products. Sensitivity analyses indicated the points in the process at which reductions would be most advantageous to the endpoint and simulation models were run to test reductions required to meet the current USDA performance standards.
These data were analyzed to determine the reductions from one node (location) to …
A Generative Statistical Approach For Data Classification In A Biologically Inspired Design Tool, Marvin Manuel Arroyo Rujano
Graduate Theses and Dissertations
The objective of the research this thesis describes is to find a way to classify text-based descriptions of biological adaptation to support biologically inspired design. Biologically inspired design is a fairly new field with ongoing research. There are different tools to assist designers and biologists in bio-inspired design. Some of the most common are BioTRIZ and AskNature. In recent years, more tools have been proposed to aid and make research in the field easier, for example, the Biologically Inspired Adaptive System Design (BIASD) tool. This tool was designed with the goal of helping designers in early design stages generate more …
Spatio-Temporal Reconstruction Of Remote Sensing Observations, Kamrul Khan
Graduate Theses and Dissertations
The USDA Forest Service aims to use satellite imagery for monitoring and predicting changes in forest conditions over time within the country. We specifically focus on a 230,400-hectare region in north-central Wisconsin between 2003 and 2012. The auxiliary data collected from the satellite imagery of this region are relatively dense in space and time and can be used to efficiently predict how the forest condition changed over that decade. However, these records have a significant proportion of missing values due to weather conditions and system failures. To fill in these missing values, we build spatio-temporal models based on …
Comparison Of Correlation, Partial Correlation, And Conditional Mutual Information For Interaction Effects Screening In Generalized Linear Models, Ji Li
Graduate Theses and Dissertations
Numerous screening techniques have been developed in recent years for genome-wide association studies (GWASs) (Moore et al., 2010). In this thesis, a novel model-free screening method was developed and validated by an extensive simulation study. Many screening methods were mainly focused on main effects, while very few studies considered the models containing both main effects and interaction effects. In this work, the interaction effects were fully considered and three different methods (Pearson’s Correlation Coefficient, Partial Correlation, and Conditional Mutual Information) were tested and their prediction accuracies were compared.
Pearson’s Correlation Coefficient method, which is a direct interaction screening (DIS) procedure, …
Adapting To Sparsity And Heavy Tailed Data, Mohamed Abdelkader Abba
Graduate Theses and Dissertations
The Lasso and the Horseshoe, gold standards in the frequentist and Bayesian paradigms, critically depend on learning the error variance. This causes a lack of scale invariance and adaptability to heavy-tailed data. The √Lasso [Belloni et al., 2011] attempts to correct this by using the ℓ1 norm on both the likelihood and the penalty for the objective function. In contrast, there are essentially no methods for uncertainty quantification or automatic parameter tuning via a formal Bayesian treatment of an unknown error distribution. On the other hand, Bayesian shrinkage priors lacking a local shrinkage term fail to adapt to the large …
Hierarchical Bayesian Regression With Application In Spatial Modeling And Outlier Detection, Ghadeer Mahdi
Graduate Theses and Dissertations
This dissertation makes two important contributions to the development of Bayesian hierarchical models. The first contribution is focused on spatial modeling. Spatial data observed on a group of areal units is common in scientific applications. The usual hierarchical approach for modeling this kind of dataset is to introduce a spatial random effect with an autoregressive prior. However, the usual Markov chain Monte Carlo scheme for this hierarchical framework requires the spatial effects to be sampled from their full conditional posteriors one-by-one, resulting in poor mixing. More importantly, it makes the model computationally inefficient for datasets with a large number of units. …
Bayesian Model For Detection Of Outliers In Linear Regression With Application To Longitudinal Data, Zahraa Al-Sharea
Graduate Theses and Dissertations
Outlier detection is one of the most important challenges with many present-day applications. Outliers can occur due to uncertainty in data generating mechanisms or due to an error in data recording/processing. Outliers can drastically change the study's results and make predictions less reliable. Detecting outliers in longitudinal studies is quite challenging because this kind of study works with observations that change over time. Therefore, the same subject can produce an outlier at one point in time and regular observations at all other time points. Bayesian hierarchical modeling assigns parameters that can quantify whether each observation is an outlier …
Identifying Three-Way Gene Interactions From Microarray Data Using Kolmogorov-Smirnov And Cross-Match Tests, Shubhashree Khadka
Graduate Theses and Dissertations
The human gene network is much more complex than pairwise interactions among genes. Zhang et al. [6] extracted microarray data from the International Genomics Consortium (IGC) and detected three-way gene interactions using Fisher’s z-transformation test. Three-way gene interactions come closer than pairwise correlations to representing complex gene structures. Additionally, they are more tractable than interactions among four or more genes. In this paper, we simulate different models where Fisher’s test might not be as effective. Zhang et al.’s approach utilized Pearson’s correlation coefficients and involved detection of linear interactions only. Since gene …
Genomic And Physiological Approaches To Improve Drought Tolerance In Soybean, Avjinder Kaler
Graduate Theses and Dissertations
Drought stress is a major global constraint for crop production, and improving crop tolerance to drought is of critical importance. Direct selection of drought tolerance among genotypes for yield is limited because of low heritability, polygenic control, epistasis effects, and genotype by environment interactions. Crop physiology can play a major role for improving drought tolerance through the identification of traits associated with drought tolerance that can be used as indirect selection criteria in a breeding program. Carbon isotope ratio (δ13C, associated with water use efficiency), oxygen isotope ratio (δ18O, associated with transpiration), canopy temperature (CT), canopy wilting, and canopy coverage …
A Linear-Linear Growth Model With Individual Change Point And Its Application To Ecls-K Data, Ping Zhang
Graduate Theses and Dissertations
The latent growth curve model with piecewise functions is a useful analytic tool for investigating a growth trajectory consisting of distinct phases of development in observed variables. An interesting feature of the growth trajectory is the time point at which the trajectory changes from one phase to another. In this thesis, we propose a simple computational pipeline to locate the change point under the linear-linear piecewise model and apply it to the longitudinal study of reading and math ability in early childhood (from kindergarten to eighth grade). In the first step, we conduct hypothesis testing to filter out the …
A Bayesian Variable Selection Method With Applications To Spatial Data, Xiahan Tang
Graduate Theses and Dissertations
This thesis first describes the general idea behind Bayesian inference, various sampling methods based on Bayes' theorem, and many examples. Then a Bayes approach to model selection, called Stochastic Search Variable Selection (SSVS), is discussed. It was originally proposed by George and McCulloch (1993). In a normal regression model where the number of covariates is large, only a small subset tends to be significant most of the time. This Bayes procedure specifies a mixture prior for each of the unknown regression coefficients; the mixture prior was originally proposed by Geweke (1996). This mixture prior will be updated as data becomes …
Monte Carlo Methods In Bayesian Inference: Theory, Methods And Applications, Huarui Zhang
Graduate Theses and Dissertations
Monte Carlo methods are becoming more and more popular in statistics due to the fast development of efficient computing technologies. One of the major beneficiaries of this development is the field of Bayesian inference. The aim of this thesis is two-fold: (i) to explain the theory justifying the validity of the simulation-based schemes in a Bayesian setting (why they should work) and (ii) to apply them in several different types of data analysis that a statistician routinely encounters. In Chapter 1, I introduce key concepts in Bayesian statistics. Then we discuss Monte Carlo simulation methods in detail. Our …
Analysis Of Break-Points In Financial Time Series, Jean Remy Habimana
Graduate Theses and Dissertations
A time series is a set of random values collected at equal time intervals. This randomness makes such series hard to predict because the structure of the series may change at any time. As discussed in previous research, the structure of a time series may change at any time due to a change in the mean and/or variance of the series. Consequently, it is wise not to assume that these series are stationary. This paper discusses a method of analyzing time series by treating the entire series as non-stationary, assuming there is random change in unconditional …
Statistical Modeling Of The Temporal Dynamics In A Large Scale-Citation Network, Luis Javier Ek Jr.
Graduate Theses and Dissertations
Citation networks of papers are vast networks that grow over time. The manner in which a citation network grows is not entirely random but follows a preferential attachment relationship: highly cited papers are more likely to be cited by newly published papers. The result is a network whose degree distribution follows a power law. This growth of the citation network of papers will be modeled with a negative binomial regression coupled with a logistic growth and/or Cauchy distribution curve. Then a Barabasi-Albert model, based on the negative binomial models, and a combination of the Dirichlet distribution and multinomial will be …
Risk Estimation Toward A Natural History Model For Low Grade Glioma Patients, Anh Thi Hoang Pham
Graduate Theses and Dissertations
Glioma is a common type of primary brain tumor that represents 28% of all brain tumors and 80% of malignant tumors. According to a recent study by the Centers for Disease Control and Prevention (CDC), gliomas account for 53%, 35% and 29% of all brain tumors (68%, 74% and 81% of malignant brain tumors) among children (aged 0-14), teenagers (aged 15-19) and young adults, respectively. Gliomas are often diagnosed through radiological imaging and histopathology. There are two main groups of gliomas following the World Health Organization’s classification: low grade gliomas (LGG), or grade I and II gliomas; and high grade gliomas …