Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations in Statistics and Probability, 2016

Articles 1 - 30 of 217

Full-Text Articles in Physical Sciences and Mathematics

What Affects Parents’ Choice Of Milk? An Application Of Bayesian Model Averaging, Yingzhe Cheng Dec 2016

Mathematics & Statistics ETDs

This study identifies the factors that influence parents’ choice of milk for their children, using data from a unique survey administered in 2013 in Hunan province, China. The survey presented two brands of milk, which differ in price and in the safety claims made by the producer. Data were collected on parents’ choice between the two brands, demographics, attitudes toward food safety, and food-related behaviors. Stepwise model selection and Bayesian model averaging (BMA) are used to search for influential factors. The two approaches consistently select the same factors suggested by an economic theoretical model, including price …
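
For readers skimming this listing who are unfamiliar with Bayesian model averaging, the sketch below shows the basic idea on synthetic data, using BIC-based approximations to posterior model probabilities. It is not the survey analysis from the thesis; the predictor names and data are hypothetical.

    # BIC-approximate Bayesian model averaging for a binary choice
    # (hypothetical data; not the Hunan survey analyzed in the thesis).
    import itertools
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    X = rng.normal(size=(n, 3))                 # hypothetical predictors
    names = ["price", "income", "safety_concern"]
    logit_p = 1.5 * X[:, 0] - 1.0 * X[:, 2]
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit_p)))

    # Fit every non-empty subset of predictors and record its BIC.
    fits = []
    for k in range(1, len(names) + 1):
        for subset in itertools.combinations(range(len(names)), k):
            design = sm.add_constant(X[:, subset])
            res = sm.Logit(y, design).fit(disp=0)
            fits.append((subset, res.bic))

    # Posterior model probabilities approximated by exp(-BIC/2), renormalized.
    bics = np.array([bic for _, bic in fits])
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()

    # Posterior inclusion probability of each predictor.
    for j, name in enumerate(names):
        pip = sum(w for (subset, _), w in zip(fits, weights) if j in subset)
        print(f"P(include {name} | data) ~ {pip:.2f}")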


Mechanistic Plug-And-Play Models For Understanding The Impact Of Control And Climate On Seasonal Dengue Dynamics In Iquitos, Peru, Nathan Levick Dec 2016

Mathematics & Statistics ETDs

Dengue virus is a mosquito-borne, multi-serotype disease whose dynamics are not precisely understood despite half of the world’s human population being at risk of infection. A recent dataset of dengue case reports from an isolated Amazonian city, Iquitos, Peru, provides a unique opportunity to assess dengue dynamics in a simplified setting. Ten years of clinical surveillance data reveal a specific pattern: two novel serotypes, in turn, invaded and exclusively dominated incidence over several seasonal cycles, despite limited intra-annual variation in climate conditions. Together with mechanistic mathematical models, these data can provide an improved understanding of the nonlinear interactions between …


Economic Opportunity And Young Adult Mortality: Variations By Race/Ethnicity And Gender, Jocelyn Mineo Dec 2016

Honors Projects

This study examines the relationship between economic opportunity and adolescent and young adult mortality in the United States. In addition, it explores other variables, such as social support and rurality, and their link to young adult mortality rates. First, we examine the link between economic opportunity and all-cause mortality rates for youth ages 15 to 34 in the United States. Given the increasing racial and ethnic diversity of America’s youth, we pay particular attention to racial/ethnic differences. We also examine differences in mortality by gender.


A Framework For The Statistical Analysis Of Mass Spectrometry Imaging Experiments, Kyle Bemis Dec 2016

Open Access Dissertations

Mass spectrometry (MS) imaging is a powerful investigation technique for a wide range of biological applications, such as molecular histology of tissue, whole-body sections, and bacterial films, and biomedical applications such as cancer diagnosis. MS imaging visualizes the spatial distribution of molecular ions in a sample by repeatedly collecting mass spectra across its surface, resulting in complex, high-dimensional imaging datasets. Two of the primary goals of statistical analysis of MS imaging experiments are classification (for supervised experiments), i.e. assigning pixels to pre-defined classes based on their spectral profiles, and segmentation (for unsupervised experiments), i.e. assigning pixels to newly …
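
As a side note for this entry, unsupervised segmentation of MS imaging data amounts to clustering pixels by their spectral profiles. The toy sketch below does this with plain k-means on simulated spectra; it ignores spatial structure and is not the statistical framework developed in the dissertation.

    # Toy segmentation of an MS image: cluster pixels by spectral profile
    # (simulated spectra; ignores the spatial structure a real method would use).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    nx, ny, n_mz = 20, 20, 100                     # 20x20 pixels, 100 m/z bins
    spectra = rng.poisson(5.0, size=(nx * ny, n_mz)).astype(float)
    spectra[: nx * ny // 2, :10] += 30.0           # one half enriched in a few ions

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(spectra)
    segmentation = labels.reshape(nx, ny)          # pixel-wise segment assignments
    print(segmentation)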


Group Transformation And Identification With Kernel Methods And Big Data Mixed Logistic Regression, Chao Pan Dec 2016

Open Access Dissertations

Exploratory Data Analysis (EDA) is a crucial step in the life cycle of data analysis. Exploring data with effective methods reveals the main characteristics of the data and provides guidance for model building. The goal of this thesis is to develop effective and efficient methods for data exploration in the regression setting.

First, we propose to use optimal group transformations as a general approach for exploring the relationship between predictor variables X and the response Y. This approach can be considered an automatic procedure to identify the best characteristic of P(Y|X) under which the relationship …


Characterizing The Effects Of Repetitive Head Trauma In Female Soccer Athletes For Prevention Of Mild Traumatic Brain Injury, Diana Otero Svaldi Dec 2016

Open Access Dissertations

As participation in women’s soccer continues to grow and female athletes’ careers continue to lengthen, prevention of mild traumatic brain injury (mTBI) has become a major concern, since the long-term risks associated with a history of mTBI are well documented. Among women’s sports, soccer exhibits the highest concussion rates, on par with those of men’s football at the collegiate level. Head impact monitoring technology has revealed that “concussive hits” occurring directly before symptomatic injury are not predictive of mTBI, suggesting that the cumulative effect of repetitive head impacts experienced by collision sport athletes should be …


Computational Environment For Modeling And Analysing Network Traffic Behaviour Using The Divide And Recombine Framework, Ashrith Barthur Dec 2016

Open Access Dissertations

There are two essential goals of this research. The first is to design and construct a computational environment for studying large and complex datasets in the cybersecurity domain. The second is to analyse the Spamhaus blacklist query dataset, which includes uncovering the properties of blacklisted hosts and understanding their nature over time.

The analytical environment enables deep analysis of very large and complex datasets by exploiting the divide and recombine framework. The capability to analyse data in depth enables one to go beyond just summary statistics in research. This deep analysis is …
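
As background on the divide and recombine framework mentioned above, the sketch below shows its simplest form: divide the data into subsets, apply the same analysis to each subset independently (an embarrassingly parallel step), and recombine the per-subset results. The data and the averaging recombination are illustrative only, not the Spamhaus analysis or the environment built in the thesis.

    # Minimal divide-and-recombine sketch: per-subset least squares, averaged.
    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 100_000, 3
    X = rng.normal(size=(n, p))
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

    def fit_ols(X_sub, y_sub):
        """Ordinary least squares on one subset (intercept included)."""
        design = np.c_[np.ones(len(X_sub)), X_sub]
        coef, *_ = np.linalg.lstsq(design, y_sub, rcond=None)
        return coef

    # Divide: split the rows into blocks.  Recombine: average the estimates.
    blocks = np.array_split(np.arange(n), 20)
    estimates = [fit_ols(X[idx], y[idx]) for idx in blocks]
    print("recombined coefficients:", np.mean(estimates, axis=0))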


Functional Regression Models In The Framework Of Reproducing Kernel Hilbert Space, Simeng Qu Dec 2016

Open Access Dissertations

The aim of this thesis is to systematically investigate some functional regression models for accurately quantifying the effect of functional predictors. In particular, three functional models are studied: functional linear regression model, functional Cox model, and function-on-scalar model. Both theoretical properties and numerical algorithms are studied in depth. The new models find broad applications in many areas.

For the functional linear regression model, the focus is on testing the nullity of the slope function, and a generalized likelihood ratio test based on an easily implementable data-driven estimate is proposed. The quality of the test is measured by the minimal distance between …
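
For orientation, the functional linear regression model referred to here is commonly written (in standard notation, not taken from the thesis) as

\[ Y_i = \alpha + \int_{\mathcal{T}} X_i(t)\,\beta(t)\,dt + \varepsilon_i, \qquad i = 1, \dots, n, \]

and testing the nullity of the slope function means testing \(H_0\colon \beta(t) = 0\) for all \(t \in \mathcal{T}\).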


Divide And Recombine For Large Complex Data: Nonparametric-Regression Modelling Of Spatial And Seasonal-Temporal Time Series, Xiaosu Tong Dec 2016

Open Access Dissertations

In the first chapter of this dissertation, I briefly introduce one type of nonparametric regression method, namely local polynomial regression, followed by an emphasis on one specific application of loess to time series decomposition, called Seasonal Trend Loess (STL). The chapter closes with an introduction to the D&R (Divide and Recombine) statistical framework, in which the data are divided into subsets and the same statistical analysis method is applied to each subset. This is an embarrassingly parallel procedure, since there is no communication between the subsets. The analysis results for the subsets are then combined to form the final analysis outcome for the …
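
For readers who have not seen STL, the sketch below decomposes a simulated monthly series into trend, seasonal, and remainder components using the STL implementation in statsmodels. It is only a stand-in illustration, not the divide-and-recombine version developed in the dissertation.

    # STL decomposition of a simulated monthly series (illustrative only).
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.seasonal import STL

    rng = np.random.default_rng(3)
    t = np.arange(240)                                        # 20 years, monthly
    y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=t.size)
    series = pd.Series(y, index=pd.date_range("2000-01-01", periods=t.size, freq="MS"))

    result = STL(series, period=12).fit()                     # trend + seasonal + remainder
    print(result.trend.iloc[:3])
    print(result.seasonal.iloc[:3])
    print(result.resid.iloc[:3])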


Development And Validation Of The Statistics Assessment Of Graduate Students, Dammika Lakmal Walpitage Dec 2016

Doctoral Dissertations

This study developed the Statistics Assessment of Graduate Students (SAGS) instrument and established its preliminary item characteristics, reliability, and validity evidence. Even though there are a limited number of assessments available for measuring different aspects of statistical cognition, these previously available assessments have numerous limitations. The SAGS instrument was developed using a Rasch modeling approach to create a new measure of the statistical research methodology knowledge of graduate students in education and other behavioral and social sciences. Thirty-five multiple-choice questions were written with stems representing applied research situations and response options distinguishing between appropriate uses of various statistical tests or procedures. A focus …
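
For context, the dichotomous Rasch model that underlies this kind of instrument development specifies the probability that examinee \(j\) answers item \(i\) correctly as (standard notation, not taken from the dissertation)

\[ P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}, \]

where \(\theta_j\) is the examinee's ability and \(b_i\) is the item difficulty.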


On The Quantification Of Complexity And Diversity From Phenotypes To Ecosystems, Zachary Harrison Marion Dec 2016

Doctoral Dissertations

A cornerstone of ecology and evolution is comparing and explaining the complexity of natural systems, be they genomes, phenotypes, communities, or entire ecosystems. These comparisons and explanations then beget questions about how complexity should be quantified in theory and estimated in practice. Here I embrace diversity partitioning using Hill or effective numbers to move forward the empirical side of the field regarding the quantification of biological complexity.

First, at the level of phenotypes, I show that traditional multivariate analyses ignore individual complexity and provide relatively abstract representations of variation among individuals. I then suggest using well-known diversity indices from community ecology …
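
For reference, the Hill (effective) numbers mentioned above are conventionally defined for relative abundances \(p_1, \dots, p_S\) and diversity order \(q\) as (standard definition, not specific to this dissertation)

\[ {}^{q}D = \left( \sum_{i=1}^{S} p_i^{\,q} \right)^{1/(1-q)}, \qquad q \neq 1, \]

with the limiting case \({}^{1}D = \exp\!\left(-\sum_{i=1}^{S} p_i \ln p_i\right)\) as \(q \to 1\).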


Monte Carlo Methods In Bayesian Inference: Theory, Methods And Applications, Huarui Zhang Dec 2016

Graduate Theses and Dissertations

Monte Carlo methods are becoming more and more popular in statistics due to the rapid development of efficient computing technologies. One of the major beneficiaries of this development is the field of Bayesian inference. The aim of this thesis is two-fold: (i) to explain the theory justifying the validity of simulation-based schemes in a Bayesian setting (why they should work), and (ii) to apply them to several different types of data analysis that a statistician routinely encounters. In Chapter 1, I introduce key concepts in Bayesian statistics. Then we discuss Monte Carlo simulation methods in detail. Our …
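
As a concrete illustration of the kind of simulation-based scheme referred to here, the sketch below runs a random-walk Metropolis sampler for the posterior of a normal mean under a normal prior. The data, prior, and tuning values are hypothetical and are not drawn from the thesis.

    # Random-walk Metropolis for the posterior of a normal mean (illustrative).
    import numpy as np

    rng = np.random.default_rng(4)
    data = rng.normal(loc=2.0, scale=1.0, size=50)         # hypothetical observations

    def log_posterior(mu, prior_mean=0.0, prior_sd=10.0, sigma=1.0):
        log_lik = -0.5 * np.sum((data - mu) ** 2) / sigma ** 2
        log_prior = -0.5 * (mu - prior_mean) ** 2 / prior_sd ** 2
        return log_lik + log_prior

    samples, mu = [], 0.0
    for _ in range(10_000):
        proposal = mu + rng.normal(scale=0.5)              # random-walk proposal
        if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(mu):
            mu = proposal                                  # accept the move
        samples.append(mu)

    print("posterior mean estimate:", np.mean(samples[2_000:]))   # discard burn-in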


Tutorial For Using The Center For High Performance Computing At The University Of Utah And An Example Using Random Forest, Stephen Barton Dec 2016

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Random Forests are very memory-intensive machine learning algorithms, and most computers would fail to build models from datasets with millions of observations. Using the Center for High Performance Computing (CHPC) at the University of Utah and an airline on-time arrival dataset with 7 million observations from the U.S. Department of Transportation Bureau of Transportation Statistics, we built 316 models by adjusting the depth of the trees and the randomness of each forest, and compared the accuracy and time each took. Using this dataset, we discovered that substantial restrictions on the size of trees, observations allowed for each tree, and variables …
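
The kind of restriction described above can be expressed directly through a random-forest implementation's tuning parameters. The sketch below uses scikit-learn on synthetic data; the 7-million-row airline dataset, the 316-model grid, and the CHPC job setup are not reproduced here.

    # Restricting tree depth, per-tree sample size, and candidate features
    # in a random forest (synthetic data; illustrative settings only).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    forest = RandomForestClassifier(
        n_estimators=200,
        max_depth=10,           # cap the depth of each tree
        max_samples=0.1,        # fraction of observations drawn for each tree
        max_features="sqrt",    # random subset of features at each split
        n_jobs=-1,              # use all cores, e.g. on a cluster node
        random_state=0,
    ).fit(X_tr, y_tr)

    print("held-out accuracy:", forest.score(X_te, y_te))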


Estimating The Selection Gradient Of A Function-Valued Trait, Tyler John Baur Dec 2016

Theses and Dissertations

Kirkpatrick and Heckman initiated the study of function-valued traits in 1989. How to estimate the selection gradient of a function-valued trait is a major question asked by evolutionary biologists. In this dissertation, we give an explicit expansion of the selection gradient and construct estimators based on two different samples: one consisting of independent organisms (the independent case), and the other consisting of independent families of equally related organisms (the dependent case).

In the independent case we first construct and prove the joint consistency of sieve estimators of the mean and covariance functions of a Gaussian process, based on previous developments …


Density Estimation For Lifetime Distributions Under Semi-Parametric Random Censorship Models, Carsten Harlass Dec 2016

Theses and Dissertations

We derive product limit estimators of survival times and failure rates for randomly right censored data as the numerical solution of identifying Volterra integral equations by employing explicit and implicit Euler schemes. While the first approach results in some known estimators, the latter leads to a new general type of product limit estimator. Plugging in established methods to approximate the conditional probability of the censoring indicator given the observation, we introduce new semi-parametric and presmoothed Kaplan-Meier type estimators. In the case of the semi-parametric random censorship model, i.e. the latter probability belonging to some parametric family, we study the strong …
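
For orientation, the classical Kaplan-Meier product limit estimator that these generalizations extend is (standard form, not one of the new estimators derived in the thesis)

\[ \hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \left( 1 - \frac{d_i}{n_i} \right), \]

where \(d_i\) is the number of observed failures at the ordered event time \(t_{(i)}\) and \(n_i\) is the number of subjects still at risk just before \(t_{(i)}\).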


Analysis Of Break-Points In Financial Time Series, Jean Remy Habimana Dec 2016

Graduate Theses and Dissertations

A time series is a set of random values collected at equal time intervals; this randomness makes such series difficult to predict because their structure may change at any time. As discussed in previous research, these structural changes may be due to changes in the mean and/or variance of the series. Consequently, it is wise not to assume that such series are stationary. This paper discusses a method of analyzing time series by considering the entire series non-stationary, assuming there is random change in unconditional …


A Trader’s Guide To The Predictive Universe - A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz Dec 2016

Doctor of Business Administration Dissertations

At heart, every trader loves volatility; it is where return on investment comes from and what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices reliably, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to test this effectively by generating consistent profits using a trading model developed for this purpose, and to write a paper that anyone with …


A Multi-Indexed Logistic Model For Time Series, Xiang Liu Dec 2016

Electronic Theses and Dissertations

In this thesis, we explore a multi-indexed logistic regression (MILR) model, with particular emphasis on its application to time series. MILR includes simple logistic regression (SLR) as a special case, and the hope is that it will in some instances also produce significantly better results. To motivate the development of MILR, we consider its application to the analysis of both simulated sine wave data and stock data. We first examine the well-studied SLR and its application to the analysis of time series data. Using a more sophisticated representation of sequential data, we then detail the implementation of MILR. We compare …


Statistical Inference Of Genetic Forces Using A Poisson Random Field Model With Non-Constant Population Size, Jianbo Xu Dec 2016

UNLV Theses, Dissertations, Professional Papers, and Capstones

The fidelity of DNA sequence data makes it a perfect platform for quantitatively analyzing and interpreting evolutionary processes. By comparing intraspecific polymorphism with interspecific divergence in two sibling species, the well-established Poisson Random Field theory offers a statistical framework with which various genetic parameters, such as natural selection intensity, mutation rate, and speciation time, can be effectively estimated. A recently developed time-inhomogeneous PRF model has reinforced the original method by removing the assumption of a stationary site frequency, but it preserves the condition that the two sibling species share the same effective population size with their ancestral species. …


Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour Dec 2016

Theses and Dissertations

Survivability of cancer strongly depends on the stage of the cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, both jointly on all stages and separately for each stage. We call these the joint and stage-specific models, respectively. The results obtained for the cancers we investigated reveal that the best model for predicting the survivability of a cancer at one specific stage is the model that is specifically built for that …


Applications Of Credit Scoring Models, Mimi Mei Ling Chong Dec 2016

Electronic Thesis and Dissertation Repository

The application of credit scoring to consumer lending provides an automated, objective, and consistent tool that helps lenders make quick loan decisions. To apply for a loan, applicants must provide their attributes by filling out an application form. Certain attributes are then selected as inputs to a credit scoring model, which generates a credit score. The magnitude of this credit score has been shown to be related to the credit quality of the loan applicant. As such, it is used to determine whether the loan will be granted, as well as the amount of interest charged. Currently, little …
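
To make the idea of a credit score concrete, the sketch below fits a logistic regression on synthetic applicant data and maps its predicted odds to a scorecard-style score using a conventional points-to-double-the-odds scaling. All figures (base score, base odds, PDO) are illustrative assumptions, not values from the thesis.

    # Logistic-regression credit score with points-to-double-the-odds scaling
    # (synthetic data and illustrative scaling constants).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=5_000, n_features=6, random_state=0)  # y = 1: default
    model = LogisticRegression(max_iter=1_000).fit(X, y)

    def credit_score(x, base_score=600.0, base_odds=50.0, pdo=20.0):
        """Score an applicant; 'pdo' points double the good:bad odds."""
        p_default = model.predict_proba(x.reshape(1, -1))[0, 1]
        odds_good = (1.0 - p_default) / p_default
        factor = pdo / np.log(2.0)
        offset = base_score - factor * np.log(base_odds)
        return offset + factor * np.log(odds_good)

    print("example applicant score:", round(credit_score(X[0])))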


Joint Models For Spatial And Spatio-Temporal Point Processes, Alisha Albert-Green Nov 2016

Electronic Thesis and Dissertation Repository

In biostatistics and environmetrics, interest often centres around the development of models and methods for making inference on observed point patterns assumed to be generated by latent spatial or spatio-temporal processes. Such analyses, however, are challenging as these data are typically hierarchical with complex correlation structures. In instances where data are spatially aggregated by reporting region and rates are low, further complications may result from zero-inflation.

In this research, motivated by the analysis of spatio-temporal storm cell data, we generalize the Neyman-Scott parent-child process to account for hierarchical clustering. This is accomplished by allowing the parents to follow a log-Gaussian …


Searching Neuroimaging Biomarkers In Mental Disorders With Graph And Multimodal Fusion Analysis Of Functional Connectivity, Hao He Nov 2016

Electrical and Computer Engineering ETDs

Mental disorders such as schizophrenia (SZ), bipolar disorder (BD), and major depressive disorder (MDD) can cause severe symptoms and life disruption. They share some symptoms, which can pose a major clinical challenge to their differentiation. Objective biomarkers based on neuroimaging may help to improve diagnostic accuracy and facilitate optimal treatment for patients. Over the last decades, non-invasive in-vivo neuroimaging techniques such as magnetic resonance imaging (MRI) have been increasingly applied to measure structure and function in human brains. With functional MRI (fMRI) or structural MRI (sMRI), studies have identified neurophysiological deficits in patients’ brains from different perspectives. Functional connectivity (FC) analysis …


Applications Of Sampling And Estimation On Networks, Fabricio Murai Ferreira Nov 2016

Doctoral Dissertations

Networks or graphs are fundamental abstractions that allow us to study many important real systems, such as the Web, social networks and scientific collaboration. It is impossible to completely understand these systems and answer fundamental questions related to them without considering the way their components are connected, i.e., their topology. However, topology is not the only relevant aspect of networks. Nodes often have information associated with them, which can be regarded as node attributes or labels. An important problem is then how to characterize a network w.r.t. topology and node label distributions. Another important problem is how to design efficient …


Towards Deeper Understanding In Neuroimaging, Rex Devon Hjelm Nov 2016

Computer Science ETDs

Neuroimaging is a growing domain of research, with advances in machine learning having tremendous potential to expand understanding in neuroscience and improve public health. Deep neural networks have recently and rapidly achieved historic success in numerous domains, and as a consequence have completely redefined the landscape of automated learners, giving promise of significant advances in numerous domains of research. Despite recent advances and advantages over traditional machine learning methods, deep neural networks have yet to permeate significantly into neuroscience studies, particularly as a tool for discovery. This dissertation presents well-established and novel tools for unsupervised learning which aid in …


Intrinsic Functions For Securing Cmos Computation: Variability, Modeling And Noise Sensitivity, Xiaolin Xu Nov 2016

Doctoral Dissertations

A basic premise behind modern secure computation is the demand for lightweight cryptographic primitives, such as identifiers or key generators. From a circuit perspective, the development of cryptographic modules has also been driven by the aggressive scaling of complementary metal-oxide-semiconductor (CMOS) technology. While advancing into the nanometer regime, one significant characteristic of today’s CMOS design is the random nature of process variability, which limits nominal circuit design. With the continuous scaling of CMOS technology, instead of mitigating the physical variability, leveraging such properties becomes a promising approach. One of the best-known products adhering to this double-edged-sword philosophy is the Physically …


Stochastic Network Design: Models And Scalable Algorithms, Xiaojian Wu Nov 2016

Doctoral Dissertations

Many natural and social phenomena occur in networks. Examples include the spread of information, ideas, and opinions through a social network, the propagation of an infectious disease among people, and the spread of species within an interconnected habitat network. The ability to modify a phenomenon towards some desired outcomes has widely recognized benefits to our society and the economy. The outcome of a phenomenon is largely determined by the topology or properties of its underlying network. A decision maker can take management actions to modify a network and, therefore, change the outcome of the phenomenon. A management action is an …


Identifying Examinees Who Possess Distinct And Reliable Subscores When Added Value Is Lacking For The Total Sample, Joseph A. Rios Nov 2016

Doctoral Dissertations

Research has demonstrated that although subdomain information may provide no added value beyond the total score, in some contexts such information is of utility to particular demographic subgroups (Sinharay & Haberman, 2014). However, it is argued that the utility of reporting subscores for an individual should not be based on one’s manifest characteristics (e.g., gender or ethnicity), but rather on individual needs for diagnostic information, which are driven by multidimensionality in subdomain scores. To improve the validity of diagnostic information, this study proposed the use of the Mahalanobis distance and HT indices to assess whether an individual’s data significantly departs …
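
For reference, the Mahalanobis distance used as an index here is conventionally defined, for a subscore profile \(x\) with mean vector \(\mu\) and covariance matrix \(\Sigma\), as (standard definition; the study's specific operationalization may differ)

\[ D_M(x) = \sqrt{(x - \mu)^{\top} \Sigma^{-1} (x - \mu)}. \]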


Efficiency Of An Unbalanced Design In Collecting Time To Event Data With Interval Censoring, Peiyao Cheng Nov 2016

USF Tampa Graduate Theses and Dissertations

In longitudinal studies, the exact timing of an event often cannot be observed and is usually detected at a subsequent visit; this is called interval censoring. The spacing of visits is important when designing a study with interval-censored data. In a typical longitudinal study, the spacing of visits is the same across all subjects (a balanced design). In this dissertation, I propose an unbalanced design: subjects at baseline are divided into a high-risk group and a low-risk group based on a risk factor, and the subjects in the high-risk group are followed more frequently than those in …


A Comparison Of Techniques For Handling Missing Data In Longitudinal Studies, Alexander R. Bogdan Nov 2016

Masters Theses

Missing data are a common problem in virtually all epidemiological research, especially when conducting longitudinal studies. In these settings, clinicians may collect biological samples to analyze changes in biomarkers, which often do not conform to parametric distributions and may be censored due to limits of detection. Using complete data from the BioCycle Study (2005-2007), which followed 259 premenopausal women over two menstrual cycles, we compared four techniques for handling missing biomarker data with non-Normal distributions. We imposed increasing degrees of missing data on two non-Normally distributed biomarkers under conditions of missing completely at random, missing at random, and missing not …