Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Statistics and Probability

2016

Articles 31 - 60 of 217

Full-Text Articles in Physical Sciences and Mathematics

Hidden Markov Chain Analysis: Impact Of Misclassification On Effect Of Covariates In Disease Progression And Regression, Haritha Polisetti Nov 2016

USF Tampa Graduate Theses and Dissertations

Most chronic diseases have a well-known natural staging system through which disease progression is interpreted. It is well established that the transition rates from one stage of disease to another can be modeled by multi-state Markov models. However, it is also well known that the screening systems used to diagnose disease states may sometimes be subject to error. In this study, a simulation study is conducted to illustrate the importance of accounting for misclassification in multi-state Markov models by evaluating and comparing the estimates for the disease progression Markov model with misclassification opposed to disease …


Multiscale Wind Modelling For Sustainability And Resilience, Djordje Romanic Oct 2016

Electronic Thesis and Dissertation Repository

The research presented herein combines the disciplines of meteorology and wind engineering. In many cases, there is a gap between these two fields, and this thesis is an attempt to bridge that gap through multiscale wind modelling approaches. Data and methods used in this study cover a multitude of spatial and temporal scales. Applications are in the fields of sustainability and resilience. The relationship between multiscale wind modelling and sustainability and resilience is investigated by examining several case studies of three different developments: urban, rural and coastal.

An urban wind modelling methodology is proposed and applied for a specific development …


Development Of Anatomical And Functional Magnetic Resonance Imaging Measures Of Alzheimer Disease, Samaneh Kazemifar Oct 2016

Electronic Thesis and Dissertation Repository

Alzheimer disease is considered to be a progressive neurodegenerative condition, clinically characterized by cognitive dysfunction and memory impairments. Incorporating imaging biomarkers in the early diagnosis and monitoring of disease progression is increasingly important in the evaluation of novel treatments. The purpose of the work in this thesis was to develop and evaluate novel structural and functional biomarkers of disease to improve Alzheimer disease diagnosis and treatment monitoring. Our overarching hypothesis is that magnetic resonance imaging methods that sensitively measure brain structure and functional impairment have the potential to identify people with Alzheimer’s disease prior to the onset of cognitive decline. …


Measurement Invariance And Psychometric Properties Of Career Indecision Profile-65 Scores: College Student And Non-College Samples, Casey J. Zobell Oct 2016

Theses and Dissertations

This thesis reports the results of a study conducted to examine psychometric properties of Career Indecision Profile-65 scores, including measurement invariance between college student and non-college samples. The responses of 529 college students and 472 non-college students to an online survey revealed that a four-factor structure fit the data in both samples well. Metric invariance was not supported. Six-week test-retest reliability was found to be high, and in the expected range. The tendency to maximize was found to be correlated strongly with one of the four factors. This study furthered the psychometric research for the Career Indecision Profile-65 and found …


Statistics For Middle And High School Teachers: A Resource For Middle And High School Teachers To Feel Better Prepared To Teach The Common Core State Standards (CCSS) Relating To Statistics, Nanci Kopecky Oct 2016

All Capstone Projects

The purpose of this project is to create a two-day workshop to better prepare middle and high school teachers to teach probability and statistics as required by the Common Core State Standards (CCSS), which have broadened the mathematics curriculum to include in-depth understanding of probability and statistics. Many teachers are not prepared to address probability and statistics concepts. Research has demonstrated a need for greater professional development and resources for teachers in this area. The two-day workshop will allow teachers to review their knowledge and enhance their understanding of statistics by emphasizing student-centered teaching examples. Technology and/or software will …


Multiple Imputation Of Missing Data In Structural Equation Models With Mediators And Moderators Using Gradient Boosted Machine Learning, Robert J. Milletich II Oct 2016

Psychology Theses & Dissertations

Mediation and moderated mediation models are two commonly used models for indirect effects analysis. In practice, missing data is a pervasive problem in structural equation modeling with psychological data. Multiple imputation (MI) is one method used to estimate model parameters in the presence of missing data, while accounting for uncertainty due to the missing data. Unfortunately, commonly used MI methods are not equipped to handle categorical variables or nonlinear variables such as interactions. In this study, we introduce a general MI framework that uses the Bayesian bootstrap (BB) method to generate posterior inferences for indirect effects and gradient boosted machine …


Longitudinal Tidal Dispersion Coefficient Estimation And Total Suspended Solids Transport Characterization In The James River, Beatriz Eugenia Patino Oct 2016

Civil & Environmental Engineering Theses & Dissertations

The longitudinal dispersion coefficient is a parameter used to evaluate the effect of cross-sectional variations on substance mixing mechanisms in estuaries influenced by tide, wind and internal density variations. Taking a two-dimensional approach, this study evaluates a tidal area of the lower James River approximately 19 miles upstream from the mouth at the Chesapeake Bay, in the City of Newport News. It applies an experimental procedure based on in-situ salinity concentrations to estimate the dispersion coefficient in the area that receives a discharge from the HRSD James River Wastewater Treatment Plant, and further characterizes Total Suspended …


A Statistical Approach To Characterize And Detect Degradation Within The Barabasi-Albert Network, Mohd-Fairul Mohd-Zaid Sep 2016

Theses and Dissertations

Social Network Analysis (SNA) is widely used by the intelligence community when analyzing the relationships between individuals within groups of interest. Hence, any tools that can be quantitatively shown to help improve the analyses are advantageous for the intelligence community. To date, no methods have been developed to characterize a real-world network as a Barabasi-Albert network, a type of network whose properties appear in many real-world networks. In this research, two newly developed statistical tests using the degree distribution and the L-moments of the degree distribution are proposed with application to classifying networks and detecting degradation …


Advances In Portmanteau Diagnostic Tests, Jinkun Xiao Sep 2016

Electronic Thesis and Dissertation Repository

The portmanteau test plays an important role in model diagnostics for Box-Jenkins modelling procedures. A large number of portmanteau tests based on the autocorrelation function have been proposed as general-purpose goodness-of-fit tests. Since the asymptotic distributions of the statistics have a complicated form, which makes it hard to obtain the p-value directly, the gamma approximation is introduced to obtain the p-value. But the approximation will inevitably introduce approximation errors, and it needs a large number of observations to yield a good approximation. To avoid some pitfalls in the approximation, the Lin-McLeod test is further proposed to obtain a numeric solution to …


Probabilistic Methods In Information Theory, Erik W. Pachas Sep 2016

Electronic Theses, Projects, and Dissertations

Given a probability space, we analyze the uncertainty, that is, the amount of information of a finite system, by studying the entropy of the system. We also extend the concept of entropy to a dynamical system by introducing a measure preserving transformation on a probability space. After showing some theorems and applications of entropy theory, we study the concept of ergodicity, which helps us to further analyze the information of the system.
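
As a rough illustration of the entropy of a finite system mentioned in this abstract, the following Python sketch computes the Shannon entropy H(p) = -Σ p_i log p_i of a finite probability distribution. The function name `shannon_entropy` and the coin examples are assumptions made for illustration and are not taken from the thesis.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i) of a finite distribution."""
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probs must be non-negative and sum to 1")
    # Outcomes with p = 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical examples: a fair coin is the most uncertain two-outcome system,
# a biased coin is less uncertain, and a certain outcome carries no uncertainty.
fair_coin = shannon_entropy([0.5, 0.5])
biased_coin = shannon_entropy([0.9, 0.1])
certain = shannon_entropy([1.0, 0.0])
print(fair_coin, biased_coin, certain)
```

With base 2 the entropy is measured in bits, so the fair coin comes out at one bit.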


Actuarial Modelling With Mixtures Of Markov Chains, Yuzhou Zhang Aug 2016

Electronic Thesis and Dissertation Repository

Multi-state models are widely used in actuarial science because they provide a convenient way of representing changes in people's statuses. Calculations are easy if one assumes that the model is a Markov chain. However, the memoryless property of a Markov chain is rarely appropriate.

This thesis considers several mixtures of Markov chains to capture the heterogeneity of people's mortality rates, morbidity rates, recovery rates, and ageing speeds. This heterogeneity may be the result of unobservable factors that affect individuals' health. The focus of this thesis is on investigating the behaviours of intensities of the observable transitions in the mixture …


Survival Analysis In A Clinical Setting, Yunzhao Liu Aug 2016

Arts & Sciences Electronic Theses and Dissertations

With the fast-paced advancement of modern medicine, cancer treatments have improved greatly over the past few decades; however, the overall survival rate has not improved for head and neck squamous cell carcinoma (HNSCC). Traditionally, the population generally affected by HNSCC was males over 50-60 years of age who had a history of alcohol and tobacco use. In recent decades, however, HNSCC has exhibited a significant rise in younger patients, largely due to the increase in human papillomavirus (HPV) infection among young adults.

Generally, HPV, the most prevalent sexually transmitted disease, consists of strains that do not cause harm to …


Tornado Density And Return Periods In The Southeastern United States: Communicating Risk And Vulnerability At The Regional And State Levels, Michelle Bradburn Aug 2016

Electronic Theses and Dissertations

Tornado intensity and impacts vary drastically across space; thus, spatial and statistical analyses were used to identify patterns of tornado severity in the Southeastern United States and to assess the vulnerability and estimated recurrence of tornadic activity. Records from the Storm Prediction Center's tornado database (1950-2014) were used to estimate kernel density to identify areas of high and low tornado frequency at both the regional and state scales. Return periods (2-year, 5-year, 10-year, 25-year, 50-year, and 100-year) were also calculated at both scales using a composite score that included EF-scale magnitude, injury counts, and fatality counts. Results showed that the …


Newsvendor Models With Monte Carlo Sampling, Ijeoma W. Ekwegh Aug 2016

Electronic Theses and Dissertations

The newsvendor model is used in solving inventory problems in which demand is random. In this thesis, we will focus on a method of using Monte Carlo sampling to estimate the order quantity that will either maximize revenue or minimize cost given that demand is uncertain. Given data, the Monte Carlo approach will be used in sampling data over scenarios and also estimating the probability density function. A bootstrapping process yields an empirical distribution for the order quantity that will maximize the expected profit. Finally, this method will be used …
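
The general idea in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: demand is drawn from a hypothetical normal distribution, unsold units have no salvage value, and the names `expected_profit`, `best_order_quantity` and `bootstrap_order_quantities`, along with the price and cost figures, are invented for the example.

```python
import random

def expected_profit(q, demands, price, cost):
    """Average profit over the demand scenarios when ordering q units."""
    return sum(price * min(q, d) - cost * q for d in demands) / len(demands)

def best_order_quantity(demands, price, cost):
    """Order quantity maximizing the average profit over the scenarios.

    Average profit is piecewise linear and concave in q, so for
    price > cost > 0 a maximizer lies at one of the observed demands.
    """
    candidates = sorted(set(demands))
    return max(candidates, key=lambda q: expected_profit(q, demands, price, cost))

def bootstrap_order_quantities(demands, price, cost, n_boot, rng):
    """Empirical distribution of the best order quantity under resampling."""
    results = []
    for _ in range(n_boot):
        resample = [rng.choice(demands) for _ in demands]
        results.append(best_order_quantity(resample, price, cost))
    return results

rng = random.Random(0)
# Hypothetical demand scenarios: normal with mean 100 and sd 20, rounded, floored at 0.
demands = [max(0, round(rng.gauss(100, 20))) for _ in range(100)]
price, cost = 10.0, 4.0  # assumed unit selling price and unit purchase cost
q_star = best_order_quantity(demands, price, cost)
q_boot = bootstrap_order_quantities(demands, price, cost, n_boot=50, rng=rng)
print(q_star, min(q_boot), max(q_boot))
```

Under these assumptions the chosen quantity sits at roughly the (price - cost) / price quantile of the sampled demands, and the spread of `q_boot` gives a sense of how sensitive it is to the sample.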


Spatio-Temporal Analysis Of Point Patterns, Abdul-Nasah Soale Aug 2016

Electronic Theses and Dissertations

In this thesis, the basic tools of spatial statistics and time series analysis are applied to a case study of the earthquakes in a certain geographical region and time frame. Then some of the existing methods for joint analysis of time and space are described and applied. Finally, additional research questions about the spatial-temporal distribution of the earthquakes are posed and explored using statistical plots and models. The focus in the last section is on the relationship between number of events per year and maximum magnitude and its effect on how clustered the spatial distribution is and the relationship between …


Controlling For Confounding Network Properties In Hypothesis Testing And Anomaly Detection, Timothy La Fond Aug 2016

Open Access Dissertations

An important task in network analysis is the detection of anomalous events in a network time series. These events could merely be times of interest in the network timeline or they could be examples of malicious activity or network malfunction. Hypothesis testing using network statistics to summarize the behavior of the network provides a robust framework for the anomaly detection decision process. Unfortunately, choosing network statistics that are dependent on confounding factors like the total number of nodes or edges can lead to incorrect conclusions (e.g., false positives and false negatives). In this dissertation we describe the challenges that face …


Learning From Data: Plant Breeding Applications Of Machine Learning, Alencar Xavier Aug 2016

Open Access Dissertations

Increasingly, new sources of data are being incorporated into plant breeding pipelines. Enormous amounts of data from field phenomics and genotyping technologies place data mining and analysis at a completely different level that is challenging from practical and theoretical standpoints. Intelligent decision-making relies on our capability of extracting from data useful information that may help us to achieve our goals more efficiently. Many plant breeders, agronomists and geneticists perform analyses without knowing relevant underlying assumptions, strengths or pitfalls of the employed methods. The study endeavors to assess statistical learning properties and plant breeding applications of supervised and unsupervised machine learning …


Extreme-Strike And Small-Time Asymptotics For Gaussian Stochastic Volatility Models, Xin Zhang Aug 2016

Open Access Dissertations

This dissertation studies the asymptotic behavior of implied volatility. For extreme strike, we consider a stochastic volatility asset price model in which the volatility is the absolute value of a continuous Gaussian process with arbitrary prescribed mean and covariance. By exhibiting a Karhunen-Loève expansion for the integrated variance, and using sharp estimates of the density of a general second-chaos variable, we derive asymptotics for the asset price density for large or small values of the variable, and study the wing behavior of the implied volatility in these models. Our main result provides explicit expressions for the first …


The Design And Statistical Analysis Of Single-Cell RNA-Sequencing Experiments, Faye H. Zheng Aug 2016

Open Access Dissertations

Next-generation DNA- and RNA-sequencing (RNA-seq) technologies have expanded rapidly in both throughput and accuracy within the last decade. The momentum continues as emerging techniques become increasingly capable of profiling molecular content at the level of individual cells. One goal of this research is to put forward best practices in the design of single-cell RNA-sequencing (scRNA-seq) experiments, specifically as it relates to choices regarding the trade-off between sequencing depth and sample size. In addition to general guidelines, an interactive tool is presented to aid researchers in making experiment-specific decisions that are informed by real data and practical constraints. Further, a new …


Model-Free Variable Screening, Sparse Regression Analysis And Other Applications With Optimal Transformations, Qiming Huang Aug 2016

Open Access Dissertations

Variable screening and variable selection methods play important roles in modeling high dimensional data. Variable screening is the process of filtering out irrelevant variables, with the aim of reducing the dimensionality from ultrahigh to high while retaining all important variables. Variable selection is the process of selecting a subset of relevant variables for use in model construction. The main theme of this thesis is to develop variable screening and variable selection methods for high dimensional data analysis. In particular, we will present two relevant methods for variable screening and selection under a unified framework based on optimal transformations.

In the …


Maximum Empirical Likelihood Estimation In U-Statistics Based General Estimating Equations, Lingnan Li Aug 2016

Open Access Dissertations

In the first part of this thesis, we study maximum empirical likelihood estimates (MELE's) in U-statistics based general estimating equations (UGEE's). Our technical tool is the jackknife empirical likelihood (JEL) approach. We give the local uniform asymptotic normality condition for the log-JEL for UGEE's. We derive the estimating equations for finding MELE's and establish their asymptotic normality. We obtain easy MELE's which have less computational burden than the usual MELE's and can be easily implemented using existing software. We investigate the use of side information of the data to improve efficiency. We show that the MELE's are fully efficient, and …


Some Nonparametric Ordered Restricted Inference Problems In The Context Of A Statistical Education Study, Bradford M. Dykes Aug 2016

Dissertations

Over the past 10 years, the Department of Statistics at Western Michigan University has developed a question generating system that can be used for creating multiple forms of exams, quizzes and homework for online and face-to-face use. This system can also be used to provide students with a form of instantaneous feedback. With the goal of analyzing how different levels of feedback in an online learning environment impact students' performance on assignments, this study presents data collected on two semesters of students enrolled in three different meeting types (strictly online, typical face-to-face, and honors face-to-face) of an introductory Statistics course. …


Multilevel Models For Longitudinal Data, Aastha Khatiwada Aug 2016

Electronic Theses and Dissertations

Longitudinal data arise when individuals are measured several times during an observation period and thus the data for each individual are not independent. There are several ways of analyzing longitudinal data when different treatments are compared. Multilevel models are used to analyze data that are clustered in some way. In this work, multilevel models are used to analyze longitudinal data from a case study. Results from other more commonly used methods are compared to multilevel models. Also, the output of two software packages, SAS and R, is compared. Finally a method consisting of fitting individual models for each …


Propensity Score Based Methods For Estimating The Treatment Effects Based On Observational Studies., Younathan Abdia Aug 2016

Electronic Theses and Dissertations

This dissertation consists of two interconnected research projects. The first project was a study of propensity score based statistical methods for estimating the average treatment effect (ATE) and the average treatment effect among the treated (ATT) when there are two treatment groups. The ATE is defined as the mean of the individual causal effects in the whole population, while the ATT is defined as the treatment effect for the treated population. Propensity score based statistical methods, such as matching, regression, stratification, inverse probability weighting (IPW), and doubly robust (DR) methods were used to estimate the ATE and ATT. Simulation studies and case …
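
To make the ATE and ATT definitions concrete, here is a minimal Python sketch of normalized inverse probability weighting on simulated data. It is an illustration under assumptions and not the dissertation's method: the propensity score is taken as known (in practice it would be estimated, for example by logistic regression), the data-generating model and the constant treatment effect of 2 are invented, and the function names are hypothetical.

```python
import math
import random

def ipw_ate(y, t, e):
    """Normalized IPW estimate of the ATE.

    Treated units are weighted by 1/e and controls by 1/(1 - e).
    """
    w1 = [ti / ei for ti, ei in zip(t, e)]
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]
    mean1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mean0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mean1 - mean0

def ipw_att(y, t, e):
    """Normalized IPW estimate of the ATT.

    Treated units get weight 1 and controls get weight e/(1 - e).
    """
    w0 = [(1 - ti) * ei / (1 - ei) for ti, ei in zip(t, e)]
    mean1 = sum(ti * yi for ti, yi in zip(t, y)) / sum(t)
    mean0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mean1 - mean0

# Simulated observational data (all assumptions): one confounder x raises both
# the chance of treatment and the outcome; the true effect is 2 for everyone,
# so the true ATE and ATT both equal 2.
rng = random.Random(42)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
e = [1.0 / (1.0 + math.exp(-xi)) for xi in x]  # true propensity score
t = [1 if rng.random() < ei else 0 for ei in e]
y = [2.0 * ti + xi + rng.gauss(0.0, 1.0) for ti, xi in zip(t, x)]

n_treated = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti) / n_treated
         - sum(yi for yi, ti in zip(y, t) if not ti) / (n - n_treated))
ate_hat = ipw_ate(y, t, e)
att_hat = ipw_att(y, t, e)
print(naive, ate_hat, att_hat)
```

In this setup the naive difference in group means overstates the effect because treated units tend to have larger x, while the weighted estimates should land near 2.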


The Influence Of The Electric Supply Industry On Economic Growth In Less Developed Countries, Edward Richard Bee Aug 2016

Dissertations

This study measures the impact that electrical outages have on manufacturing production in 135 less developed countries using stochastic frontier analysis and data from World Bank’s Investment Climate surveys. Outages of electricity, for firms with and without backup power sources, are the most frequently cited constraint on manufacturing growth in these surveys.

Outages are shown to reduce output below the production frontier by almost five percent in Africa and by a lower percentage in South Asia, Southeast Asia and the Middle East and North Africa. Production response to outages is quadratic in form. Outages also increase labor cost, reduce exports …


Utilizing Computed Tomography Image Features To Advance Prediction Of Radiation Pneumonitis, Shane P. Krafft Aug 2016

Dissertations & Theses (Open Access)

Improving outcomes for non-small-cell lung cancer patients treated with radiation therapy (RT) requires optimizing the balance between local tumor control and risk of normal tissue toxicity. In approximately 20% of patients, severe acute symptomatic lung toxicity, termed radiation pneumonitis (RP), still occurs. Identifying the individuals at risk of RP prior to or early during treatment offers tremendous potential to improve RT by providing the physician with information to assist in making clinical decisions that enhance therapy. Our central goal for this work was to demonstrate the potential gain in predictive accuracy of normal tissue complication probability models for RP by …


Variable Selection Via Penalized Regression And The Genetic Algorithm Using Information Complexity, With Applications For High-Dimensional -Omics Data, Tyler J. Massaro Aug 2016

Doctoral Dissertations

This dissertation is a collection of examples, algorithms, and techniques for researchers interested in selecting influential variables from statistical regression models. Chapters 1, 2, and 3 provide background information that will be used throughout the remaining chapters, on topics including but not limited to information complexity, model selection, covariance estimation, stepwise variable selection, penalized regression, and especially the genetic algorithm (GA) approach to variable subsetting.

In chapter 4, we fully develop the framework for performing GA subset selection in logistic regression models. We present advantages of this approach against stepwise and elastic net regularized regression in selecting variables from a …


Advanced Sequential Monte Carlo Methods And Their Applications To Sparse Sensor Network For Detection And Estimation, Kai Kang Aug 2016

Doctoral Dissertations

General state space models present a flexible framework for modeling dynamic systems and therefore have vast applications in many disciplines such as engineering, economics, and biology. However, optimal estimation problems for non-linear non-Gaussian state space models are analytically intractable in general. Sequential Monte Carlo (SMC) methods have become a very popular class of simulation-based methods for the solution of optimal estimation problems. The advantage of SMC methods over classical filtering methods such as the Kalman filter and the extended Kalman filter is that they are able to handle non-linear non-Gaussian scenarios without relying on any local linearization techniques. In this …


Numerical Solutions Of Stochastic Differential Equations, Liguo Wang Aug 2016

Doctoral Dissertations

In this dissertation, we consider the problem of simulating stochastic differential equations driven by Brownian motions or general Lévy processes. There are two types of convergence for a numerical solution of a stochastic differential equation: strong convergence and weak convergence. We first introduce the strong convergence of the tamed Euler-Maruyama scheme under non-globally Lipschitz conditions, which allow polynomial growth of the drift and diffusion coefficients. Then we prove a new weak convergence theorem given that the drift and diffusion coefficients of the stochastic differential equation are only twice continuously differentiable with bounded derivatives up to …
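
For readers unfamiliar with taming, the following Python sketch shows a commonly used drift-tamed Euler-Maruyama step, X_{n+1} = X_n + h f(X_n) / (1 + h |f(X_n)|) + g(X_n) ΔW_n, for a scalar equation driven by Brownian motion. It is an illustrative variant and may differ from the scheme analyzed in the dissertation (which may, for instance, also tame the diffusion term). The test equation dX = -X^3 dt + dW and the function name are assumptions.

```python
import math
import random

def tamed_euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW.

    The drift increment h * f / (1 + h * |f|) is bounded by 1 in absolute
    value, which prevents a superlinearly growing drift from blowing up.
    """
    h = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        f = drift(x)
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        x = x + h * f / (1.0 + h * abs(f)) + diffusion(x) * dw
        path.append(x)
    return path

# Hypothetical test equation with cubic (non-globally Lipschitz) drift and
# additive noise, started far from equilibrium with a coarse step size.
rng = random.Random(1)
path = tamed_euler_maruyama(
    drift=lambda x: -x ** 3,
    diffusion=lambda x: 1.0,
    x0=10.0,
    t_end=5.0,
    n_steps=50,
    rng=rng,
)
print(path[0], path[-1])
```

With the same step size, the untamed update x + h f(x) would move from 10 to about -90 in a single step and then grow rapidly in magnitude, which is the kind of divergence taming is meant to avoid.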


Regional Dynamic Price Relationships Of Distillers Dried Grains In U.S. Feed Markets, Matthew Fulton Johnson Aug 2016

Masters Theses

Distillers dried grains with solubles (DDGS) is now a mainstream substitute in U.S. animal feed rations. DDGS is rich in fat and protein content and serves as a competitive feed source in livestock markets. The objective of this study is to identify dynamic price relationships among DDGS, corn, soybean meal, and livestock outputs in the context of specific livestock sectors and their geographic location. Four locations associated with a predominant livestock sector are selected for analysis by measuring density and relative proportion of a livestock sector’s grain consumption at the county level. A vector error correction model is applied to post-mandate …