Articles 1 - 20 of 20
Full-Text Articles in Entire DC Network
Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang
Statistical Science Theses and Dissertations
This dissertation investigates: (1) A Bayesian Semi-supervised Approach to Keyphrase Extraction with Only Positive and Unlabeled Data, (2) Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Events.
In the big data era, people are blessed with a huge amount of information. However, the availability of information may also pose great challenges. One big challenge is how to extract useful yet succinct information in an automated fashion. As one of the first few efforts, keyphrase extraction methods summarize an article by identifying a list of keyphrases. Many existing keyphrase extraction methods focus on the unsupervised setting, …
Examining Multiple Imputation For Measurement Error Correction In Count Data With Excess Zeros, Shalima Zalsha
Statistical Science Theses and Dissertations
Measurement error and missing data are two common problems in wildlife population surveys. These data are collected from the environment and may be missing or measured with error when the observer’s ability to see the animal is obscured. Methods such as video transects for estimating red snapper abundance and aerial surveys for estimating moose population sizes are highly affected by these problems since total abundance will be underestimated if missing/mismeasured counts are ignored. We shall refer to this problem as visibility bias; it occurs when the true counts are observed when visibility is high, partially observed when visibility is low …
Improved Statistical Methods For Time-Series And Lifetime Data, Xiaojie Zhu
Statistical Science Theses and Dissertations
In this dissertation, improved statistical methods for time-series and lifetime data are developed. First, an improved trend test for time series data is presented. Then, robust parametric estimation methods based on system lifetime data with known system signatures are developed.
In the first part of this dissertation, we consider a test for the monotonic trend in time series data proposed by Brillinger (1989). It has been shown that when there are highly correlated residuals or short record lengths, Brillinger’s test procedure tends to have a significance level much higher than the nominal level. This could be related to the discrepancy between …
Biennial And Low-Frequency Components Of El Niño/Southern Oscillation, James Michael Ryan
Theses and Dissertations
El Niño/Southern Oscillation (ENSO) is a coupled oscillation of sea surface temperatures (SSTs), winds, and air pressure in the eastern and central tropical Pacific that repeats quasi-regularly every 2–7 years. Although ENSO’s spectral peak is found at a 4–7-yr period, composite El Niño events, taken as the 84 months before and after the peak of each El Niño, show that the length of each event, and often the following La Niña if there is one, usually falls within a quasi-biennial (QB) range of around 18–42 months. We argue that the biennial range of ENSO events stems from the …
Bayesian Topological Machine Learning, Christopher A. Oballe
Doctoral Dissertations
Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) how to use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …
Southwest Pacific Tropical Cyclone Frequency And Intensity Related To Observed And Modeled Geophysical And Aerosol Variables, Rupsa Bhowmick
LSU Doctoral Dissertations
The dissertation focuses on the tropical cyclone (TC) climatology of the western region of the Southwest Pacific Ocean (SWPO) basin (135°E to 180°, 5°S to 35°S) using observed and modeled data. The classification-based machine learning approach identifies the synoptic geophysical and aerosol environment favorable or unfavorable for TC intensification and intensity change prior to landfall, incorporating observational and satellite data. A multiple Poisson regression model with varying monthly lags was used to relate the number of monthly TC days to basin-wide average dust aerosol optical depth (AOD), sea surface temperature (SST), and upper ocean temperature (UOT). This idea …
Causal Inference And Prediction On Observational Data With Survival Outcomes, Xiaofei Chen
Statistical Science Theses and Dissertations
Infants with hypoplastic left heart syndrome require an initial Norwood operation, followed some months later by a stage 2 palliation (S2P). The timing of S2P is critical for the operation’s success and the infant’s survival, but the optimal timing, if one exists, is unknown. We attempt to estimate the optimal timing of S2P by analyzing data from the Single Ventricle Reconstruction Trial (SVRT), which randomized patients between two different types of Norwood procedure. In the SVRT, the timing of the S2P was chosen by the medical team; thus with respect to this exposure, the trial constitutes an observational study, and …
Bayesian Reliability Analysis For Optical Media Using Accelerated Degradation Test Data, Kun Bu
USF Tampa Graduate Theses and Dissertations
ISO (the International Organization for Standardization) 10995:2011 is the international standard providing guidelines for assessing the reliability and service life of optical media, which are designed to be highly reliable and to have a long lifetime. A well-known challenge of reliability analysis for highly reliable devices is that it is hard to obtain sufficient failure data under their normal use conditions. Accelerated degradation tests (ADTs) are commonly used to quickly obtain physical degradation data under elevated stress conditions, which are then extrapolated to predict reliability under the normal use condition. This standard achieves the estimation of the lifetime of recordable media, …
Research In Short Term Actuarial Modeling, Elijah Howells
Electronic Theses, Projects, and Dissertations
This paper covers mathematical methods used to conduct actuarial analysis in the short term, such as policy deductible analysis, maximum covered loss analysis, and mixtures of distributions. Assessment of a loss variable's distribution under the effect of a policy deductible, as well as one with an implemented maximum covered loss, and under both a policy deductible and maximum covered loss will also be covered. The derivation, meaning, and use of cost per loss and cost per payment will be discussed, as will those of an aggregate sum distribution, stop loss policy, and maximum likelihood estimation. For each topic, special cases …
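The cost per loss and cost per payment quantities named in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, an exponentially distributed loss with mean `theta`, an ordinary deductible `d`, and a maximum covered loss `u`; the function names and parameter values are hypothetical and are not taken from the paper.

```python
import math


def limited_expected_value(theta: float, limit: float) -> float:
    """E[min(X, limit)] for an exponential loss X with mean theta (assumed model)."""
    return theta * (1.0 - math.exp(-limit / theta))


def expected_cost_per_loss(theta: float, deductible: float, max_covered_loss: float) -> float:
    """E[min(X, u) - min(X, d)]: average payment over all losses, including zero payments."""
    return (limited_expected_value(theta, max_covered_loss)
            - limited_expected_value(theta, deductible))


def expected_cost_per_payment(theta: float, deductible: float, max_covered_loss: float) -> float:
    """Average payment given that a payment is made, that is, given X > d."""
    survival_at_deductible = math.exp(-deductible / theta)
    return expected_cost_per_loss(theta, deductible, max_covered_loss) / survival_at_deductible


# Illustrative values only: mean loss 1000, deductible 500, maximum covered loss 5000.
per_loss = expected_cost_per_loss(1000.0, 500.0, 5000.0)
per_payment = expected_cost_per_payment(1000.0, 500.0, 5000.0)
print(f"cost per loss: {per_loss:.2f}, cost per payment: {per_payment:.2f}")
```

Cost per payment is never smaller than cost per loss here, because it conditions on the loss exceeding the deductible.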
A Study Of Cusum Statistics On Bitcoin Transactions, Ivan Perez
Theses and Dissertations
In this thesis, our objective is to study the relationship between transaction price and volume in the BTC/USD Coinbase exchange. In the second chapter, we develop a consecutive CUSUM algorithm to detect instantaneous changes in the arrival rate of market orders. We begin by estimating a baseline rate using the assumption of a local time-homogeneous Poisson process. Our observations lead us to reject the plausibility of a time-homogeneous Poisson model on a more global scale by using a chi-squared test. We thus proceed to use CUSUM-based alarms to detect consecutive upward and downward changes in the arrival rate of …
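For readers unfamiliar with CUSUM alarms on Poisson counts, the following is a minimal sketch of a generic one-sided (upward) log-likelihood-ratio CUSUM. It is a textbook form, not the thesis's consecutive CUSUM algorithm; the function name, the rates, the threshold, and the counts are all hypothetical.

```python
import math


def poisson_cusum_upward(counts, baseline_rate, shifted_rate, threshold):
    """Return the interval indices at which a one-sided CUSUM raises an alarm.

    counts: number of arrivals in consecutive equal-length intervals.
    baseline_rate, shifted_rate: assumed Poisson means per interval before
    and after an upward change (shifted_rate > baseline_rate).
    """
    log_ratio = math.log(shifted_rate / baseline_rate)
    drift = shifted_rate - baseline_rate
    statistic = 0.0
    alarms = []
    for index, count in enumerate(counts):
        # Add the Poisson log-likelihood ratio of this count, floored at zero.
        statistic = max(0.0, statistic + count * log_ratio - drift)
        if statistic > threshold:
            alarms.append(index)
            statistic = 0.0  # restart the statistic after each alarm
    return alarms


# Made-up counts: roughly 2 arrivals per interval, then roughly 6.
counts = [2, 1, 3, 2, 2, 1, 6, 7, 5, 8, 6, 7]
alarms = poisson_cusum_upward(counts, baseline_rate=2.0, shifted_rate=6.0, threshold=4.0)
print(alarms)
```

A downward detector follows the same pattern with `shifted_rate` below `baseline_rate`, in which case small counts push the statistic up.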
Evaluation Of The Utility Of Informative Priors In Bayesian Structural Equation Modeling With Small Samples, Hao Ma
Education Policy and Leadership Theses and Dissertations
The estimation of parameters in structural equation modeling (SEM) has been primarily based on the maximum likelihood estimator (MLE) and relies on large sample asymptotic theory. Consequently, the results of the SEM analyses with small samples may not be as satisfactory as expected. In contrast, informative priors typically do not require a large sample, and they may be helpful for improving the quality of estimates in the SEM models with small samples. However, the role of informative priors in the Bayesian SEM has not been thoroughly studied to date. Given the limited body of evidence, specifying effective informative priors remains …
Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, Lochana Palayangoda
Statistical Science Theses and Dissertations
For degradation data in reliability analysis, estimation of the first-passage time (FPT) distribution to a threshold provides valuable information on reliability characteristics. Recently, Balakrishnan and Qin (2019; Applied Stochastic Models in Business and Industry, 35:571-590) studied a nonparametric method to approximate the FPT distribution of such degradation processes if the underlying process type is unknown. In this thesis, we propose improved techniques based on saddlepoint approximation, which improve upon their suggested methods. Numerical examples and Monte Carlo simulation studies are used to illustrate the advantages of the proposed techniques. Limitations of the improved techniques are discussed and some possible solutions …
Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen
Statistical Science Theses and Dissertations
In this dissertation, we explore sensitivity analyses under three different types of incomplete data problems: missing outcomes, missing outcomes and missing predictors, and potential outcomes in the Rubin causal model (RCM). The first sensitivity analysis is conducted for the missing completely at random (MCAR) assumption in frequentist inference; the second is conducted for the missing at random (MAR) assumption in likelihood inference; the third is conducted for one novel assumption, the "sixth assumption" proposed for the robustness of the instrumental variable estimand in causal inference.
A Novel Approach To Updating Municipal Tax Parcel Impervious Surface Calculations, Patrick D. Muradaz
Senior Honors Projects, 2020-current
Accurate impervious surface calculations are important to many municipalities due to the high volumes of surface rainwater runoff caused by high impervious surface density. Municipalities must deal with this runoff through the establishment and maintenance of drainage facilities. To help offset the added cost of these facilities, many municipalities impose taxes and fees on privately owned impervious surfaces such as homes, driveways, and patios. Currently, in order for a city like Harrisonburg to calculate tax parcel impervious surface density, aerial images must be manually digitized or mapped using computer-based classification techniques based on predictive models. These methods of impervious surface calculations …
Boom Or Bust: Examining The Relationship Between High School Recruiting Rankings And The Nfl Draft, Nicholas E. Tice
Senior Theses
The goal of this thesis is to model the probability that a high school football player is drafted, based on information taken from their recruiting profile. The response variable is binary and defined as drafted (1) or undrafted (0). The independent variables were collected by scraping the recruiting websites and include height, weight, position, hometown, recruiting grade, and other socioeconomic factors based on the player’s high school. 247Sports and ESPN were the two recruiting services used and compared in this study. Because of the binary nature of the dependent variable, logistic regression and decision trees were chosen …
An Actuarial Approach To Personal Injury Protection Severity, Jason Colgrove
Undergraduate Honors Theses
Insurance companies examine the risk of financial losses for their policyholders as a way to accurately price insurance policies. Within the automobile insurance sector, the frequency of crashes and the associated liabilities started to increase in late 2013 after they had been on the decline for close to a decade. This research focuses on the possibly correlated variables that could lead to a better understanding of this change. To embark on this task, we teamed up with the Society of Actuaries, Casualty Actuarial Society, and the American Property Casualty Insurance Association to obtain data regarding frequency, severity, …
Gradient Boosting For Survival Analysis With Applications In Oncology, Nam Phuong Nguyen
USF Tampa Graduate Theses and Dissertations
Cancer is one of the deadliest diseases, one that the world has been fighting for decades. An enormous amount of research has been conducted using a wide range of approaches, ranging from genetic analysis to mathematical modeling. Survival analysis is a well-performing methodology frequently used to estimate the survival probability of a patient. Although there are many methods for survival analysis, efficient exploration of a high-dimensional feature space has been challenging due to its computational cost and complexity. This thesis adapts the component-wise gradient boosting algorithms for cancer survival analysis, and also proposes a new …
Bayesian Approach To Finding The Most Likely Circuit Structure, Shannon Harms
Graduate Research Theses & Dissertations
Systems, and their reliabilities, depend on the reliabilities of the components that they are composed of, and in this paper we want to find the system structure that is the most likely given observed data. Bayesian methods were utilized in order to discover the posterior means, or observed reliabilities, of both the components and the systems. Assuming the serial and parallel system structures have independent components, we calculated system reliabilities based on observed component reliabilities by using the multiplication and addition probability rules. We are then able to expand upon the numerical comparison method through a maximum likelihood analysis that …
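The series and parallel calculations described in this abstract can be sketched as follows. The sketch assumes independent components and, as an illustrative choice not stated in the abstract, a Beta(1, 1) prior on each component's reliability with pass/fail test data; all function names and data values are hypothetical.

```python
from math import prod


def posterior_mean_reliability(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a component's reliability under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


def series_reliability(component_reliabilities):
    """A series system works only if every independent component works."""
    return prod(component_reliabilities)


def parallel_reliability(component_reliabilities):
    """A parallel system fails only if every independent component fails."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)


# Made-up test data: (successes, trials) for three components.
observed = [(18, 20), (45, 50), (9, 10)]
components = [posterior_mean_reliability(s, n) for s, n in observed]
series = series_reliability(components)
parallel = parallel_reliability(components)
print(f"series: {series:.4f}, parallel: {parallel:.4f}")
```

Comparing such computed system reliabilities against an observed system reliability is one simple way to judge which structure is more plausible; the paper's own comparison method may differ.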
An Examination Of Covid-19 Statistical Modeling, Shane Vaughan
Williams Honors College, Honors Research Projects
The 2019 novel coronavirus, also known as COVID-19, is an infectious disease which was first reported in late 2019 and soon spread to become a global pandemic, prompting major action from world governments. Soon after, many institutions began attempts to analyze and predict the spread and severity of the disease via statistical modeling. Some information is not available for public consumption; however, a number of institutions have published the results of their analyses and some have made public repositories of the code used to build the models. This research paper attempts to use these and other resources to examine the modeling …
Deriving Statistical Inference From The Application Of Artificial Neural Networks To Clinical Metabolomics Data, Kevin M. Mendez
Theses: Doctorates and Masters
Metabolomics data are complex with a high degree of multicollinearity. As such, multivariate linear projection methods, such as partial least squares discriminant analysis (PLS-DA), have become standard. Non-linear projection methods, typified by Artificial Neural Networks (ANNs), may be more appropriate to model potential nonlinear latent covariance; however, they are not widely used due to difficulty in deriving statistical inference, and thus biological interpretation. Herein, we illustrate the utility of ANNs for clinical metabolomics using publicly available data sets and develop an open framework for deriving and visualising statistical inference from ANNs equivalent to standard PLS-DA methods.