Statistical Computational Topology And Geometry For Understanding Data, 2017 University of Tennessee, Knoxville
Statistical Computational Topology And Geometry For Understanding Data, Joshua Lee Mike
Here we describe three projects involving data analysis which focus on engaging statistics with the geometry and/or topology of the data.
The first project involves the development and implementation of kernel density estimation for persistence diagrams. These kernel densities consider neighborhoods for every feature in the center diagram and give each feature an independent, orthogonal direction. The creation of kernel densities in this realm yields a (previously unavailable) full characterization of the (random) geometry of a dataspace or data distribution.
In the second project, cohomology is used to guide a search for kidney exchange cycles within a kidney ...
Tree-Based Regression For Interval-Valued Data, 2017 Utah State University
Tree-Based Regression For Interval-Valued Data, Chih-Ching Yeh
All Graduate Plan B and other Reports
Regression methods for interval-valued data have been increasingly studied in recent years. As most of the existing works focus on linear models, it is important to note that many problems in practice are nonlinear in nature, and therefore the development of nonlinear regression tools for interval-valued data is crucial. In this project, we propose a tree-based regression method for interval-valued data, which is applicable to both linear and nonlinear problems. Unlike linear regression models that usually require additional constraints to ensure positivity of the predicted interval length, the proposed method estimates the regression function in a nonparametric way, so the ...
A Tail-Based Test For Differential Expression Analysis And Pathway Analysis In RNA-Sequencing Data, 2017 The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences
A Tail-Based Test For Differential Expression Analysis And Pathway Analysis In RNA-Sequencing Data, Jiong Chen
UT GSBS Dissertations and Theses (Open Access)
RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and pathway analysis. Such data at the exon level are usually heavy-tailed and correlated. Conventional statistical tests for differential expression based on the mean or median difference are likely to suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for ...
Novel Bayesian Adaptive Clinical Trial Designs In Early Phases, 2017 The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences
Novel Bayesian Adaptive Clinical Trial Designs In Early Phases, Haitao Pan
UT GSBS Dissertations and Theses (Open Access)
Early phase, or phase I and phase II, trials are the first step in testing new medicines that have been developed in the lab. The main goal of phase I clinical trials is to establish the recommended dose of new drugs for phase II trials. For cytotoxic drugs, the goal is to find the maximum tolerated dose (MTD). The guiding principle for dose escalation in phase I trials is to avoid exposing too many patients to subtherapeutic doses while preserving safety and maintaining rapid accrual. Therefore, dose escalation methods, especially Bayesian designs, are recommended to be used in phase I ...
Methods For Scalar-On-Function Regression, 2017 Columbia University
Methods For Scalar-On-Function Regression, Philip T. Reiss, Jeff Goldsmith, Han Lin Shang, R. Todd Ogden
Philip T. Reiss
Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, 2017 University of Nebraska-Lincoln
Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, Jyotsna Ramanan, Peter Revesz
CSE Journal Articles
Evolutionary studies usually assume that the genetic mutations are independent of each other. However, that does not imply that the observed mutations are independent of each other because it is possible that when a nucleotide is mutated, then it may be biologically beneficial if an adjacent nucleotide mutates too. With a number of decoded genes currently available in various genome libraries and online databases, it is now possible to have a large-scale computer-based study to test whether the independence assumption holds for pairs of adjacent amino acids. Hence the independence question also arises for pairs of adjacent amino acids within ...
Marketing The Mountain State: A Large N Study Of User Engagement On Twitter, 2017 Illinois State University
Marketing The Mountain State: A Large N Study Of User Engagement On Twitter, Kirk Richardson
Capstone Projects – Politics and Government
Much of the evolving research on the use of social media in destination marketing emphasizes how information diffusion influences the reputational image of place. The present study uses Twitter data to focus on the relative differences in user engagement across discrete account types. Specifically, this is done to examine how the official destination marketing organization of Montana—the Montana Office of Tourism (MTOT)—performs relative to other account types. Several regression analyses conducted on Twitter data associated with an ongoing MTOT place branding campaign reveal that tweets sent from ‘official’ accounts are more likely to be retweeted, and are estimated ...
A Comparison Of Some Confidence Intervals For Estimating The Kurtosis Parameter, 2017 Florida International University
A Comparison Of Some Confidence Intervals For Estimating The Kurtosis Parameter, Guensley Jerome
FIU Electronic Theses and Dissertations
Several methods have been proposed to estimate the kurtosis of a distribution. The three common estimators are g2, G2 and b2. This thesis addresses the performance of these estimators by comparing them under the same simulation environments and conditions. The performance of these estimators is compared through confidence intervals by determining the average width and the probability of capturing the kurtosis parameter of a distribution. We considered and compared classical and non-parametric methods for constructing these intervals. The classical method assumes normality to construct the confidence intervals, while the non-parametric methods rely on bootstrap techniques. The bootstrap techniques used ...
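To make the three estimators concrete, here is a minimal Python sketch. The abstract names g2, G2 and b2 but does not give formulas, so the definitions below are an assumption: they follow a common convention in which g2 is the moment-based excess kurtosis, G2 is its small-sample adjusted version, and b2 replaces the biased variance with the unbiased sample variance. The function name is hypothetical.

```python
def excess_kurtosis_estimators(xs):
    """Return (g2, G2, b2) for a sample, under the assumed definitions:

    g2 = m4 / m2**2 - 3                      (moment-based)
    G2 = (n-1)/((n-2)(n-3)) * ((n+1)*g2 + 6) (small-sample adjusted)
    b2 = m4 / s**4 - 3                       (s**2 = unbiased variance)
    """
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations for G2")
    mean = sum(xs) / n
    # Biased central moments of order 2 and 4.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0:
        raise ValueError("sample has zero variance")
    g2 = m4 / m2 ** 2 - 3
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    s2 = m2 * n / (n - 1)  # unbiased sample variance
    b2 = m4 / s2 ** 2 - 3
    return g2, G2, b2


if __name__ == "__main__":
    print(excess_kurtosis_estimators([1, 2, 3, 4, 5]))
```

All three are computed from the same central moments and differ only in their sample-size corrections, so under these definitions they converge as n grows.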
Propensity Score Analysis With Matching Weights, 2017 Cleveland Clinic
Propensity Score Analysis With Matching Weights, Liang Li
The propensity score analysis is one of the most widely used methods for studying the causal treatment effect in observational studies. This paper studies treatment effect estimation with the method of matching weights. This method resembles propensity score matching but offers a number of new features including efficient estimation, rigorous variance calculation, simple asymptotics, statistical tests of balance, clearly identified target population with optimal sampling property, and no need for choosing matching algorithm and caliper size. In addition, we propose the mirror histogram as a useful tool for graphically displaying balance. The method also shares some features of the inverse ...
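As an illustration of the weighting idea, the sketch below assumes the usual matching-weight definition: min(e, 1 - e) divided by the probability of the treatment actually received, where e is the estimated propensity score. The abstract does not state the formula, so this definition, the function names, and the toy inputs should be read as assumptions rather than as the paper's implementation.

```python
def matching_weights(treated, propensity):
    """Assumed matching weight: min(e, 1-e) / (z*e + (1-z)*(1-e))."""
    weights = []
    for z, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        received = e if z == 1 else 1.0 - e  # prob. of the observed arm
        weights.append(min(e, 1.0 - e) / received)
    return weights


def weighted_mean_difference(outcome, treated, weights):
    """Weighted mean outcome of the treated minus that of the controls."""
    def wmean(arm):
        num = sum(w * y for y, z, w in zip(outcome, treated, weights) if z == arm)
        den = sum(w for z, w in zip(treated, weights) if z == arm)
        return num / den
    return wmean(1) - wmean(0)


if __name__ == "__main__":
    z = [1, 0, 1, 0]             # hypothetical treatment indicators
    e = [0.8, 0.8, 0.3, 0.3]     # hypothetical estimated propensity scores
    y = [3.0, 1.0, 4.0, 2.0]     # hypothetical outcomes
    w = matching_weights(z, e)
    print(w, weighted_mean_difference(y, z, w))
```

Under this definition every weight lies in (0, 1], so no single subject can dominate the estimate the way an extreme inverse probability weight can.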
Incorporating Place And Space: A Hierarchical Spatial Approach To Exploring Preventable Congestive Heart Failure Hospitalizations In New York City, Rachael Weiss Riley
Dissertations and Theses
Background: Faced with rising medical care costs, increasing prevalence, and widening health disparities, preventing congestive heart failure (CHF) hospitalizations is a central public health concern. Despite evidence of geographical clustering in preventable CHF admissions, there is a lack of research designed to examine spatial patterning of CHF and the local area neighborhood determinants that contribute to this variability. This study sought to assess and evaluate the importance of both space and place in analyzing preventable CHF hospitalizations and readmissions by applying appropriate statistical techniques, clarifying the assumption inherent in each method, and interpreting the findings within the context of existing ...
Mechanistic Mathematical Models: An Underused Platform For HPV Research, 2017 George Washington University
Mechanistic Mathematical Models: An Underused Platform For HPV Research, Marc Ryser, Patti Gravitt, Evan R. Myers
Global Health Faculty Publications
Health economic modeling has become an invaluable methodology for the design and evaluation of clinical and public health interventions against the human papillomavirus (HPV) and associated diseases. At the same time, relatively little attention has been paid to a different yet complementary class of models, namely that of mechanistic mathematical models. The primary focus of mechanistic mathematical models is to better understand the intricate biologic mechanisms and dynamics of disease. Inspired by a long and successful history of mechanistic modeling in other biomedical fields, we highlight several areas of HPV research where mechanistic models have the potential to advance the ...
Semiparametric Fractional Imputation Using Empirical Likelihood In Survey Sampling, 2017 University of Oklahoma
Semiparametric Fractional Imputation Using Empirical Likelihood In Survey Sampling, Sixia Chen, Jae Kwang Kim
The empirical likelihood method is a powerful tool for incorporating moment conditions in statistical inference. We propose a novel application of the empirical likelihood for handling item nonresponse in survey sampling. The proposed method takes the form of fractional imputation (Kim, 2011) but it does not require parametric model assumptions. Instead, only the first moment condition based on a regression model is assumed and the empirical likelihood method is applied to the observed residuals to get the fractional weights. The resulting semiparametric fractional imputation provides √n-consistent estimates for various parameters. Variance estimation is implemented using a jackknife method. Two limited ...
Swine Respiratory Disease Minimally Affects Responses Of Nursery Pigs To Gas Euthanasia, 2017 Iowa State University
Swine Respiratory Disease Minimally Affects Responses Of Nursery Pigs To Gas Euthanasia, Larry J. Sadler, Locke A. Karriker, Anna K. Butters-Johnson, Kent J. Schwartz, Tina M. Widowski, Chong Wang, Suzanne T. Millman
Anna K. Butters-Johnson
Objectives: To assess effects of swine respiratory disease (SRD) on nursery pig responses during gas euthanasia and to compare responses to carbon dioxide (CO2) and argon (Ar) gas euthanasia in terms of efficacy and welfare. Materials and methods: Fifty-four pigs identified for euthanasia were classified as having SRD or euthanized for other reasons (OT). These pigs were distributed among three treatments: prefill CO2 (P-CO2), gradual fill CO2 (G-CO2), and prefill Ar (P-Ar). Behavioral and physiological indicators of efficacy and welfare were assessed directly and from video. Modified atmosphere CO2 and O2 concentrations (%) were collected throughout the process. Results: Respiratory disease status ...
Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons
Student Scholar Symposium Abstracts and Posters
After going on the Warner Brothers Tour in December of 2015, I created a Gilmore Girls Instagram account. This account, which started off as a way for me to create edits of the show and post my photos from the tour, turned into something bigger than I ever could have imagined. In just over a year I have gained over 55,000 followers. I post content including revival news, merchandise, and edits of the show that have been featured in Entertainment Weekly, Bustle, E! News, People Magazine, Yahoo News, & GilmoreNews.
I created a dataset of qualitative and quantitative outcomes from my ...
Washington State Public Teachers' Ambient Positional Instability From A Statistical Approach Of Retrospective Study & Prospective Study, Bowen Cai, Robert Boruch
The purpose of this research is to study movements in the teacher churn rate in the state of Washington over the past 14 years. The research is an integrative study, with a retrospective part and a prospective part. The retrospective study includes descriptive statistics (level I), statistical inference (level II) and causal inference (level III) (Berk, R.A. (2016) Statistical Learning from a Regression Perspective. Philadelphia, PA: Springer). The prospective study is mainly about forecasting and the statistical inference generated from the predictions. In this research, we use longitudinal data analysis. The good point of longitudinal ...
Telephone Polls And PPS Sampling: A Potential Boon To The Polling Industry, 2017 Utah State University
Telephone Polls And PPS Sampling: A Potential Boon To The Polling Industry, Jade Mckay Burt
Undergraduate Honors Capstone Projects
In the wake of the 2016 election, the polling industry has no shortage of critics. While these are difficult times for the industry as a whole, there are exciting innovations happening that will serve to benefit and revitalize the industry for years. One of these innovations is Probability Proportional to Size (PPS) sampling. I will elaborate on what PPS sampling is and provide a mathematical foundation for its use in polling. I also discuss some of the myriad issues plaguing the polling industry and then show how PPS sampling can be used to remedy many of ...
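For readers unfamiliar with PPS sampling, the following is a minimal sketch of one textbook variant: units are drawn with replacement, each with probability proportional to its size measure, and a population total is estimated by averaging each sampled value divided by its selection probability (the Hansen-Hurwitz form). The abstract does not say which PPS design or estimator the project uses, so the with-replacement design, the function names, and the toy numbers are all assumptions.

```python
import random


def pps_sample(sizes, n, rng):
    """Draw n unit indices with replacement, P(unit i) = sizes[i] / sum(sizes)."""
    if any(s <= 0 for s in sizes):
        raise ValueError("size measures must be positive")
    return rng.choices(range(len(sizes)), weights=sizes, k=n)


def hansen_hurwitz_total(sample_idx, y, sizes):
    """Estimate the population total of y from a with-replacement PPS sample."""
    total_size = sum(sizes)
    # Each draw contributes y_i / p_i, where p_i is its selection probability.
    contributions = [y[i] / (sizes[i] / total_size) for i in sample_idx]
    return sum(contributions) / len(contributions)


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed for reproducibility
    sizes = [10, 20, 70]             # hypothetical size measures
    y = [1.0, 2.0, 7.0]              # hypothetical values, proportional to size
    idx = pps_sample(sizes, 5, rng)
    print(idx, hansen_hurwitz_total(idx, y, sizes))
```

When the study variable is exactly proportional to the size measure, as in the toy data, every draw yields the same ratio, so the estimate equals the true total whichever units are selected. This is the intuition behind choosing size measures that track the quantity of interest.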
Denoising Tandem Mass Spectrometry Data, 2017 East Tennessee State University
Denoising Tandem Mass Spectrometry Data, Felix Offei
Electronic Theses and Dissertations
Protein identification using tandem mass spectrometry (MS/MS) has proven to be an effective way to identify proteins in a biological sample. An observed spectrum is constructed from the data produced by the tandem mass spectrometer. A protein can be identified if the observed spectrum aligns with the theoretical spectrum. However, data generated by the tandem mass spectrometer are affected by errors, which makes protein identification challenging in the field of proteomics. These errors include incorrect calibration of the instrument, instrument distortion and noise. In this thesis, we present a pre-processing method, which focuses on the removal of ...
On Post-Selection Confidence Intervals In Linear Regression, 2017 Washington University in St. Louis
On Post-Selection Confidence Intervals In Linear Regression, Xinwei Zhang
Arts & Sciences Electronic Theses and Dissertations
The general goal of this thesis is to investigate some issues in post-selection inference, which arises in the setting where statistical inference is carried out after a data-driven model selection step. In this setting, the classical inference theory, which requires a model fixed a priori, becomes invalid since the selected model is the result of a random event. Hence, the common practice in applied research of ignoring the model selection step when building confidence intervals will result in misleading or even false conclusions. In this thesis, specifically, we first discuss some examples to show how the classical inference theory loses ...
Comparision Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers, Richard Cutler Dr.
All Graduate Plan B and other Reports
Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves that are of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available ...
Error Costs, Legal Standards Of Proof And Statistical Significance, 2017 Charles River Associates (CRA) International
Error Costs, Legal Standards Of Proof And Statistical Significance, Michelle Burtis, Jonah B. Gelbach, Bruce H. Kobayashi
Faculty Scholarship at Penn Law
The relationship between legal standards of proof and thresholds of statistical significance is a well-known and studied phenomenon in the academic literature. Moreover, the distinction between the two has been recognized in law. For example, in Matrixx v. Siracusano, the Court unanimously rejected the petitioner’s argument that the issue of materiality in a securities class action can be defined by the presence or absence of a statistically significant effect. However, in other contexts, thresholds based on fixed significance levels imported from academic settings continue to be used as a legal standard of proof. Our positive analysis demonstrates how a ...