Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 23 of 23

Full-Text Articles in Physical Sciences and Mathematics

Nonparametric Derivative Estimation Using Penalized Splines: Theory And Application, Bright Antwi Boasiako Nov 2023

Nonparametric Derivative Estimation Using Penalized Splines: Theory And Application, Bright Antwi Boasiako

Doctoral Dissertations

This dissertation is in the field of Nonparametric Derivative Estimation using
Penalized Splines. It is conducted in two parts. In the first part, we study the L2
convergence rates of estimating derivatives of mean regression functions using penalized splines. In 1982, Stone provided the optimal rates of convergence for estimating derivatives of mean regression functions using nonparametric methods. Using these rates, Zhou et. al. in their 2000 paper showed that the MSE of derivative estimators based on regression splines approach zero at the optimal rate of convergence. Also, in 2019, Xiao showed that, under some general conditions, penalized spline estimators …


Inverse Probability Weighting In Survival Analysis And Network Analysis, Yukun Lu Feb 2023

Inverse Probability Weighting In Survival Analysis And Network Analysis, Yukun Lu

Doctoral Dissertations

Inverse probability weighting is a popular technique to accommodate selection bias due to non-random sampling and missing data. In the first chapter, we develop an inverse probability weighted estimator and an augmented inverse probability weighted estimator of regression coefficients for a linear model with randomly censored covariates, when the censoring mechanism may be dependent on the outcome. We investigate the asymptotic properties of both estimators and evaluate their finite sample performance through extensive simulation studies. We apply the proposed methods to an Alzheimer’s disease study. In the second chapter, we present an application of network analysis in a study of …


Methods To Improve Inference From Dependent Network Data, Dongah Kim Feb 2022

Methods To Improve Inference From Dependent Network Data, Dongah Kim

Doctoral Dissertations

Over the past decade, network research has increased dramatically. Network data are used in many fields because they contain not only covariates of each observation, but also `relationships' between observations. Therefore, statistical analysis of network data has been rapidly developed. However, network data presents many challenges, such as collecting network data, inferring the prevalence of an outcome of interest, and valid statistical testing typically with highly dependent data. The methods discussed in this thesis are developed to improve statistical inference from dependent network data.


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan Oct 2021

Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal …


Using Generalizability And Rasch Measurement Theory To Ensure Rigorous Measurement In An International Development Education Evaluation, Louise Bahry Oct 2021

Using Generalizability And Rasch Measurement Theory To Ensure Rigorous Measurement In An International Development Education Evaluation, Louise Bahry

Doctoral Dissertations

Between the United States and Great Britain, over 30 billion USD was spent in 2018 on international aid, over a billion of which is dedicated to education programs alone. Recently, there has been increased attention on the rigorous evaluation of aid-funded programs, moving beyond counting outputs to the measurement of educational impact. The current study uses two methodological approaches (Generalizability (Brennan, 1992, 2001) and Rasch Measurement Theory (Andrich, 1978; Rasch, 1980; Wright & Masters, 1982) to analyze data from math and literacy assessments, and self-report surveys used in an international evaluation of an educational initiative in the Democratic Republic of …


Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang Jul 2021

Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang

Doctoral Dissertations

In the process of statistical modeling, the descriptive modeling plays an essential role in accelerating the formulation of plausible hypotheses in the subsequent explanatory modeling and facilitating the selection of potential variables in the subsequent predictive modeling. Especially, for multivariate categorical data analysis, it is desirable to use the descriptive modeling methods for uncovering and summarizing the potential association structure among multiple categorical variables in a compact manner. However, many classical methods in this case either rely on strong assumptions for parametric models or become infeasible when the data dimension is higher. To this end, we propose a model-free method …


Bayesian Topological Machine Learning, Christopher A. Oballe Aug 2020

Bayesian Topological Machine Learning, Christopher A. Oballe

Doctoral Dissertations

Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …


Latent Class Models For At-Risk Populations, Shuaimin Kang Jul 2020

Latent Class Models For At-Risk Populations, Shuaimin Kang

Doctoral Dissertations

Clustering Network Tree Data From Respondent-Driven Sampling With Application to Opioid Users in New York City There is great interest in finding meaningful subgroups of attributed network data. There are many available methods for clustering complete network. Unfortunately, much network data is collected through sampling, and therefore incomplete. Respondent-driven sampling (RDS) is a widely used method for sampling hard-to-reach human populations based on tracing links in the underlying unobserved social network. The resulting data therefore have tree structure representing a sub-sample of the network, along with many nodal attributes. In this paper, we introduce an approach to adjust mixture models …


Joint Asymptotics For Smoothing Spline Semiparametric Nonlinear Models, Jiahui Yu Oct 2019

Joint Asymptotics For Smoothing Spline Semiparametric Nonlinear Models, Jiahui Yu

Doctoral Dissertations

We study the joint asymptotics of general smoothing spline semiparametric models in the settings of density estimation and regression. We provide a systematic framework which incorporates many existing models as special cases, and further allows for nonlinear relationships between the finite-dimensional Euclidean parameter and the infinite-dimensional functional parameter. For both density estimation and regression, we establish the local existence and uniqueness of the penalized likelihood estimators for our proposed models. In the density estimation setting, we prove joint consistency and obtain the rates of convergence of the joint estimator in an appropriate norm. The convergence rate of the parametric component …


Allocative Poisson Factorization For Computational Social Science, Aaron Schein Jul 2019

Allocative Poisson Factorization For Computational Social Science, Aaron Schein

Doctoral Dissertations

Social science data often comes in the form of high-dimensional discrete data such as categorical survey responses, social interaction records, or text. These data sets exhibit high degrees of sparsity, missingness, overdispersion, and burstiness, all of which present challenges to traditional statistical modeling techniques. The framework of Poisson factorization (PF) has emerged in recent years as a natural way to model high-dimensional discrete data sets. This framework assumes that each observed count in a data set is a Poisson random variable $y ~ Pois(\mu)$ whose rate parameter $\mu$ is a function of shared model parameters. This thesis examines a specific …


Variational Approximations For Density Deconvolution, Yue Chang Nov 2018

Variational Approximations For Density Deconvolution, Yue Chang

Doctoral Dissertations

This thesis considers the problem of density estimation when the variables of interest are subject to measurement error. The measurement error is assumed to be additive and homoscedastic. We specify the density of interest by a Dirichlet Process Mixture Model and establish variational approximation approaches to the density deconvolution problem. Gaussian and Laplacian error distributions are considered, which are representatives of supersmooth and ordinary smooth distributions, respectively. We develop two variational approximation algorithms for Gaussian error deconvolution and one variational approximation algorithm for Laplacian error deconvolution. Their performances are compared to deconvoluting kernels and Monte Carlo Markov Chain method by …


Statistical Computational Topology And Geometry For Understanding Data, Joshua Lee Mike Aug 2017

Statistical Computational Topology And Geometry For Understanding Data, Joshua Lee Mike

Doctoral Dissertations

Here we describe three projects involving data analysis which focus on engaging statistics with the geometry and/or topology of the data.

The first project involves the development and implementation of kernel density estimation for persistence diagrams. These kernel densities consider neighborhoods for every feature in the center diagram and gives to each feature an independent, orthogonal direction. The creation of kernel densities in this realm yields a (previously unavailable) full characterization of the (random) geometry of a dataspace or data distribution.

In the second project, cohomology is used to guide a search for kidney exchange cycles within a kidney paired …


Statistical Methods On Risk Management Of Extreme Events, Zijing Zhang Jul 2017

Statistical Methods On Risk Management Of Extreme Events, Zijing Zhang

Doctoral Dissertations

The goal of the dissertation is the investigation of financial risk analysis methodologies, using the schemes for extreme value modeling as well as techniques from copula modeling. Extreme value theory is concerned with probabilistic and statistical questions re- lated to unusual behavior or rare events. The subject has a rich mathematical theory and also a long tradition of applications in a variety of areas. We are interested in its application in risk management, with a focus on estimating and forcasting the Value-at-Risk of financial time series data. Extremal data are inherently scarce, thus making inference challenging. In order to obtain …


Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu Jul 2017

Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu

Doctoral Dissertations

In this thesis, we propose statistical models for addressing commonly encountered data types and study designs in large epidemiologic investigations aimed at understanding the molecular basis of complex disorders. The motivating applications come from diverse disease areas in Women's Health, including the study of type II diabetes in the Women's Health Initiative (WHI), invasive breast cancer in the Nurses' Health Study and the study of the metabolomic underpinnings of cardiovascular disease in the WHI. We have also put significant effort into making the implementation of the proposed methods accessible through freely available, user-friendly software packages in R. The first chapter …


Inference From Network Data In Hard-To-Reach Populations, Isabelle Beaudry Mar 2017

Inference From Network Data In Hard-To-Reach Populations, Isabelle Beaudry

Doctoral Dissertations

The objective of this thesis is to develop methods to make inference about the prevalence of an outcome of interest in hard-to-reach populations. The proposed methods address issues specific to the survey strategies employed to access those populations. One of the common sampling methodology used in this context is respondent-driven sampling (RDS). Under RDS, the network connecting members of the target population is used to uncover the hidden members. Specialized techniques are then used to make inference from the data collected in this fashion. Our first objective is to correct traditional RDS prevalence estimators and their associated uncertainty estimators for …


Intrinsic Functions For Securing Cmos Computation: Variability, Modeling And Noise Sensitivity, Xiaolin Xu Nov 2016

Intrinsic Functions For Securing Cmos Computation: Variability, Modeling And Noise Sensitivity, Xiaolin Xu

Doctoral Dissertations

A basic premise behind modern secure computation is the demand for lightweight cryptographic primitives, like identifier or key generator. From a circuit perspective, the development of cryptographic modules has also been driven by the aggressive scalability of complementary metal-oxide-semiconductor (CMOS) technology. While advancing into nano-meter regime, one significant characteristic of today's CMOS design is the random nature of process variability, which limits the nominal circuit design. With the continuous scaling of CMOS technology, instead of mitigating the physical variability, leveraging such properties becomes a promising way. One of the famous products adhering to this double-edged sword philosophy is the Physically …


Identifying Examinees Who Possess Distinct And Reliable Subscores When Added Value Is Lacking For The Total Sample, Joseph A. Rios Nov 2016

Identifying Examinees Who Possess Distinct And Reliable Subscores When Added Value Is Lacking For The Total Sample, Joseph A. Rios

Doctoral Dissertations

Research has demonstrated that although subdomain information may provide no added value beyond the total score, in some contexts such information is of utility to particular demographic subgroups (Sinharay & Haberman, 2014). However, it is argued that the utility of reporting subscores for an individual should not be based on one’s manifest characteristics (e.g., gender or ethnicity), but rather on individual needs for diagnostic information, which is driven by multidimensionality in subdomain scores. To improve the validity of diagnostic information, this study proposed the use of Mahalanobis Distance and HT indices to assess whether an individual’s data significantly departs …


Variable Selection In Single Index Varying Coefficient Models With Lasso, Peng Wang Nov 2015

Variable Selection In Single Index Varying Coefficient Models With Lasso, Peng Wang

Doctoral Dissertations

Single index varying coefficient model is a very attractive statistical model due to its ability to reduce dimensions and easy-of-interpretation. There are many theoretical studies and practical applications with it, but typically without features of variable selection, and no public software is available for solving it. Here we propose a new algorithm to fit the single index varying coefficient model, and to carry variable selection in the index part with LASSO. The core idea is a two-step scheme which alternates between estimating coefficient functions and selecting-and-estimating the single index. Both in simulation and in application to a Geoscience dataset, we …


Threat Analysis, Countermeaures And Design Strategies For Secure Computation In Nanometer Cmos Regime, Raghavan Kumar Nov 2015

Threat Analysis, Countermeaures And Design Strategies For Secure Computation In Nanometer Cmos Regime, Raghavan Kumar

Doctoral Dissertations

Advancements in CMOS technologies have led to an era of Internet Of Things (IOT), where the devices have the ability to communicate with each other apart from their computational power. As more and more sensitive data is processed by embedded devices, the trend towards lightweight and efficient cryptographic primitives has gained significant momentum. Achieving a perfect security in silicon is extremely difficult, as the traditional cryptographic implementations are vulnerable to various active and passive attacks. There is also a threat in the form of "hardware Trojans" inserted into the supply chain by the untrusted third-party manufacturers for economic incentives. Apart …


Evaluating The Effects Of Standardized Patient Care Pathways On Clinical Outcomes, Anna V. Romanova Aug 2015

Evaluating The Effects Of Standardized Patient Care Pathways On Clinical Outcomes, Anna V. Romanova

Doctoral Dissertations

The main focus of this study is to create a standardized approach to evaluating the impact of the patient care pathways across all major disease categories and key outcome measures in a hospital setting when randomized clinical trials are not feasible. Toward this goal I identify statistical methods, control factors, and adjustments that can correct for potential confounding in observational studies. I investigate the efficiency of existing bias correction methods under varying conditions of imbalanced samples through a Monte Carlo simulation. The simulation results are then utilized in a case study for one of the largest primary diagnosis areas, chronic …


If And How Many 'Races'? The Application Of Mixture Modeling To World-Wide Human Craniometric Variation, Bridget Frances Beatrice Algee-Hewitt Dec 2011

If And How Many 'Races'? The Application Of Mixture Modeling To World-Wide Human Craniometric Variation, Bridget Frances Beatrice Algee-Hewitt

Doctoral Dissertations

Studies in human cranial variation are extensive and widely discussed. While skeletal biologists continue to focus on questions of biological distance and population history, group-specific knowledge is being increasingly used for human identification in medico-legal contexts. The importance of this research has been often overshadowed by both philosophic and methodological concerns. Many analyses have been constrained in their scope by the limited availability of representative samples and readily criticized for adopting statistical techniques that require user-guidance and a priori information. A multi-part project is presented here that implements model-based clustering as an alternative approach for population studies using craniometric traits. …


Mixture Of Factor Analyzers With Information Criteria And The Genetic Algorithm, Esra Turan Aug 2010

Mixture Of Factor Analyzers With Information Criteria And The Genetic Algorithm, Esra Turan

Doctoral Dissertations

In this dissertation, we have developed and combined several statistical techniques in Bayesian factor analysis (BAYFA) and mixture of factor analyzers (MFA) to overcome the shortcoming of these existing methods. Information Criteria are brought into the context of the BAYFA model as a decision rule for choosing the number of factors m along with the Press and Shigemasu method, Gibbs Sampling and Iterated Conditional Modes deterministic optimization. Because of sensitivity of BAYFA on the prior information of the factor pattern structure, the prior factor pattern structure is learned directly from the given sample observations data adaptively using Sparse Root algorithm. …


A New Screening Methodology For Mixture Experiments, Maria Weese May 2010

A New Screening Methodology For Mixture Experiments, Maria Weese

Doctoral Dissertations

Many materials we use in daily life are comprised of a mixture; plastics, gasoline, food, medicine, etc. Mixture experiments, where factors are proportions of components and the response depends only on the relative proportions of the components, are an integral part of product development and improvement. However, when the number of components is large and there are complex constraints, experimentation can be a daunting task. We study screening methods in a mixture setting using the framework of the Cox mixture model [1]. We exploit the easy interpretation of the parameters in the Cox mixture model and develop methods for screening …