Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 29 of 29

Full-Text Articles in Entire DC Network

Mixture Models In Machine Learning, Soumyabrata Pal Mar 2022

Doctoral Dissertations

Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees of mixture models remain unknown for several important settings. In this thesis, we look at three groups of problems. The first part …


Machine Learning With Topological Data Analysis, Ephraim Robert Love May 2021

Doctoral Dissertations

Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods that exploit the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher-dimensional features than is immediately apparent in the data. A common tool for conducting TDA is persistent homology, which measures the significance of these invariants. Persistent homology has prominent realizations in methods of data visualization, statistics and machine learning. Extending ML with …
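
For readers new to persistence, the 0-dimensional case has a simple description: as a scale parameter grows, each point starts as its own connected component (born at scale 0), and a component dies when it merges with an older one. The death scales can be computed with a Kruskal-style union-find pass over pairwise distances. This is an illustrative sketch, not code from the dissertation:

```python
from itertools import combinations

def zero_dim_persistence(points, dist):
    """Death scales of 0-dimensional features (connected components) in a
    growing-scale filtration, assuming an edge appears once the scale
    parameter reaches the pairwise distance. Every merge of two
    components records one death; all components are born at scale 0."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)  # one component dies at this scale
    return deaths  # one surviving component never dies
```

On two tight clusters, this yields small death scales within each cluster and one large death scale where the clusters merge, which is exactly the noise-robust "shape" signal the abstract refers to.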


Improving The Data Quality In Gravitational-Wave Detectors By Mitigating Transient Noise Artifacts, Kentaro Mogushi Jan 2021

Doctoral Dissertations

The existence of gravitational waves (GWs), small perturbations in spacetime produced by accelerating massive objects, was first predicted in 1916 as solutions of Einstein's Theory of General Relativity (Einstein, 1916). Detecting and analyzing GWs allows us to probe the astrophysical phenomena that produce them.

The era of GW astronomy began with the first direct detection of a binary black hole coalescence in 2015 by the collaboration of the advanced Laser Interferometer Gravitational-wave Observatory (LIGO) (Aasi et al., 2015) and advanced Virgo (Abbott et al., 2016a). Since 2015, LIGO-Virgo has detected about 50 confident transient GW events (Abbott et …


Bayesian Topological Machine Learning, Christopher A. Oballe Aug 2020

Doctoral Dissertations

Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) how to use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …


Ecology And Conservation Of Immature Sea Turtles Across Multiple Scales, Lucas Griffin Oct 2019

Doctoral Dissertations

Considering many sea turtle populations are a fraction of their historic size and anthropogenic threats within the marine environment are increasing, additional data are imperative to help mitigate anthropogenic disturbances and to build resilience into sea turtle populations. In this dissertation, I present three data chapters focused on immature sea turtle ecology and conservation. These chapters evaluate sea turtle ecology and conservation at varying scales, ranging from mitigating human-wildlife interactions at the individual level, to coastal movements and space use at the ecosystem level, and to large scale climate change impacts at the population level. Ultimately, these chapters provide a …


Statistical Models For Single Molecule Localization Microscopy, Ahmed Elmokadem Sep 2017

Doctoral Dissertations

Single-molecule localization microscopy (SMLM) has revolutionized the field of cell biology. It allowed scientists to break the Abbe diffraction limit for fluorescence microscopy and brought its resolution closer to that of electron microscopy, but it still faces some serious challenges. Two of the most important are sample drift and measurement noise, both of which result in lower-resolution images. The two problems are generally unavoidable: sample drift is a natural mechanical phenomenon that occurs during the long image-acquisition times required for SMLM (Geisler et al. 2012), while the measurement noise, which arises from …


Development And Application Of Advanced Econometric Models For Exploring Activity-Travel Behavior, Annesha Enam Aug 2017

Doctoral Dissertations

Historically, transportation planning relied on aggregate, trip-based procedures, namely four-step modeling, for modeling travel demand. The aggregate approaches served well when capacity-oriented policies were of primary interest. However, in the last few decades, with the growing demand for travel and its increasing externalities (e.g., congestion, energy implications, pollution), there is widespread acknowledgement that a capacity-oriented approach to transportation planning is unsustainable. Instead, the focus of transportation planners has shifted toward sustainable demand-management strategies, wherein the idea is to alter existing behaviors and promote new behaviors such that demand for travel can be met while also …


Data Analysis Methods Using Persistence Diagrams, Andrew Marchese Aug 2017

Doctoral Dissertations

In recent years, persistent homology techniques have been used to study data and dynamical systems. Using these techniques, information about the shape and geometry of the data and systems leads to important information regarding the periodicity, bistability, and chaos of the underlying systems. In this thesis, we study all aspects of the application of persistent homology to data analysis. In particular, we introduce a new distance on the space of persistence diagrams, and show that it is useful in detecting changes in geometry and topology, which is essential for the supervised learning problem. Moreover, we introduce a clustering framework directly …


Multistage Sampling Strategies And Inference In Health Studies Under Appropriate Linex Loss Functions, Sudeep R. Bapat Jul 2017

Doctoral Dissertations

A sequential sampling methodology provides concrete results and proves beneficial in many scenarios where a fixed-sample technique fails to deliver. This dissertation introduces several multistage sampling methodologies to estimate the unknown parameters, depending on the model at hand. We construct both two-stage and purely sequential sampling rules under different situations. The estimation is carried out under a loss function, which in our case is either the usual squared error loss or a Linex loss. We adopt a technique known as the bounded risk estimation strategy, where we bound the appropriate risk function from above by a fixed and known …


On The Quantification Of Complexity And Diversity From Phenotypes To Ecosystems, Zachary Harrison Marion Dec 2016

Doctoral Dissertations

A cornerstone of ecology and evolution is comparing and explaining the complexity of natural systems, be they genomes, phenotypes, communities, or entire ecosystems. These comparisons and explanations then beget questions about how complexity should be quantified in theory and estimated in practice. Here I embrace diversity partitioning using Hill or effective numbers to advance the empirical side of the field in quantifying biological complexity.

First, at the level of phenotypes, I show that traditional multivariate analyses ignore individual complexity and provide relatively abstract representations of variation among individuals. I then suggest using well-known diversity indices from community ecology …
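
The Hill (effective) numbers invoked above form a one-parameter family: the order q controls how strongly abundant types are weighted, and the result is always an "effective number of equally common types". A minimal sketch of the standard formula (illustrative, not the dissertation's code):

```python
import math

def hill_number(counts, q):
    """Hill number of order q from raw abundance counts.
    q = 0: species richness; q -> 1: exp(Shannon entropy);
    q = 2: inverse Simpson concentration."""
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if q == 1:  # limiting case: exponential of Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))
```

For a perfectly even community the Hill number equals the species count at every order, while skewed abundances pull the higher-order values down toward the number of dominant types.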


Geography Of Health Care Access: Measurement, Analyses And Integration, Huairen Ye May 2016

Doctoral Dissertations

This dissertation addresses the geography of healthcare access and disparity issues in the United States using geospatial methods. Disparities in access to quality healthcare services are of great concern in the fields of both public health and geography. Access is a key element within the healthcare delivery system, influenced by both spatial and non-spatial factors. Focusing on the spatial dimensions of access, an innovative contribution of this dissertation is the integration of spatial modeling, geo-statistics and location problems in a Geographic Information System (GIS) environment to investigate healthcare access.

Improving health access begins with developing reliable methods to measure …


Computationally Efficient Specifications Of Spatial Point Process Models And Spatio-Temporal Gaussian Models: Combining Remote Sensing Drivers With Geospatial Disease Case Data To Enhance Geographic Epidemiology, Beth Louise Ziniti Jan 2016

Doctoral Dissertations

In this dissertation, the flexibility of Bayesian hierarchical models specified using a latent Gaussian Markov Random Field (GMRF) is evaluated for use in analyzing large, complex spatial and spatio-temporal data, with the goal of contributing to an interdisciplinary effort to develop an eco-epidemiological model that quantifies the relationship between remotely sensed water quality and the incidence of ALS (Amyotrophic Lateral Sclerosis, or Lou Gehrig's Disease) over large areas such as Northern New England (NNE).

In particular, a Log-Gaussian Cox Process (LGCP) specified by the logarithm of a GMRF on a regular lattice is shown to allow for simultaneous estimation of …


Social Fingerprinting: Identifying Users Of Social Networks By Their Data Footprint, Denise Koessler Gosnell Dec 2014

Doctoral Dissertations

This research defines, models, and quantifies a new metric for social networks: the social fingerprint. Just as one's fingers leave behind a unique trace in a print, this dissertation introduces and demonstrates that the manner in which people interact with other accounts on social networks creates a unique data trail. Accurate identification of a user's social fingerprint can address the growing demand for improved techniques in unique user account analysis, computational forensics and social network analysis.

In this dissertation, we theorize, construct and test novel software and methodologies which quantify features of social network data. All approaches and methodologies are …


The Application Of Information Integration Theory To Standard Setting: Setting Cut Scores Using Cognitive Theory, Christopher C. Foster Apr 2014

Doctoral Dissertations

Information integration theory (IIT), proposed by the cognitive psychologist Norman H. Anderson, is primarily concerned with understanding rater judgments and deriving quantitative values from rater expertise; in particular, it studies how an individual integrates information from two or more stimuli to derive a quantitative value. Since standard setting is a process by which subject matter experts are asked to make expert judgments about test content, it is an ideal context for the application of IIT. The theory focuses on evaluating the …


Extreme Value Theory: Applications To Estimation Of Stochastic Traffic Capacity And Statistical Downscaling Of Precipitation Extremes, Eric Matthew Laflamme Jan 2013

Doctoral Dissertations

This work explores two applications of extreme value (EV) analysis. First, we apply EV techniques to traffic stream data to develop an accurate distribution of capacity. Data were collected by the NHDOT along Interstate 93, and two adjacent locations in Salem, NH were examined. Daily flow maxima were used to estimate capacity, and data not associated with daily breakdown were deemed censored values. Under this definition, capacity values are approximated by the generalized extreme value (GEV) distribution for block maxima. To address small sample sizes and the presence of censoring, a Bayesian framework using semi-informative priors was implemented. A simple cross …
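
For reference, the GEV distribution for block maxima mentioned here has a closed-form CDF, and quantiles invert to "return levels" (the value exceeded on average once every T blocks). The sketch below is illustrative only; the parameter values in the comments are arbitrary and not taken from the NHDOT study:

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """CDF of the generalized extreme value distribution for block maxima
    with location mu, scale sigma, and shape xi."""
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1 + xi * (x - mu) / sigma
    if t <= 0:  # outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1 / xi))

def gev_return_level(T, mu, sigma, xi):
    """Level exceeded on average once every T blocks (quantile at 1 - 1/T)."""
    p = 1 - 1 / T
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1)
```

By construction the two functions invert each other: the CDF evaluated at the T-block return level is exactly 1 - 1/T.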


On Wavelet-Based Testing For Serial Correlation Of Unknown Form Using Fan's Adaptive Neyman Method, Shan Yao Jan 2012

Doctoral Dissertations

Test procedures for serial correlation of unknown form with wavelet methods are investigated in this dissertation. The new wavelet-based consistent test is motivated using Fan's (1996) canonical multivariate normal hypothesis testing model. In our framework, the test statistic relies on empirical wavelet coefficients of a wavelet-based spectral density estimator. We advocate the choice of the simple Haar wavelet function, since evidence demonstrates that the choice of the wavelet function is not critical. Under the null hypothesis of no serial correlation, the asymptotic distribution of a vector of empirical wavelet coefficients is derived, which is the multivariate normal distribution in the …
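
The abstract's choice of the Haar wavelet reflects its simplicity: each level of the transform is just scaled pairwise sums (approximation) and differences (detail). An illustrative pure-Python implementation for length-2^k signals (not the dissertation's code):

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: pairwise sums
    (approximation) and pairwise differences (detail), each scaled by
    1/sqrt(2) so total energy is preserved."""
    s = 1 / math.sqrt(2)
    approx = [(signal[2 * i] + signal[2 * i + 1]) * s
              for i in range(len(signal) // 2)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) * s
              for i in range(len(signal) // 2)]
    return approx, detail

def haar_transform(signal):
    """Full decomposition of a length-2^k signal: returns the final
    scaling coefficient and the detail coefficients at each level."""
    coeffs = []
    approx = list(signal)
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    return approx[0], coeffs
```

Because the transform is orthonormal, the sum of squared coefficients equals the squared norm of the input, which is what makes coefficient-based test statistics tractable.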


Geographic Disparities Associated With Stroke And Myocardial Infarction In East Tennessee, Ashley Pedigo Golden Dec 2011

Doctoral Dissertations

Stroke and myocardial infarction (MI) are serious conditions whose burdens vary by socio-demographic and geographic factors. Although several studies have investigated and identified disparities in burdens of these conditions at the county and state levels, little is known regarding their geographic epidemiology at the neighborhood level. Both conditions require emergency treatments and therefore timely geographic accessibility to appropriate care is critical. Investigation of disparities in geographic accessibility to stroke and MI care and the role of Emergency Medical Services (EMS) in reducing treatment delays are vital in improving health outcomes. Therefore, the objectives of this work were to: (i) classify …


Energy Functional For Nuclear Masses, Michael Giovanni Bertolli Dec 2011

Doctoral Dissertations

An energy functional is formulated for mass calculations of nuclei across the nuclear chart with major-shell occupations as the relevant degrees of freedom. The functional is based on Hohenberg-Kohn theory. Motivation for its form comes from both phenomenology and relevant microscopic systems, such as the three-level Lipkin Model. A global fit of the 17-parameter functional to nuclear masses yields a root-mean-square deviation of χ = 1.31 MeV, on the order of other mass models. The construction of the energy functional includes the development of a systematic method for selecting and testing possible functional terms. Nuclear radii are computed within …


Models And Methods For Computationally Efficient Analysis Of Large Spatial And Spatio-Temporal Data, Chengwei Yuan Jan 2011

Doctoral Dissertations

With the development of technology, massive amounts of data are often observed at a large number of spatial locations (n). However, statistical analysis is usually not feasible, or not computationally efficient, for such large datasets. This is the so-called "big n problem".

The goal of this dissertation is to contribute solutions to the "big n problem". The dissertation is devoted to computationally efficient methods and models for large spatial and spatio-temporal data. Several approximation methods to "the big n problem" are reviewed, and an extended autoregressive model, called the EAR model, is proposed as a parsimonious model that accounts for …


Wavelet Regression With Long Memory Infinite Moving Average Errors, Juan Liu Jan 2009

Doctoral Dissertations

For more than a decade there has been great interest in wavelets and wavelet-based methods. Among the most successful applications of wavelets is nonparametric statistical estimation, following the pioneering work of Donoho and Johnstone (1994, 1995) and Donoho et al. (1995). In this thesis, we consider the wavelet-based estimators of the mean regression function with long memory infinite moving average errors, and investigate the rates of convergence of estimators based on thresholding of empirical wavelet coefficients. We show that these estimators achieve nearly optimal minimax convergence rates within a logarithmic term over a large class of non-smooth functions that involve …
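
The thresholding of empirical wavelet coefficients referenced here is, in the Donoho-Johnstone tradition, typically soft thresholding with the universal threshold sigma * sqrt(2 log n). A minimal sketch of that standard recipe (illustrative; the thesis's long-memory setting modifies the threshold choice):

```python
import math

def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold sigma * sqrt(2 log n)."""
    return sigma * math.sqrt(2 * math.log(n))

def soft_threshold(coeffs, lam):
    """Soft thresholding: shrink each empirical wavelet coefficient
    toward zero by lam, zeroing any coefficient below the threshold."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]
```

Applied to the detail coefficients of a noisy signal, this kills small (noise-dominated) coefficients and shrinks large ones, which is what yields the near-minimax convergence rates discussed in the abstract.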


Modeling And Simulation Of Value-At-Risk In The Financial Market Area, Xiangyin Zheng Apr 2006

Doctoral Dissertations

Value-at-Risk (VaR) is a statistical approach to measure market risk. It is widely used by banks, securities firms, commodity and energy merchants, and other trading organizations. The main focus of this research is measuring and analyzing market risk by modeling and simulation of Value-at-Risk for portfolios in the financial market area. The objectives are (1) predicting possible future loss for a financial portfolio from VaR measurement, and (2) identifying how the distributions of the risk factors affect the distribution of the portfolio. Results from (1) and (2) provide valuable information for portfolio optimization and risk management.

The model systems chosen …
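
One standard, assumption-light way to compute VaR is historical simulation: read the loss quantile directly off the empirical distribution of observed returns. The sketch below is illustrative only (the quantile convention is one common choice, and the dissertation's model-based approach differs):

```python
import math

def historical_var(returns, alpha=0.99):
    """Historical-simulation Value-at-Risk: the loss (reported as a
    positive number) exceeded with probability at most 1 - alpha,
    taken as the ceil(alpha * n)-th smallest empirical loss."""
    losses = sorted(-r for r in returns)  # losses as positive numbers
    k = max(0, math.ceil(alpha * len(losses)) - 1)
    return losses[k]
```

Model-based VaR, as studied in the dissertation, instead posits distributions for the risk factors and simulates the portfolio, which is what lets one ask how factor distributions shape the loss distribution.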


Contributions To Modeling And Computer Efficient Estimation For Gaussian Space-Time Processes, Veronica Pocsik Hupper Jan 2005

Doctoral Dissertations

This thesis research provides several contributions to computer-efficient methodology for estimation with space-time data. First, we propose a parsimonious class of computer-efficient Gaussian spatial interaction models that includes as special cases CAR and SAR-like models. This extended class is capable of modeling smooth spatial random fields. We show that, for rectangular lattices, this class is equivalent to higher-order Markov random fields. Thus we capture the computational advantage of iterative updating of Markov random fields while at the same time providing the possibility of a simple interpretation of smooth spatial structure.

This class of spatial models is defined via a spatial …


Statistical Properties Of Maximum Likelihood Estimates For Accelerated Lifetime Data Under The Weibull Model, Mahmoud A. Yousef Apr 2001

Doctoral Dissertations

Pipe rehabilitation liners are often installed in host pipes that lie below the water table. As such, they are subjected to external hydrostatic pressure. The external pressure leads to early deformation in the liners, which could ultimately lead to their failing or buckling before the expected service lifetime is achieved. Experiments involving the long-term buckling behavior of liners are typically accelerated lifetime testing procedures. In an accelerated testing procedure, a liner is subjected to a constant external hydrostatic pressure and observed until it fails or for a certain time t, whichever occurs first. Liners that do not fail at time …
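
The design described above, observe each liner until it fails or until time t, is right censoring. Under a Weibull lifetime model, the likelihood combines density terms for observed failures with survival-function terms for censored units. A minimal sketch of that standard likelihood (illustrative, not the dissertation's code):

```python
import math

def weibull_loglik(times, failed, shape, scale):
    """Log-likelihood for right-censored lifetimes under a Weibull model.
    Observed failures contribute log f(t); units still unfailed at their
    observation time contribute log S(t) = -(t/scale)**shape."""
    ll = 0.0
    for t, f in zip(times, failed):
        z = (t / scale) ** shape
        if f:  # observed failure: log density
            ll += (math.log(shape / scale)
                   + (shape - 1) * math.log(t / scale) - z)
        else:  # right-censored: log survival
            ll += -z
    return ll
```

Maximizing this function over (shape, scale), numerically or with a grid search, gives the maximum likelihood estimates whose statistical properties the dissertation studies.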


Dynamic Analysis Of Unevenly Sampled Data With Applications To Statistical Process Control, Laura Ann Mcsweeney Jan 1999

Doctoral Dissertations

Dynamic analysis involves describing how a process changes over time. Applications of this type of analysis can be implemented in industrial settings in order to control manufacturing processes and recognize when they have changed significantly. The primary focus of this work is to construct methods to detect the onset of periodic behavior in a process which is being monitored using a scheme where data is sampled unevenly.

Techniques that can be used to identify statistically significant periodic structure using the periodogram will be reviewed and developed. The statistical properties of the periodogram for unevenly sampled data will be calculated. These …


New Methods For Modeling Accelerated Life Test Data, Michelle Hopkins Capozzoli Jan 1999

Doctoral Dissertations

An accelerated life test (ALT) is often used to obtain timely information for highly reliable items. The increased use of ALTs has resulted in nontraditional reliability data which cannot be analyzed with standard statistical methodologies. I propose new methods for analyzing ALT data for studies with (1) two independent populations, (2) paired samples and (3) limited failure populations (LFP). Here, the Weibull distribution, which can accommodate a variety of failure rates, is assumed for the models I develop. For case (1), a parametric hypothesis test, a Bayesian analysis and a test using partial likelihood are proposed and discussed. For …


Ecological Database Development And Analyses Of Soil Variability In Northern New England, Michael Anayo Okoye Jan 1997

Doctoral Dissertations

The 1983 Forest Inventory and Analysis (FIA) data of the states of Maine, New Hampshire and Vermont (the study area) contain large amounts of field measurements of many ecologically important variables. Despite the vast potential usefulness of the FIA data for scientific research, the data were, until now, largely unused except for a few administrative purposes because of problems in the way the data were organized, summarized, and coded for storage. The primary objective of this research was to solve the problems that had thus precluded these FIA data from use in scientific applications, and to present the data in a form …


The Association Between Arbitrage Pricing Theory Risk Measures And Traditional Accounting Variables, Theophanis Stratopoulous Jan 1994

Doctoral Dissertations

According to the Arbitrage Pricing Theory (APT), actual security returns depend on a variety of pervasive economic and financial risk factors, as well as firm- or industry-specific influences. The sensitivity of an asset's returns to unanticipated changes in the pervasive risk factors reflects the security's measure of systematic risk. In equilibrium, the expected security return is a linear function of the sensitivities of actual security returns to unanticipated changes in the pervasive risk factors.

The APT does not specify the number or the nature of the pervasive risk factors. Factor analysis of stock returns can be used to determine …


Money, Income And Causality: An Open Economy Reexamination, El-Hachemi Aliouche Jan 1992

Doctoral Dissertations

The positive relationship between the rate of growth of the money supply and the rate of growth of aggregate income is a widely accepted principle in macroeconomics. However, the direction of the causality between these two variables has been an enduring subject of controversy.

Recent developments in time series analysis, particularly those relating to the concepts of integration and cointegration, and the stationary nature of economic time series, promise to help settle the debate on the statistical relationship between money supply growth and income growth. Most of the recent work on this issue, however, has been confined to a closed …


Dynamic Probabilistic Systems With Continuous Parameter Markov Chains And Semi-Markov Processes, Christopher Tin Htun Lee Jan 1973

Doctoral Dissertations

No abstract provided.