Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 5 of 5

Full-Text Articles in Applied Statistics

Clustering Mixed Data: An Extension Of The Gower Coefficient With Weighted L2 Distance, Augustine Oppong Aug 2018

Clustering Mixed Data: An Extension Of The Gower Coefficient With Weighted L2 Distance, Augustine Oppong

Electronic Theses and Dissertations

Sorting out data into partitions is increasing becoming complex as the constituents of data is growing outward everyday. Mixed data comprises continuous, categorical, directional functional and other types of variables. Clustering mixed data is based on special dissimilarities of the variables. Some data types may influence the clustering solution. Assigning appropriate weight to the functional data may improve the performance of the clustering algorithm. In this paper we use the extension of the Gower coefficient with judciously chosen weight for the L2 to cluster mixed data.The benefits of weighting are demonstrated both in in applications to the Buoy data set …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor

Electronic Theses and Dissertations

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve a basis for making higher order inferences in metabolomics, …


Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen May 2018

Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen

Electronic Theses and Dissertations

The bootstrap procedure is widely used in nonparametric statistics to generate an empirical sampling distribution from a given sample data set for a statistic of interest. Generally, the results are good for location parameters such as population mean, median, and even for estimating a population correlation. However, the results for a population variance, which is a spread parameter, are not as good due to the resampling nature of the bootstrap method. Bootstrap samples are constructed using sampling with replacement; consequently, groups of observations with zero variance manifest in these samples. As a result, a bootstrap variance estimator will carry a …


Old English Character Recognition Using Neural Networks, Sattajit Sutradhar Jan 2018

Old English Character Recognition Using Neural Networks, Sattajit Sutradhar

Electronic Theses and Dissertations

Character recognition has been capturing the interest of researchers since the beginning of the twentieth century. While the Optical Character Recognition for printed material is very robust and widespread nowadays, the recognition of handwritten materials lags behind. In our digital era more and more historical, handwritten documents are digitized and made available to the general public. However, these digital copies of handwritten materials lack the automatic content recognition feature of their printed materials counterparts. We are proposing a practical, accurate, and computationally efficient method for Old English character recognition from manuscript images. Our method relies on a modern machine learning …


Some New And Generalized Distributions Via Exponentiation, Gamma And Marshall-Olkin Generators With Applications, Hameed Abiodun Jimoh Jan 2018

Some New And Generalized Distributions Via Exponentiation, Gamma And Marshall-Olkin Generators With Applications, Hameed Abiodun Jimoh

Electronic Theses and Dissertations

Three new generalized distributions developed via completing risk, gamma generator, Marshall-Olkin generator and exponentiation techniques are proposed and studied. Structural properties including quantile functions, hazard rate functions, moment, conditional moments, mean deviations, R\'enyi entropy, distribution of order statistics and maximum likelihood estimates are presented. Monte Carlo simulation is employed to examine the performance of the proposed distributions. Applications of the generalized distributions to real lifetime data are presented to illustrate the usefulness of the models.