Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

Multivariate Analysis

Articles 1 - 28 of 28

Full-Text Articles in Applied Statistics

Predicting Superconducting Critical Temperature Using Regression Analysis, Roland Fiagbe Jan 2024

Data Science and Data Mining

This project estimates a regression model to predict the superconducting critical temperature based on variables extracted from the superconductor’s chemical formula. The regression model, combined with stepwise variable selection, yields a reasonably good predictive model with a lower prediction error (MSE). Variables extracted from atomic radius, valence, atomic mass and thermal conductivity appeared to contribute most to the predictive model.


Modeling And Fitting Two-Way Tables Containing Outliers, David L. Farnsworth Feb 2023

Articles

A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free. Neither missing values nor replication is present. The model consists of the sum of a customary additive part that can be fit using least squares and a part that is composed of outliers. Recommendations are made for methods for identifying cells containing outliers and for fitting the model. A graph of the observations is used to determine the outliers’ locations. For all cells containing an outlier, replacement values are determined simultaneously using a classical missing-data tool. The result is …
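
The "customary additive part" mentioned in this abstract can be illustrated with a minimal sketch. It assumes a complete table with one observation per cell and fits y[i][j] ≈ mu + a[i] + b[j] by least squares, which for such a table reduces to the grand mean plus row-mean and column-mean deviations. The function names `fit_additive` and `largest_residual_cell` and the example table are hypothetical. Flagging the largest residual is only an illustration here; the abstract says the article locates outliers from a graph of the observations, and its replacement-value step is not reproduced.

```python
def fit_additive(table):
    """Least-squares additive fit for a complete two-way table (one value per cell).

    Returns the grand mean, row effects, column effects, and residuals.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    # Row and column effects are deviations of the marginal means from the grand mean.
    row_eff = [sum(row) / n_cols - grand for row in table]
    col_eff = [
        sum(table[i][j] for i in range(n_rows)) / n_rows - grand
        for j in range(n_cols)
    ]
    resid = [
        [table[i][j] - (grand + row_eff[i] + col_eff[j]) for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return grand, row_eff, col_eff, resid


def largest_residual_cell(resid):
    """Return (row, column) of the residual with the largest absolute value."""
    cells = [(i, j) for i in range(len(resid)) for j in range(len(resid[0]))]
    return max(cells, key=lambda ij: abs(resid[ij[0]][ij[1]]))


# Hypothetical table: exactly additive except for the last cell (5 inflated to 14).
example = [[1, 2, 3], [2, 3, 4], [3, 4, 14]]
grand, row_eff, col_eff, resid = fit_additive(example)
suspect = largest_residual_cell(resid)
```

In this example the inflated cell carries the largest residual, but part of the disturbance is also spread into the other cells of its row and column. That smearing is one reason a plain least-squares fit can be a poor outlier detector on its own.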


A Bayesian Programming Approach To Car-Following Model Calibration And Validation Using Limited Data, Franklin Abodo Jun 2022

FIU Electronic Theses and Dissertations

Traffic simulation software is used by transportation researchers and engineers to design and evaluate changes to roadway networks. Underlying these simulators are mathematical models of microscopic driver behavior from which macroscopic measures of flow and congestion can be recovered. Many models are intended to apply to only a subset of possible traffic scenarios and roadway configurations, while others do not have any explicit constraint on their applicability. Work zones on highways are one scenario for which no model invented to date has been shown to accurately reproduce realistic driving behavior. This makes it difficult to optimize for safety and other …


Lecture 04: Spatial Statistics Applications Of HRL, TRL, And Mixed Precision, David Keyes Apr 2021

Mathematical Sciences Spring Lecture Series

As simulation and analytics enter the exascale era, numerical algorithms, particularly implicit solvers that couple vast numbers of degrees of freedom, must span a widening gap between ambitious applications and austere architectures to support them. We present fifteen universals for researchers in scalable solvers: imperatives from computer architecture that scalable solvers must respect, strategies towards achieving them that are currently well established, and additional strategies currently being developed for an effective and efficient exascale software ecosystem. We consider recent generalizations of what it means to “solve” a computational problem, which suggest that we have often been “oversolving” them at the …


Optimal Design For A Causal Structure, Zaher Kmail Aug 2019

Department of Statistics: Dissertations, Theses, and Student Work

Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.

Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …


Best Probable Subset: A New Method For Reducing Data Dimensionality In Linear Regression, Elieser Nodarse Apr 2019

FIU Electronic Theses and Dissertations

Regression is a statistical technique for modeling the relationship between a dependent variable Y and two or more predictor variables, also known as regressors. In the broad field of regression, there exists a special case in which the relationship between the dependent variable and the regressor(s) is linear. This is known as linear regression.

The purpose of this paper is to create a useful method that effectively selects a subset of regressors when dealing with high dimensional data and/or collinearity in linear regression. As the name suggests, high dimensional data occurs when the number of predictor variables is far …


Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan Mar 2019

COBRA Preprint Series

One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes which provide insight into the disease's process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …


Nonparametric Depth And Quantile Regression For Functional Data, Joydeep Chowdhury, Probal Chaudhuri Feb 2019

Journal Articles

We investigate nonparametric regression methods based on spatial depth and quantiles when the response and the covariate are both functions. As in classical quantile regression for finite dimensional data, regression techniques developed here provide insight into the influence of the functional covariate on different parts, like the center as well as the tails, of the conditional distribution of the functional response. Depth and quantile based nonparametric regression methods are useful to detect heteroscedasticity in functional regression. We derive the asymptotic behavior of the nonparametric depth and quantile regression estimates, which depend on the small ball probabilities in the covariate space. …


The Dark Sky Character Of Archaeological Landscapes: Cultural Meaning And Conservation Strategies, Frank Prendergast Jan 2019

Book/Book Chapter

This paper presents the first ever study of light pollution at selected Irish prehistoric archaeological landscapes. The concepts of cosmology and landscape are first briefly described and followed by a summary of early human settlement of the island. Building on this, the extant corpus of early prehistoric megalithic burial tombs is illustrated to show their contrasting distribution patterns and typology. Analysis of tomb locations using nearest-neighbour statistical methods reveals evidence of intentional clustering. Further geo-statistical analysis identifies the geographical locations and the density ranking of these nucleated clusters - a feature especially evident in the passage tomb tradition on this …


Pretrial Release And Failure-To-Appear In McLean County, IL, Jonathan Monsma Jul 2018

Stevenson Center for Community and Economic Development—Student Research

Actuarial risk assessment tools increasingly have been employed in jurisdictions across the U.S. to assist courts in the decision of whether someone charged with a crime should be detained or released prior to their trial. These tools should be continually monitored and researched by independent third parties to ensure that they are being administered properly and used in the most proficient way so as to provide socially optimal results. McLean County, Illinois began using the Public Safety Assessment-Court™ (PSA-Court or simply PSA) risk assessment tool in 2016. This study culls data from the McLean County Jail …


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar Mar 2018

FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …


A Preliminary Study Of Smithport Plain Bottle Morphology In The Southern Caddo Area, Robert Z. Selden Jr. Jan 2018

CRHR: Archaeology

This study expands upon a previous analysis of the Clarence H. Webb collection, which resulted in the identification of two discrete shapes used in the manufacture of the base and body of Smithport Plain bottles. The sample includes the Smithport Plain bottles from the Webb collection, and four new bottles: two previously repatriated specimens in the Pohler Collection, and two from the Mitchell site (41BW4) to test whether those specimens align morphologically with the Belcher Mound or Smithport Landing specimens. Results indicate significant allometry and a significant difference in Smithport Plain body and base shapes for bottles produced at the …


Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei Jul 2017

Publications and Research

Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.


An Investigation Of The Accuracy Of Parallel Analysis For Determining The Number Of Factors In A Factor Analysis, Mandy Matsumoto Jun 2017

Mahurin Honors College Capstone Experience/Thesis Projects

Exploratory factor analysis is an analytic technique used to determine the number of factors in a set of data (usually items on a questionnaire) for which the factor structure has not been previously analyzed. Parallel analysis (PA) is a technique used to determine the number of factors in a factor analysis. There are a number of factors that affect the results of a PA: the choice of the eigenvalue percentile, the strength of the factor loadings, the number of variables, and the sample size of the study. Although PA is the most accurate method to date to determine which factors …
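
A common textbook form of parallel analysis can be sketched as follows. It is not necessarily the variant studied in this thesis. Observed correlation-matrix eigenvalues are compared with a chosen percentile of eigenvalues from random normal data of the same shape, and leading factors are retained while the observed value exceeds that threshold. The function name `parallel_analysis`, its defaults (`n_sims=200`, `percentile=95`) and the simulated one-factor data are assumptions made for illustration.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return (number of factors retained, observed eigenvalues, thresholds)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape

    def sorted_eigenvalues(x):
        # Eigenvalues of the correlation matrix, largest first.
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigenvalues(data)
    # Reference distribution: eigenvalues from uncorrelated normal data of the same shape.
    simulated = np.array(
        [sorted_eigenvalues(rng.standard_normal((n, p))) for _ in range(n_sims)]
    )
    thresholds = np.percentile(simulated, percentile, axis=0)

    retained = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:
            break  # stop at the first eigenvalue that does not beat random data
        retained += 1
    return retained, observed, thresholds


# Hypothetical data: 500 cases, 6 variables, all loading 0.8 on a single factor.
rng = np.random.default_rng(1)
factor = rng.standard_normal((500, 1))
data = 0.8 * factor + 0.6 * rng.standard_normal((500, 6))
n_factors, observed, thresholds = parallel_analysis(data)
```

The `percentile` argument corresponds to the "choice of the eigenvalue percentile" that the abstract lists among the influences on PA results. Factor loadings, number of variables, and sample size enter through the data passed in.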


Marketing The Mountain State: A Large N Study Of User Engagement On Twitter, Kirk Richardson Jun 2017

Capstone Projects – Politics and Government

Much of the evolving research on the use of social media in destination marketing emphasizes how information diffusion influences the reputational image of place. The present study uses Twitter data to focus on the relative differences in user engagement across discrete account types. Specifically, this is done to examine how the official destination marketing organization of Montana—the Montana Office of Tourism (MTOT)—performs relative to other account types. Several regression analyses conducted on Twitter data associated with an ongoing MTOT place branding campaign reveal that tweets sent from ‘official’ accounts are more likely to be retweeted, and are estimated to receive …


Studying The Optimal Scheduling For Controlling Prostate Cancer Under Intermittent Androgen Suppression, Sunil K. Dhar, Hans R. Chaudhry, Bruce G. Bukiet, Zhiming Ji, Nan Gao, Thomas W. Findley Jan 2017

Harvard University Biostatistics Working Paper Series

This retrospective study shows that the majority of patients’ correlations between PSA and Testosterone during the on-treatment period is at least 0.90. Model-based duration calculations to control PSA levels during off-treatment are provided. There are two pairs of models. In one pair, the Generalized Linear Model and Mixed Model are both used to analyze the variability of PSA at the individual patient level by using the variable “Patient ID” as a repeated measure. In the second pair, Patient ID is not used as a repeated measure but additional baseline variables are included to analyze the variability of PSA.


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Spatiotemporal Meta-Analysis: Reviewing Health Psychology Phenomena Over Space And Time, Blair T. Johnson Jan 2016

CHIP Documents

This supplemental material is meant to support this article:

Johnson, B. T., Crowley, E., & Marrouch, N. Spatiotemporal meta-analysis: Reviewing health psychology phenomena over space and time. Health Psychology Review.

Specifically, it is a database of GDPs per capita for nations in the world between 1800 and 2015. It is archived here to support an online supplement to this article.

GDP per capita


GIS-Integrated Mathematical Modeling Of Social Phenomena At Macro- And Micro- Levels—A Multivariate Geographically-Weighted Regression Model For Identifying Locations Vulnerable To Hosting Terrorist Safe-Houses: France As Case Study, Elyktra Eisman Nov 2015

FIU Electronic Theses and Dissertations

Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat—it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to …


Instrumental Neutron Activation Analysis (INAA) Of Shell-Tempered Ceramics In The Ancestral Caddo Region: Rethinking Methods, Robert Z. Selden Jr., Timothy K. Perttula Jan 2014

CRHR: Archaeology

The geochemical analysis of shell-tempered ceramics in the ancestral Caddo region has been a matter of confusion since the mid-1990s. While Caddo archaeologists have long perceived most or all of the shell-tempered ceramics in East Texas to have originated from two different areas within the Red River basin, the geochemical data and interpretations remain inconsistent with that idea. This poster takes another look at this dataset, and considers an approach that was initially put forth by MURR, and then seemingly abandoned. Using only the geochemical data from shell-tempered sherds, we take a closer look at the contributions of calcium (Ca), …


Advances In Documentation, Digital Curation, Virtual Exhibition, And A Test Of 3D Geometric Morphometrics: A Case Study Of The Vanderpool Vessels From The Ancestral Caddo Territory, Robert Z. Selden Jr., Timothy K. Perttula, Michael J. O'Brien Jan 2014

CRHR: Archaeology

Three-dimensional (3D) digital scanning of archaeological materials is typically used as a tool for artifact documentation. With the permission of the Caddo Nation of Oklahoma, 3D documentation of Caddo funerary vessels from the Vanderpool site (41SM77) was conducted with the initial goal of ensuring that these data would be publicly available for future research long after the vessels were repatriated. A digital infrastructure was created to archive and disseminate the resultant 3D datasets, ensuring that they would be accessible by both researchers and the general public (CRHR 2014a). However, 3D imagery can be used for much more than documentation. To …


A Bayesian Regression Tree Approach To Identify The Effect Of Nanoparticles Properties On Toxicity Profiles, Cecile Low-Kam, Haiyuan Zhang, Zhaoxia Ji, Tian Xia, Jeffrey I. Zinc, Andre Nel, Donatello Telesca Mar 2013

COBRA Preprint Series

We introduce a Bayesian multiple regression tree model to characterize relationships between physico-chemical properties of nanoparticles and their in-vitro toxicity over multiple doses and times of exposure. Unlike conventional models that rely on data summaries, our model solves the low sample size issue and avoids arbitrary loss of information by combining all measurements from a general exposure experiment across doses, times of exposure, and replicates. The proposed technique integrates Bayesian trees for modeling threshold effects and interactions, and penalized B-splines for dose and time-response surfaces smoothing. The resulting posterior distribution is sampled via a Markov Chain Monte Carlo algorithm. This …


Epistemology And Synthesis: Instrumental Neutron Activation Analysis And The Caddo Tradition, Robert Z. Selden Jr. Jan 2013

CRHR: Archaeology

The statistical groupings illustrated herein represent the current iteration of Caddo INAA compositional groups based upon the chemical composition of archaeologically-recovered ceramics. For some time, a number of Caddo archaeologists have thought these results to be lacking. This poster symbolizes the first step toward a new interpretation of chemical composition groups, and the initial instance within which GIS has been employed as an analytical tool.


Differential Patterns Of Interaction And Gaussian Graphical Models, Masanao Yajima, Donatello Telesca, Yuan Ji, Peter Muller Apr 2012

COBRA Preprint Series

We propose a methodological framework to assess heterogeneous patterns of association amongst components of a random vector expressed as a Gaussian directed acyclic graph. The proposed framework is likely to be useful when primary interest focuses on potential contrasts characterizing the association structure between known subgroups of a given sample. We provide inferential frameworks as well as an efficient computational algorithm to fit such a model and illustrate its validity through a simulation. We apply the model to Reverse Phase Protein Array data on Acute Myeloid Leukemia patients to show the contrast of association structure between refractory patients and relapsed …


Spatial And Temporal Correlations Of Freeway Link Speeds: An Empirical Study, Piotr J. Rachtan Jan 2012

Masters Theses 1911 - February 2014

Congestion on roadways and high level of uncertainty of traffic conditions are major considerations for trip planning. The purpose of this research is to investigate the characteristics and patterns of spatial and temporal correlations and also to detect other variables that affect correlation in a freeway setting. 5-minute speed aggregates from the Performance Measurement System (PeMS) database are obtained for two directions of an urban freeway – I-10 between Santa Monica and Los Angeles, California. Observations are for all non-holiday weekdays between January 1st and June 30th, 2010. Other variables include traffic flow, ramp locations, number of lanes and the …


Modeling Regional Radiocarbon Trends: A Case Study From The East Texas Woodland Period, Robert Z. Selden Jr. Jan 2012

CRHR: Archaeology

The East Texas Radiocarbon Database contributes to an analysis of tempo and place for Woodland era (~500 BC–AD 800) archaeological sites within the region. The temporal and spatial distributions of calibrated 14C ages (n = 127) with a standard deviation (ΔT) of 61 from archaeological sites with Woodland components (n = 51) are useful in exploring the development and geographical continuity of the peoples in east Texas, and lead to a refinement of our current chronological understanding of the period. While analysis of summed probability distributions (SPDs) produces less than significant findings due to sample size, they are used …


Depicting Estimates Using The Intercept In Meta-Regression Models: The Moving Constant Technique, Blair T. Johnson Dr., Tania B. Huedo-Medina Dr. Oct 2011

CHIP Documents

In any scientific discipline, the ability to portray research patterns graphically often aids greatly in interpreting a phenomenon. In part to depict phenomena, the statistics and capabilities of meta-analytic models have grown increasingly sophisticated. Accordingly, this article details how to move the constant in weighted meta-analysis regression models (viz. “meta-regression”) to illuminate the patterns in such models across a range of complexities. Although it is commonly ignored in practice, the constant (or intercept) in such models can be indispensable when it is not relegated to its usual static role. The moving constant technique makes possible estimates and confidence intervals at …


Performance Indices For On-Ice Hockey Statistics, William (Bill) H. Williams Aug 1995

Publications and Research

No abstract provided.