Open Access. Powered by Scholars. Published by Universities.®

Multivariate Analysis Commons

Articles 1 - 30 of 160

Full-Text Articles in Multivariate Analysis

A Novel Correction For The Multivariate Ljung-Box Test, Minhao Huang May 2024

Computational and Data Sciences (PhD) Dissertations

This research introduces an analytical improvement to the Multivariate Ljung-Box test that addresses significant deviations of the original test from the nominal Type I error rates under almost all scenarios. Prior attempts to mitigate this issue have been directed at modification of the test statistics or correction of the test distribution to achieve precise results in finite samples. In previous studies focused on designing corrections to the univariate Ljung-Box test, a method that specifically adjusts the test rejection region has been the most successful at attaining the best Type I error rates. We adopt the same approach for the more complex, …


Session 6: Model-Based Clustering Analysis On The Spatial-Temporal And Intensity Patterns Of Tornadoes, Yana Melnykov, Yingying Zhang, Rong Zheng Feb 2024

SDSU Data Science Symposium

Tornadoes are among nature’s most violent windstorms and can occur all over the world except Antarctica. Previous scientific efforts have studied this natural hazard from facets such as genesis, dynamics, detection, forecasting, warning, measuring, and assessing. We instead model tornado datasets using modern statistical and computational techniques. The goal of the paper is to develop novel finite mixture models and perform clustering analysis on the spatial-temporal and intensity patterns of tornadoes. To analyze the tornado dataset, we first try a Gaussian distribution with the mean vector and variance-covariance matrix represented as …


Predicting Superconducting Critical Temperature Using Regression Analysis, Roland Fiagbe Jan 2024

Data Science and Data Mining

This project estimates a regression model to predict the superconducting critical temperature from variables extracted from the superconductor’s chemical formula. The regression model, combined with stepwise variable selection, yields a reasonable predictive model with a lower prediction error (MSE). Variables based on atomic radius, valence, atomic mass, and thermal conductivity appeared to contribute most to the predictive model.
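
As a rough illustration of the kind of procedure this abstract describes, the sketch below runs forward stepwise selection with ordinary least squares and scores each candidate model by in-sample MSE. The data, the function names (`solve`, `fit_mse`, `forward_stepwise`), and the `min_gain` stopping rule are all hypothetical; the project's actual variables and selection criterion are not given in this listing.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mse(rows, y, cols):
    """OLS with an intercept on the chosen columns; returns (coefficients, in-sample MSE)."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, sum(e * e for e in residuals) / len(y)


def forward_stepwise(rows, y, min_gain=1e-6):
    """Greedily add the column that lowers MSE most; stop when the gain is below min_gain."""
    selected, remaining = [], list(range(len(rows[0])))
    _, best = fit_mse(rows, y, selected)  # intercept-only baseline
    while remaining:
        mse, col = min((fit_mse(rows, y, selected + [c])[1], c) for c in remaining)
        if best - mse < min_gain:
            break
        selected.append(col)
        remaining.remove(col)
        best = mse
    return selected, best


# Hypothetical toy data: y = 2 + 3*x0 - x2, with x1 irrelevant.
rows = [[1, 5, 2], [2, 3, 1], [3, 8, 4], [4, 1, 3], [5, 6, 5], [6, 2, 2]]
y = [2 + 3 * r[0] - r[2] for r in rows]

selected, best_mse = forward_stepwise(rows, y)
print("selected columns:", selected, "MSE:", best_mse)
```

In-sample MSE is used here only to keep the sketch short; a real analysis would normally score candidates on held-out data or with an information criterion.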


Measuring The Performance Of SDGs In Provincial Level Using Regional Sustainable Development Index, Nurafiza Thamrin, Ika Yuni Wulansari, Puguh Bodro Irawan Dec 2023

Journal of Environmental Science and Sustainable Development

Measuring the national and sub-national progress in achieving such globally adopted development agendas as Sustainable Development Goals (SDGs) is particularly challenging due to data availability and compatibility of indicators to measure SDGs, especially in Indonesia. This paper attempts to measure the performance of sustainable development at the regional level in Indonesia by newly constructing a multidimensional composite index called the Regional Sustainable Development Index (RSDI). RSDI comprises four dimensions, covering comprehensive economic, social, environmental, and governance indicators. By applying factor analysis, the paper assesses the uncertainty of RSDI and the sensitivity of its composing indicators, then further investigates the relationship …


Reducing Food Scarcity: The Benefits Of Urban Farming, S.A. Claudell, Emilio Mejia Dec 2023

Journal of Nonprofit Innovation

Urban farming can enhance the lives of communities and help reduce food scarcity. This paper presents a conceptual prototype of an efficient urban farming community that can be scaled for a single apartment building or an entire community across all global geoeconomic regions, including densely populated cities and rural, developing towns and communities. When deployed in coordination with smart crop choices, local farm support, and efficient transportation, the result isn’t just sustainability but also increased fresh produce accessibility, optimized nutritional value, elimination of the use of ‘forever chemicals’, reduced transportation costs, and global environmental benefits.

Imagine Doris, who is …


Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. McClure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre Dec 2023

SMU Data Science Review

Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …


Nonparametric Derivative Estimation Using Penalized Splines: Theory And Application, Bright Antwi Boasiako Nov 2023

Doctoral Dissertations

This dissertation is in the field of Nonparametric Derivative Estimation using Penalized Splines. It is conducted in two parts. In the first part, we study the L2 convergence rates of estimating derivatives of mean regression functions using penalized splines. In 1982, Stone provided the optimal rates of convergence for estimating derivatives of mean regression functions using nonparametric methods. Using these rates, Zhou et al. in their 2000 paper showed that the MSE of derivative estimators based on regression splines approaches zero at the optimal rate of convergence. Also, in 2019, Xiao showed that, under some general conditions, penalized spline estimators …


Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell RNA Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …


Statistical And Biological Analyses Of Acoustic Signals In Estrildid Finches, Moises Rivera Jun 2023

Dissertations, Theses, and Capstone Projects

Acoustic communication is a process that involves auditory perception and signal processing. Discrimination and recognition further require cognitive processes and supporting mechanisms in order to successfully identify and appropriately respond to signal senders. Although acoustic communication is common across birds, classical research has largely disregarded the perceptual abilities of perinatal altricial taxa. Chapter 1 reviews the literature of perinatal acoustic stimulation in birds, highlighting the disproportionate focus on precocial birds (e.g., chickens, ducks, quails). The long-held belief that altricial birds were incapable of acoustic perception in ovo was only recently overturned, as researchers began to find behavioral and physiological evidence …


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash Apr 2023

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods by which they can measure attrition by quantifying several qualities of their employees. Factors like their age, years with the company, which department they work in, their level of education, their job role, and even their marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Application Of Gaussian Mixture Models To Simulated Additive Manufacturing, Jason Hasse, Semhar Michael, Anamika Prasad Feb 2023

SDSU Data Science Symposium

Additive manufacturing (AM) is the process of building components through an iterative process of adding material in specific designs. AM has a wide range of process parameters that influence the quality of the component. This work applies Gaussian mixture models to detect clusters of similar stress values within and across components manufactured with varying process parameters. Further, a mixture of regression models is considered to simultaneously find groups and also fit regression within each group. The results are compared with a previous naive approach.
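
To make the clustering idea concrete, here is a minimal sketch of fitting a one-dimensional, two-component Gaussian mixture with the EM algorithm and assigning each value to its most probable component. The stress values, starting means, and function names (`normal_pdf`, `fit_gmm_1d`) are hypothetical and are not taken from the study, which may use a different model or software.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the supplied component means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from those posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)  # floor keeps a component from collapsing
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels


# Hypothetical stress values forming two well-separated groups.
stress = [10.1, 9.8, 10.4, 9.9, 10.2, 20.3, 19.7, 20.1, 20.4, 19.9]
weights, means, variances, labels = fit_gmm_1d(stress, means=[9.0, 21.0])
print("component means:", means)
print("cluster labels:", labels)
```

A mixture of regressions, as mentioned in the abstract, would replace each component's constant mean with a regression function of the process parameters while keeping the same E-step and M-step structure.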


Modeling And Fitting Two-Way Tables Containing Outliers, David L. Farnsworth Feb 2023

Articles

A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free. Neither missing values nor replication is present. The model consists of the sum of a customary additive part that can be fit using least squares and a part that is composed of outliers. Recommendations are made for methods for identifying cells containing outliers and for fitting the model. A graph of the observations is used to determine the outliers’ locations. For all cells containing an outlier, replacement values are determined simultaneously using a classical missing-data tool. The result is …


The Impact Of Subjective Risk Analysis On Real Estate Prices In The Nisqually Region Following The 2001 Nisqually Earthquake, Ryan Espedal Jan 2023

All Master's Theses

Earthquakes are an environmental hazard that poses great risks to communities almost every day. With earthquakes, the main cause of concern is physical destruction of property; however, there are also psychological effects that are researched and discussed much less. In 2001, the Nisqually area of western Washington experienced a substantial earthquake that produced minimal physical damage but caused a significant decrease in real estate prices. Studying single-family homes from 1986 to 2012, this research utilizes hedonic property models to measure the change in consumers’ subjective risk calculations with reference to real estate purchases after the Nisqually earthquake, measure the relationship between earthquake …


Enforcement Penalties At The ITC, Andrea R. Hugill, John C. Jarosz, Katherine D. Cappaert Jan 2023

Northwestern Journal of International Law & Business

The U.S. International Trade Commission (“ITC” or “Commission”) has grown in importance as a venue for U.S. companies to pursue intellectual property (“IP”) violators and to block the sale or importation of goods from overseas that infringe U.S. IP rights. Once a violation of Section 337 of the Tariff Act of 1930 is found, an order halting further infringement, including importation, is almost always entered. In theory, potentially sizeable penalties may be imposed on entities that do not comply with the terms of an import restriction. In practice, the terms of an import restriction are almost always honored, but …


Modeling Growth And Stress Factors For Converted Silvopasture Systems In The Missouri Ozarks, Bailee N. Suedmeyer Jan 2023

MSU Graduate Theses

Silvopasture systems are becoming increasingly popular among sustainable agriculture ranchers, due to growing knowledge of their benefits to cattle and the ability to grow cool season grasses beneath the canopy. This project focuses on the forest crop aspect of silvopasture systems, from monitoring the health of the trees over time to recommendations for thinning management to keep the system functioning as viable silvopasture. The study site consists of five acres of upland hardwood forest area in Southern Missouri with 18 monumented fixed area plots. Aerial and ground data were collected at each plot throughout the growing season, along with …


Impacts Of COVID-19 On Industrial Growth In The United States, Emily G. Warthman, Charles J. Landis Jan 2023

Williams Honors College, Honors Research Projects

COVID-19 has had massive ramifications for all parts of life in the world, and industry growth and decline are not immune to it. This report will analyze nine different industries’ profit and revenue from quarterly data during the years 2009-2022. Forecast models will be generated using various methods and validation techniques to predict the values from Q2 2020 to Q4 2022 based on historical data. The predicted values will then be compared with the actual average revenue and profit generated, in order of greatest error percentage. Thorough research will then be completed to determine if there …


Predicting Insulin Pump Therapy Settings, Riccardo L. Ferraro, David Grijalva, Alex Trahan Sep 2022

SMU Data Science Review

Millions of people live with diabetes worldwide [7]. To mitigate some of the many symptoms associated with diabetes, an estimated 350,000 people in the United States rely on insulin pumps [17]. For many of these people, how effectively their insulin pump performs is the difference between sleeping through the night and a life-threatening emergency treatment at a hospital. Three programmed insulin pump therapy settings governing effective insulin pump function are: Basal Rate (BR), Insulin Sensitivity Factor (ISF), and Carbohydrate Ratio (ICR). For many people using insulin pumps, these therapy settings are often not correct, given their physiological needs. While …


A Bayesian Programming Approach To Car-Following Model Calibration And Validation Using Limited Data, Franklin Abodo Jun 2022

FIU Electronic Theses and Dissertations

Traffic simulation software is used by transportation researchers and engineers to design and evaluate changes to roadway networks. Underlying these simulators are mathematical models of microscopic driver behavior from which macroscopic measures of flow and congestion can be recovered. Many models are intended to apply to only a subset of possible traffic scenarios and roadway configurations, while others do not have any explicit constraint on their applicability. Work zones on highways are one scenario for which no model invented to date has been shown to accurately reproduce realistic driving behavior. This makes it difficult to optimize for safety and other …


Evaluating Soil Health Changes Following Cover Crop And No-Till Integration Into A Soybean (Glycine max) Cropping System In The Mississippi Alluvial Valley, Alexandra Gwin Firth May 2022

Theses and Dissertations

The transition of natural landscapes to intensive agricultural uses has resulted in severe loss of soil organic carbon (SOC), increased CO₂ emissions, river depletion, and groundwater overdraft. Despite negative documented effects of agricultural land use (i.e., soil erosion, nutrient runoff) on critical natural resources (i.e., water, soil), food production must increase to meet the demands of a rising human population. Given the environmental and agricultural productivity concerns of intensely managed soils, it is critical to implement conservation practices that mitigate the negative effects of crop production and enhance environmental integrity. In the Mississippi Alluvial Valley (MAV) region of Mississippi, USA, …


Intra-Hour Solar Forecasting Using Cloud Dynamics Features Extracted From Ground-Based Infrared Sky Images, Guillermo Terrén-Serrano Apr 2022

Electrical and Computer Engineering ETDs

Due to the increasing use of photovoltaic systems, power grids are vulnerable to the projection of shadows from moving clouds. An intra-hour solar forecast provides power grids with the capability of automatically controlling the dispatch of energy, reducing the additional cost for a guaranteed, reliable supply of energy (i.e., energy storage). This dissertation introduces a novel sky imager consisting of a long-wave radiometric infrared camera and a visible light camera with a fisheye lens. The imager is mounted on a solar tracker to maintain the Sun in the center of the images throughout the day, reducing the scattering effect produced …


Analysis Of Minor League Rule Changes Effect On Stolen Bases, Zachary Houghtaling Jan 2022

Williams Honors College, Honors Research Projects

This study uses various statistical analyses to evaluate the justification of rule changes for Major League Baseball that were implemented within the Minor Leagues during the 2021 minor league season. The primary focus of the study is predicting how some of these Minor League rule changes could affect the stolen base success rate and the number of attempts per game within the Major Leagues. A survey was conducted to evaluate how fans feel about stolen bases within the current game and if rules should be altered to increase the number of stolen bases that occur. Additionally, recorded Major and Minor …


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan Oct 2021

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal …


Bayesian Variable Selection Strategies In Longitudinal Mixture Models And Categorical Regression Problems., Md Nazir Uddin Aug 2021

Electronic Theses and Dissertations

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by the University of Michigan. Considering yearly out-of-pocket expenditures as the longitudinal response variable, we consider a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC …


Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang Jul 2021

Doctoral Dissertations

In the process of statistical modeling, descriptive modeling plays an essential role in accelerating the formulation of plausible hypotheses in subsequent explanatory modeling and in facilitating the selection of potential variables in subsequent predictive modeling. In particular, for multivariate categorical data analysis, it is desirable to use descriptive modeling methods to uncover and summarize the potential association structure among multiple categorical variables in a compact manner. However, many classical methods in this case either rely on strong assumptions for parametric models or become infeasible when the data dimension is higher. To this end, we propose a model-free method …


Lecture 04: Spatial Statistics Applications Of HRL, TRL, And Mixed Precision, David Keyes Apr 2021

Mathematical Sciences Spring Lecture Series

As simulation and analytics enter the exascale era, numerical algorithms, particularly implicit solvers that couple vast numbers of degrees of freedom, must span a widening gap between ambitious applications and austere architectures to support them. We present fifteen universals for researchers in scalable solvers: imperatives from computer architecture that scalable solvers must respect, strategies towards achieving them that are currently well established, and additional strategies currently being developed for an effective and efficient exascale software ecosystem. We consider recent generalizations of what it means to “solve” a computational problem, which suggest that we have often been “oversolving” them at the …


Statistical Approaches For Estimation And Comparison Of Brain Functional Connectivity, Jifang Zhao Jan 2021

Theses and Dissertations

Drug addiction can lead to many health-related problems and social concerns. Functional connectivity obtained from functional magnetic resonance imaging (fMRI) data promotes a variety of fundamental understandings in such association. Due to its complex correlation structure and large dimensionality, the modeling and analysis of the functional connectivity from neuroimage are challenging. By proposing a spatio-temporal model for multi-subject neuroimage data, we incorporate voxel-level spatio-temporal dependencies of whole-brain measurements to improve the accuracy of statistical inference. To tackle large-scale spatio-temporal neuroimage data, we develop a computationally efficient algorithm to estimate the parameters. Our method is used to identify functional connectivity and …


Statistical Methods In Genetic Studies, Cheng Gao Jan 2021

Dissertations, Master's Theses and Master's Reports

This dissertation includes three chapters. A brief description of each chapter follows.

In Chapter 1, we proposed a new method, called MF-TOWmuT, for genome-wide association studies with multiple genetic variants and multiple phenotypes using family samples. MF-TOWmuT uses a kinship matrix to account for sample relatedness. It is worth mentioning that in simulations, we considered hidden polygenic effects and varied the proportion of variance contributed by them to generate phenotypes. Simulation studies show that MF-TOWmuT can preserve the type I error rates and is more powerful than several existing methods in different simulation scenarios. MF-TOWmuT is also quite …


Neither “Post-War” Nor Post-Pregnancy Paranoia: How America’s War On Drugs Continues To Perpetuate Disparate Incarceration Outcomes For Pregnant, Substance-Involved Offenders, Becca S. Zimmerman Jan 2021

Pitzer Senior Theses

This thesis investigates the unique interactions between pregnancy, substance involvement, and race as they relate to the War on Drugs and the hyper-incarceration of women. Using ordinary least squares regression analyses and data from the Bureau of Justice Statistics’ 2016 Survey of Prison Inmates, I examine if (and how) pregnancy status, drug use, race, and their interactions influence two length of incarceration outcomes: sentence length and amount of time spent in jail between arrest and imprisonment. The results collectively indicate that pregnancy decreases length of incarceration outcomes for those offenders who are not substance-involved but not evenhandedly -- benefitting white …


Variation In Personality Among Semi-Wild Myanmar Timber Elephants, Sateesh Venkatesh Dec 2020

Theses and Dissertations

This study examines two personality traits: exploration and neophobia, which could influence human-elephant conflicts. Thirty-one semi-wild elephants were tested over two trials using a custom novel puzzle tube containing three tasks and three rewards. Our studies show that elephants do vary significantly between individuals in both exploration and neophobia.


A Geochemical And Statistical Investigation Of The Big Four Springs Region In Southern Missouri, Jordan Jasso Vega Aug 2020

MSU Graduate Theses

The Big Four Springs region hosts four major first-order magnitude springs in southern Missouri and northern Arkansas. These springs are Big Spring (Carter County, MO), Greer Spring (Oregon County, MO), Mammoth Spring (Fulton County, AR), and Hodgson Mill Spring (Ozark County, MO). Based on historic dye traces and hydrogeological investigations, these springs drain an area of approximately 1500 square miles and collectively discharge an average of 780 million gallons of water per day. The rocks, from youngest to oldest, that are found in the Big Four Springs region are the Cotter and Jefferson City Dolomite (Ordovician), Roubidoux Formation (Ordovician), Gasconade Dolomite …