Open Access. Powered by Scholars. Published by Universities.®

Multivariate Analysis Commons


394 Full-Text Articles | 623 Authors | 207,880 Downloads | 83 Institutions

All Articles in Multivariate Analysis



Nonparametric Analysis Of Clustered And Multivariate Data, Yue Cui 2020 University of Kentucky


Theses and Dissertations--Statistics

In this dissertation, we investigate three distinct but interrelated problems for nonparametric analysis of clustered data and multivariate data in pre-post factorial design.

In the first project, we propose a nonparametric approach for one-sample clustered data in pre-post intervention design. In particular, we consider the situation where for some clusters all members are only observed at either pre or post intervention but not both. This type of clustered data is referred to as partially complete clustered data. Unlike most of its parametric counterparts, we do not assume specific models for data distributions, intra-cluster dependence structure or variability, in effect …


Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu 2020 University of Kentucky


Theses and Dissertations--Statistics

A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem into a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …


Process Based Analysis Of Fluvial Stratigraphic Record: Middle Pennsylvanian Allegheny Formation, North-Central WV, Oluwasegun O. Abatan 2020 West Virginia University


Graduate Theses, Dissertations, and Problem Reports

Fluvial deposits represent some of the best hydrocarbon reservoirs, but the quality of fluvial reservoirs varies depending on the reservoir architecture, which is controlled by allogenic and autogenic processes. Allogenic controls, including paleoclimate, tectonics, and glacio-eustasy, have long been debated as dominant controls in the deposition of fluvial strata. However, recent research has questioned the validity of this cyclicity and may indicate major influence from autogenic controls. To further investigate allogenic controls on stratal order, I analyzed the facies architecture, geomorphology, paleohydrology, and the stratigraphic framework of the Middle Pennsylvanian Allegheny Formation (MPAF), a fluvial depositional system in the Appalachian …


Three Essays On Health Economics And Policy Evaluation, Shishir Shakya 2020 West Virginia University


Graduate Theses, Dissertations, and Problem Reports

This dissertation consists of three essays on U.S. health care policy. Each paragraph below gives the abstract for one of the three chapters, in order. I provide quantitative evidence on how much Prescription Drug Monitoring Programs (PDMPs) affect retail opioid prescribing behaviors. Using the American Community Survey (ACS), I retrieve a county-level, high-dimensional panel data set from 2010 to 2017. I employ three separate identification strategies: difference-in-difference, double selection post-LASSO, and spatial difference-in-difference. I compare how the retail opioid prescribing behaviors of counties where prescribers are required to check the PDMP before prescribing controlled substances …


An Assessment Of Convergence In The Feeding Morphology Of Xiphactinus Audax And Megalops Atlanticus Using Landmark-Based Geometric Morphometrics, Edward Chase Shelburne 2020 Fort Hays State University


Master's Theses

Convergence is an evolutionary phenomenon wherein distantly related organisms independently develop features or functional adaptations to overcome similar environmental constraints. Historically, convergence among organisms has been speculated or asserted with little rigorous or quantitative investigation. More recent advancements in systematics have allowed for the detection and study of convergence in a phylogenetic context, but this does little to elucidate convergent anatomical features in extinct taxa with poorly understood evolutionary histories. The purpose of this study is to investigate one potentially convergent system—the feeding structure of Xiphactinus audax (Teleostei: Ichthyodectiformes) and Megalops atlanticus (Teleostei: Elopiformes)—using a comparative anatomical approach to assess …


Theory Of Principal Components For Applications In Exploratory Crime Analysis And Clustering, Daniel Silva 2020 Minnesota State University, Mankato


All Graduate Theses, Dissertations, and Other Capstone Projects

The purpose of this paper is to develop the theory of principal components analysis succinctly from the fundamentals of matrix algebra and multivariate statistics. Principal components analysis is sometimes used as a descriptive technique to explain the variance-covariance or correlation structure of a dataset. However, most often, it is used as a dimensionality reduction technique to visualize a high dimensional dataset in a lower dimensional space. Principal components analysis accomplishes this by using the first few principal components, provided that they account for a substantial proportion of variation in the original dataset. In the same way, the first few principal …
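
The "proportion of variation" idea in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the data values and the function name `pca_2d` are made up for illustration, and the example is restricted to two variables so that the eigenvalues of the sample covariance matrix have a closed form and no linear algebra library is needed.

```python
import math

def pca_2d(xs, ys):
    """Eigenvalues of the 2x2 sample covariance matrix, largest first.

    Hypothetical helper for illustration; restricted to two variables.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mid = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return mid + radius, mid - radius

# Made-up, strongly correlated data: one component should dominate.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
lam1, lam2 = pca_2d(xs, ys)

# Proportion of total variance carried by the first principal component.
explained = lam1 / (lam1 + lam2)
print(f"first component explains {explained:.4f} of the variance")
```

When this proportion is close to 1, projecting onto the first component loses little of the variation in the data, which is the condition the abstract attaches to using only the first few components.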


Joint Simulation Of Continuous And Categorical Variables For Mineral Resource Modeling And Recoverable Reserves Calculation, Sentle Augustinus Hlajoane 2020 Michigan Technological University


Dissertations, Master's Theses and Master's Reports

Spatial variability and uncertainty of continuous variables (grade) and categorical variables (rock-types) in mineral evaluation significantly impact the economics of mining projects. The conventional approach of simulating grades using deterministic rock-types is problematic since spatial variability and uncertainty of grades at rock-type contacts are not well captured in deposits where the grade changes gradually between rock-types. Therefore, jointly simulating these variables can improve confidence (reduce uncertainty) in a resource model. Also, resource classification and recoverable reserve calculation can significantly improve the understanding of the deposit and its economic viability. This research utilized the Plural-Gaussian geostatistical simulation to jointly simulate …


Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma 2019 University of Washington


UW Biostatistics Working Paper Series

Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology and neuroscience. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …


Statistical Inference For Networks Of High-Dimensional Point Processes, Xu Wang, Mladen Kolar, Ali Shojaie 2019 University of Washington - Seattle Campus


UW Biostatistics Working Paper Series

Fueled in part by recent applications in neuroscience, high-dimensional Hawkes processes have become a popular tool for modeling the network of interactions among multivariate point process data. While evaluating the uncertainty of the network estimates is critical in scientific applications, existing methodological and theoretical work has focused only on estimation. To bridge this gap, this paper proposes a high-dimensional statistical inference procedure with theoretical guarantees for multivariate Hawkes processes. Key to this inference procedure is a new concentration inequality on the first- and second-order statistics for integrated stochastic processes, which summarize the entire history of the process. We apply this …


Function Space Tensor Decomposition And Its Application In Sports Analytics, Justin Reising 2019 East Tennessee State University


Electronic Theses and Dissertations

Recent advancements in sports information and technology systems have ushered in a new age of applications of both supervised and unsupervised analytical techniques in the sports domain. These automated systems capture large volumes of data points about competitors during live competition. As a result, multi-relational analyses are gaining popularity in the field of Sports Analytics. We review two case studies of dimensionality reduction with Principal Component Analysis and latent factor analysis with Non-Negative Matrix Factorization applied in sports. Also, we provide a review of a framework for extending these techniques for higher order data structures. The primary scope of this …


Habitat Associations And Reproduction Of Fishes On The Northwestern Gulf Of Mexico Shelf Edge, Elizabeth Marie Keller 2019 Louisiana State University and Agricultural and Mechanical College


LSU Doctoral Dissertations

Several of the northwestern Gulf of Mexico (GOM) shelf-edge banks provide critical hard bottom habitat for coral and fish communities, supporting a wide diversity of ecologically and economically important species. These sites may be fish aggregation and spawning sites and provide important habitat for fish growth and reproduction. Already designated as habitat areas of particular concern, many of these banks are also under consideration for inclusion in the expansion of the Flower Garden Banks National Marine Sanctuary. This project aimed to gain a more comprehensive understanding of the communities and fish species on shelf-edge banks by way of gonad histology, …


Classification Of Coronary Artery Disease In Non-Diabetic Patients Using Artificial Neural Networks, Demond Handley 2019 Illinois State University


Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Identifying Risk Factors Related To Premature Birth Through Binary Logistic And Proportional Odds Ordinal Logistic Regression, Clayton Elwood 2019 Duquesne University


Electronic Theses and Dissertations

Premature birth has been identified as the single greatest cause of death worldwide in children under the age of five. This thesis will implement binary logistic regression and proportional odds ordinal logistic regression to predict different levels of premature birth and identify associated risk factors. The models will be built from the Centers for Disease Control and Prevention's 2014 Vital Statistics Natality Birth Data containing nearly 4 million live births within the United States. Odds ratios and confidence intervals on risk factors were produced utilizing binary logistic regression.
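
As a rough illustration of the odds ratios and confidence intervals mentioned above (not the thesis's own analysis): for a single binary risk factor, the odds ratio estimated by a binary logistic regression equals the cross-product ratio of the 2x2 table, and a Wald interval can be formed on the log scale. The counts and the function name `odds_ratio_ci` below are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases,    b: exposed non-cases,
    c: unexposed cases,  d: unexposed non-cases.
    z = 1.96 gives an approximate 95% interval.
    """
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Made-up counts, for illustration only.
or_hat, lower, upper = odds_ratio_ci(30, 70, 15, 85)
print(f"OR = {or_hat:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 suggests an association between the factor and the outcome at roughly the 5% level. With several covariates, the adjusted odds ratios come from the fitted regression coefficients rather than from a single table.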


Optimal Design For A Causal Structure, Zaher Kmail 2019 University of Nebraska-Lincoln


Dissertations and Theses in Statistics

Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.

Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …


Taking Multiple Regression Analysis To Task: A Review Of Mindware: Tools For Smart Thinking, By Richard Nisbett (2015), Jason Makansi 2019 Pearl Street Inc.


Numeracy

Richard Nisbett. 2015. Mindware: Tools for Smart Thinking. (New York, NY: Farrar, Straus and Giroux). 336 pp. ISBN: 9780374536244

Nisbett, a psychologist, may not achieve his stated goal of teaching readers to “effortlessly” extend their common sense when it comes to quantitative analysis applied to everyday issues, but his critique of multiple regression analysis (MRA) in the middle chapters of Mindware is worth attention from, and contemplation by, the QL/QR and Numeracy community. While in at least one other source, Nisbett’s critique has been called a “crusade” against MRA, what he really advocates is that it not be used as …


Implementation Of Multivariate Artificial Neural Networks Coupled With Genetic Algorithms For The Multi-Objective Property Prediction And Optimization Of Emulsion Polymers, David Chisholm 2019 California Polytechnic State University, San Luis Obispo


Master's Theses

Machine learning has been gaining popularity over the past few decades as computers have become more advanced. On a fundamental level, machine learning consists of the use of computerized statistical methods to analyze data and discover trends that may not have been obvious or otherwise observable previously. These trends can then be used to make predictions on new data and explore entirely new design spaces. Methods vary from simple linear regression to highly complex neural networks, but the end goal is similar. The application of these methods to material property prediction and new material discovery has been of high interest …


Blacklegged Tick (Ixodes Scapularis) Distribution In Maine, USA, As Related To Climate Change, White-Tailed Deer, And The Landscape, Susan P. Elias 2019 University of Maine


Electronic Theses and Dissertations

Lyme disease is caused by the bacterial spirochete Borrelia burgdorferi, which is transmitted through the bite of an infected blacklegged (deer) tick (Ixodes scapularis). Geographic invasion of I. scapularis in North America has been attributed to causes including 20th century reforestation and suburbanization, burgeoning populations of the white-tailed deer (Odocoileus virginianus), which is the primary reproductive host of I. scapularis, tick-associated non-native plant invasions, and climate change. Maine, USA, is a high Lyme disease incidence state, with a history of increasing I. scapularis abundance and northward range expansion. This thesis addresses the question: “To …


Analyzing Two-Year College Student Success Using Structural Equation Modeling, Jessica Taylor 2019 Bellarmine University


Graduate Theses, Dissertations, and Capstones

The goal of this study is to more fully understand the scope of community college student success using the principles of mindset, engagement, and college readiness. Using structural equation modeling ensures this study is able to measure the combined effects these concepts have on student success, group differences, and the combined model of student success. Findings suggest student success can be significantly impacted by self-belief and mindset behaviors that can outweigh the initial effect of academically under-prepared students. Groups included in this study are non-traditional students, minority populations, first generation students, and Pell eligible students.


Leveraging Reviews To Improve User Experience, Anthony Schams, Iram Bakhtiar, Cristina Stanley 2019 Southern Methodist University


SMU Data Science Review

In this paper, we explore and present a method of finding characteristics of a restaurant from its reviews using machine learning algorithms. We begin by building models to predict the ratings of individual reviews using text and categorical features, in order to examine how well the algorithms suit the task. Both XGBoost and logistic regression are examined. With these models, our goal is then to identify key phrases in reviews that are correlated with positive and negative experiences. Our analysis makes use of review data made publicly available by Yelp. Key bigrams extracted were non-specific to the …


Comparison Of Imputation Methods For Mixed Data Missing At Random, Kaitlyn Heidt 2019 East Tennessee State University


Electronic Theses and Dissertations

A statistician's job is to produce statistical models. When these models are precise and unbiased, we can relate them to new data appropriately. However, when data sets have missing values, the assumptions of statistical methods are violated and the methods produce biased results. The statistician's objective is to implement methods that produce unbiased and accurate results. Research in missing data is becoming popular as modern methods that produce unbiased and accurate results are emerging, such as MICE in R, a statistical software package. Using real data, we compare four common imputation methods in the MICE package in R at different levels of missingness. The …

