Nonparametric Analysis Of Clustered And Multivariate Data,
2020
University of Kentucky
Nonparametric Analysis Of Clustered And Multivariate Data, Yue Cui
Theses and Dissertations--Statistics
In this dissertation, we investigate three distinct but interrelated problems for nonparametric analysis of clustered data and multivariate data in a pre-post factorial design.
In the first project, we propose a nonparametric approach for one-sample clustered data in a pre-post intervention design. In particular, we consider the situation where, for some clusters, all members are observed only at either pre- or post-intervention, but not both. We refer to this type of clustered data as partially complete clustered data. Unlike most of its parametric counterparts, we do not assume specific models for data distributions, intra-cluster dependence structure or variability, in effect …
Nonparametric Tests Of Lack Of Fit For Multivariate Data,
2020
University of Kentucky
Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu
Theses and Dissertations--Statistics
A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem into a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …
Process Based Analysis Of Fluvial Stratigraphic Record: Middle Pennsylvanian Allegheny Formation, North-Central WV,
2020
West Virginia University
Process Based Analysis Of Fluvial Stratigraphic Record: Middle Pennsylvanian Allegheny Formation, North-Central WV, Oluwasegun O. Abatan
Graduate Theses, Dissertations, and Problem Reports
Fluvial deposits represent some of the best hydrocarbon reservoirs, but the quality of fluvial reservoirs varies depending on the reservoir architecture, which is controlled by allogenic and autogenic processes. Allogenic controls, including paleoclimate, tectonics, and glacio-eustasy, have long been debated as dominant controls in the deposition of fluvial strata. However, recent research has questioned the validity of this cyclicity and may indicate major influence from autogenic controls. To further investigate allogenic controls on stratal order, I analyzed the facies architecture, geomorphology, paleohydrology, and the stratigraphic framework of the Middle Pennsylvanian Allegheny Formation (MPAF), a fluvial depositional system in the Appalachian …
Three Essays On Health Economics And Policy Evaluation,
2020
West Virginia University
Three Essays On Health Economics And Policy Evaluation, Shishir Shakya
Graduate Theses, Dissertations, and Problem Reports
This dissertation consists of three essays on U.S. health care policy. The paragraphs below are the abstracts of its three chapters, in order. I provide quantitative evidence on how much Prescription Drug Monitoring Programs (PDMPs) affect retail opioid prescribing behaviors. Using the American Community Survey (ACS), I retrieve a county-level, high-dimensional panel data set from 2010 to 2017. I employ three separate identification strategies: difference-in-differences, double selection post-LASSO, and spatial difference-in-differences. I compare how the retail opioid prescribing behaviors of counties where prescribers are required to check the PDMP before prescribing controlled substances …
An Assessment Of Convergence In The Feeding Morphology Of Xiphactinus Audax And Megalops Atlanticus Using Landmark-Based Geometric Morphometrics,
2020
Fort Hays State University
An Assessment Of Convergence In The Feeding Morphology Of Xiphactinus Audax And Megalops Atlanticus Using Landmark-Based Geometric Morphometrics, Edward Chase Shelburne
Master's Theses
Convergence is an evolutionary phenomenon wherein distantly related organisms independently develop features or functional adaptations to overcome similar environmental constraints. Historically, convergence among organisms has been speculated or asserted with little rigorous or quantitative investigation. More recent advancements in systematics have allowed for the detection and study of convergence in a phylogenetic context, but this does little to elucidate convergent anatomical features in extinct taxa with poorly understood evolutionary histories. The purpose of this study is to investigate one potentially convergent system, namely the feeding structure of Xiphactinus audax (Teleostei: Ichthyodectiformes) and Megalops atlanticus (Teleostei: Elopiformes), using a comparative anatomical approach to assess …
Theory Of Principal Components For Applications In Exploratory Crime Analysis And Clustering,
2020
Minnesota State University, Mankato
Theory Of Principal Components For Applications In Exploratory Crime Analysis And Clustering, Daniel Silva
All Graduate Theses, Dissertations, and Other Capstone Projects
The purpose of this paper is to develop the theory of principal components analysis succinctly from the fundamentals of matrix algebra and multivariate statistics. Principal components analysis is sometimes used as a descriptive technique to explain the variance-covariance or correlation structure of a dataset. Most often, however, it is used as a dimensionality reduction technique to visualize a high-dimensional dataset in a lower-dimensional space. Principal components analysis accomplishes this by using the first few principal components, provided that they account for a substantial proportion of the variation in the original dataset. In the same way, the first few principal …
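As a rough illustration of the "substantial proportion of variation" criterion mentioned in this abstract, the sketch below computes the share of total variance carried by each principal component of two-variable data. It is not taken from the thesis: the function name `explained_variance_2d` and the sample values are hypothetical, and it relies on the closed-form eigenvalues of a 2x2 covariance matrix rather than a general eigen-solver.

```python
import math

def explained_variance_2d(xs, ys):
    """Return the proportion of total variance captured by each
    principal component of two-variable data (largest first)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [half_trace + spread, half_trace - spread]
    # Total variance equals the trace, i.e. the sum of the eigenvalues.
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]

# Hypothetical, illustrative data: two strongly correlated variables.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.0]
ratios = explained_variance_2d(xs, ys)
print(ratios)
```

When the first ratio is close to 1, a single component summarizes the data well. With more than two variables the same idea applies, but the eigenvalues would normally come from a numerical linear algebra routine.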
Joint Simulation Of Continuous And Categorical Variables For Mineral Resource Modeling And Recoverable Reserves Calculation,
2020
Michigan Technological University
Joint Simulation Of Continuous And Categorical Variables For Mineral Resource Modeling And Recoverable Reserves Calculation, Sentle Augustinus Hlajoane
Dissertations, Master's Theses and Master's Reports
Spatial variability and uncertainty of continuous variables (grade) and categorical variables (rock-types) in mineral evaluation significantly impact the economics of mining projects. The conventional approach of simulating grades using deterministic rock-types is problematic, since the spatial variability and uncertainty of grades at rock-type contacts are not well captured in deposits where the grade changes gradually between rock-types. Therefore, jointly simulating these variables can improve confidence (reduce uncertainty) in a resource model. Also, resource classification and recoverable reserve calculation can significantly improve the understanding of the deposit and its economic viability. This research utilized the Plural-Gaussian geostatistical simulation to jointly simulate …
Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data,
2019
University of Washington
Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma
UW Biostatistics Working Paper Series
Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology and neuroscience. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …
Statistical Inference For Networks Of High-Dimensional Point Processes,
2019
University of Washington - Seattle Campus
Statistical Inference For Networks Of High-Dimensional Point Processes, Xu Wang, Mladen Kolar, Ali Shojaie
UW Biostatistics Working Paper Series
Fueled in part by recent applications in neuroscience, high-dimensional Hawkes processes have become a popular tool for modeling the network of interactions among multivariate point process data. While evaluating the uncertainty of the network estimates is critical in scientific applications, existing methodological and theoretical work has focused only on estimation. To bridge this gap, this paper proposes a high-dimensional statistical inference procedure with theoretical guarantees for multivariate Hawkes processes. Key to this inference procedure is a new concentration inequality on the first- and second-order statistics for integrated stochastic processes, which summarize the entire history of the process. We apply this …
Function Space Tensor Decomposition And Its Application In Sports Analytics,
2019
East Tennessee State University
Function Space Tensor Decomposition And Its Application In Sports Analytics, Justin Reising
Electronic Theses and Dissertations
Recent advancements in sports information and technology systems have ushered in a new age of applications of both supervised and unsupervised analytical techniques in the sports domain. These automated systems capture large volumes of data points about competitors during live competition. As a result, multi-relational analyses are gaining popularity in the field of Sports Analytics. We review two case studies of dimensionality reduction with Principal Component Analysis and latent factor analysis with Non-Negative Matrix Factorization applied in sports. Also, we provide a review of a framework for extending these techniques for higher order data structures. The primary scope of this …
Habitat Associations And Reproduction Of Fishes On The Northwestern Gulf Of Mexico Shelf Edge,
2019
Louisiana State University and Agricultural and Mechanical College
Habitat Associations And Reproduction Of Fishes On The Northwestern Gulf Of Mexico Shelf Edge, Elizabeth Marie Keller
LSU Doctoral Dissertations
Several of the northwestern Gulf of Mexico (GOM) shelf-edge banks provide critical hard bottom habitat for coral and fish communities, supporting a wide diversity of ecologically and economically important species. These sites may be fish aggregation and spawning sites and provide important habitat for fish growth and reproduction. Already designated as habitat areas of particular concern, many of these banks are also under consideration for inclusion in the expansion of the Flower Garden Banks National Marine Sanctuary. This project aimed to gain a more comprehensive understanding of the communities and fish species on shelf-edge banks by way of gonad histology, …
Classification Of Coronary Artery Disease In Non-Diabetic Patients Using Artificial Neural Networks,
2019
Illinois State University
Classification Of Coronary Artery Disease In Non-Diabetic Patients Using Artificial Neural Networks, Demond Handley
Annual Symposium on Biomathematics and Ecology Education and Research
No abstract provided.
Identifying Risk Factors Related To Premature Birth Through Binary Logistic And Proportional Odds Ordinal Logistic Regression,
2019
Duquesne University
Identifying Risk Factors Related To Premature Birth Through Binary Logistic And Proportional Odds Ordinal Logistic Regression, Clayton Elwood
Electronic Theses and Dissertations
Premature birth has been identified as the single greatest cause of death worldwide in children under the age of five. This thesis will implement binary logistic regression and proportional odds ordinal logistic regression to predict different levels of premature birth and identify associated risk factors. The models will be built from the Centers for Disease Control and Prevention's 2014 Vital Statistics Natality Birth Data containing nearly 4 million live births within the United States. Odds ratios and confidence intervals on risk factors were produced utilizing binary logistic regression.
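To make the "odds ratios and confidence intervals" in this abstract concrete, here is a minimal sketch for the simplest case: one binary risk factor and a binary outcome. In that case the odds ratio from a binary logistic regression equals the cross-product ratio of the 2x2 table, with a Wald interval on the log scale. The function name `odds_ratio_with_ci` and the counts are hypothetical and are not taken from the thesis or from the natality data.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: exposed group with / without the outcome
    c, d: unexposed group with / without the outcome
    z:    normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_with_ci(a=30, b=70, c=10, d=90)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 suggests an association between the risk factor and the outcome. With several risk factors the odds ratios would instead be the exponentiated coefficients of a fitted multivariable model.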
Optimal Design For A Causal Structure,
2019
University of Nebraska-Lincoln
Optimal Design For A Causal Structure, Zaher Kmail
Dissertations and Theses in Statistics
Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.
Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …
Taking Multiple Regression Analysis To Task: A Review Of Mindware: Tools For Smart Thinking, By Richard Nisbett (2015),
2019
Pearl Street Inc.
Taking Multiple Regression Analysis To Task: A Review Of Mindware: Tools For Smart Thinking, By Richard Nisbett (2015), Jason Makansi
Numeracy
Richard Nisbett. 2015. Mindware: Tools for Smart Thinking. (New York, NY: Farrar, Straus and Giroux). 336 pp. ISBN: 9780374536244
Nisbett, a psychologist, may not achieve his stated goal of teaching readers to “effortlessly” extend their common sense when it comes to quantitative analysis applied to everyday issues, but his critique of multiple regression analysis (MRA) in the middle chapters of Mindware is worth attention from, and contemplation by, the QL/QR and Numeracy community. While in at least one other source, Nisbett’s critique has been called a “crusade” against MRA, what he really advocates is that it not be used as …
Implementation Of Multivariate Artificial Neural Networks Coupled With Genetic Algorithms For The Multi-Objective Property Prediction And Optimization Of Emulsion Polymers,
2019
California Polytechnic State University, San Luis Obispo
Implementation Of Multivariate Artificial Neural Networks Coupled With Genetic Algorithms For The Multi-Objective Property Prediction And Optimization Of Emulsion Polymers, David Chisholm
Master's Theses
Machine learning has been gaining popularity over the past few decades as computers have become more advanced. On a fundamental level, machine learning consists of the use of computerized statistical methods to analyze data and discover trends that may not have been obvious or otherwise observable previously. These trends can then be used to make predictions on new data and explore entirely new design spaces. Methods vary from simple linear regression to highly complex neural networks, but the end goal is similar. The application of these methods to material property prediction and new material discovery has been of high interest …
Blacklegged Tick (Ixodes Scapularis) Distribution In Maine, USA, As Related To Climate Change, White-Tailed Deer, And The Landscape,
2019
University of Maine
Blacklegged Tick (Ixodes Scapularis) Distribution In Maine, USA, As Related To Climate Change, White-Tailed Deer, And The Landscape, Susan P. Elias
Electronic Theses and Dissertations
Lyme disease is caused by the bacterial spirochete Borrelia burgdorferi, which is transmitted through the bite of an infected blacklegged (deer) tick (Ixodes scapularis). Geographic invasion of I. scapularis in North America has been attributed to causes including 20th century reforestation and suburbanization, burgeoning populations of the white-tailed deer (Odocoileus virginianus) which is the primary reproductive host of I. scapularis, tick-associated non-native plant invasions, and climate change. Maine, USA, is a high Lyme disease incidence state, with a history of increasing I. scapularis abundance and northward range expansion. This thesis addresses the question: “To …
Analyzing Two-Year College Student Success Using Structural Equation Modeling,
2019
Bellarmine University
Analyzing Two-Year College Student Success Using Structural Equation Modeling, Jessica Taylor
Graduate Theses, Dissertations, and Capstones
The goal of this study is to more fully understand the scope of community college student success using the principles of mindset, engagement, and college readiness. Using structural equation modeling allows this study to measure the combined effects these concepts have on student success, group differences, and the combined model of student success. Findings suggest student success can be significantly impacted by self-belief and mindset behaviors that can outweigh the initial effect of academic under-preparation. Groups included in this study are non-traditional students, minority populations, first-generation students, and Pell-eligible students.
Leveraging Reviews To Improve User Experience,
2019
Southern Methodist University
Leveraging Reviews To Improve User Experience, Anthony Schams, Iram Bakhtiar, Cristina Stanley
SMU Data Science Review
In this paper, we will explore and present a method of finding characteristics of a restaurant using its reviews through machine learning algorithms. We begin by building models to predict the ratings of individual reviews using text and categorical features. This is to examine the efficacy of the algorithms for the task. Both XGBoost and logistic regression will be examined. With these models, our goal is then to identify key phrases in reviews that are correlated with positive and negative experiences. Our analysis makes use of review data made publicly available by Yelp. Key bigrams extracted were non-specific to the …
Comparison Of Imputation Methods For Mixed Data Missing At Random,
2019
East Tennessee State University
Comparison Of Imputation Methods For Mixed Data Missing At Random, Kaitlyn Heidt
Electronic Theses and Dissertations
A statistician's job is to produce statistical models. When these models are precise and unbiased, we can relate them to new data appropriately. However, when data sets have missing values, the assumptions of statistical methods are violated and the methods produce biased results. The statistician's objective is to implement methods that produce unbiased and accurate results. Research in missing data is becoming popular as modern methods that produce unbiased and accurate results are emerging, such as the MICE package for the statistical software R. Using real data, we compare four common imputation methods in the MICE package at different levels of missingness. The …