Physical Sciences and Mathematics | Open Access Articles

Split Classification Model For Complex Clustered Data, Katherine Gerot Mar 2022

Honors Theses

Classification of high-dimensional data has generated tremendous interest in a multitude of fields. Data in higher dimensions often tend to reside in non-Euclidean metric spaces. This prevents Euclidean-based classification methodologies, such as regression, from reliably modeling the data. Many proposed models rely on computationally complex embeddings to convert the data to a more usable format. Others, namely the Support Vector Machine, rely on kernel manipulation to implicitly describe the "feature space" and arrive at a non-linear decision boundary. The methodology proposed in this paper seeks to classify complex data in a relatively computationally simple and explainable manner.

Using Stability To Select A Shrinkage Method, Dean Dustin May 2020

Department of Statistics: Dissertations, Theses, and Student Work

Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk plus a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property, meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The second …

The Role Of Topography, Soil, And Remotely Sensed Vegetation Condition Towards Predicting Crop Yield, Trenton E. Franz, Sayli Pokal, Justin P. Gibson, Yuzhen Zhou, Hamed Gholizadeh, Fatima Amor Tenorio, Daran Rudnick, Derek M. Heeren, Matthew F. Mccabe, Matteo Ziliani, Zhenong Jin, Kaiyu Guan, Ming Pan, John Gates, Brian Wardlow Jan 2020

School of Natural Resources: Faculty Publications

Foreknowledge of the spatiotemporal drivers of crop yield would provide a valuable source of information to optimize on-farm inputs and maximize profitability. In recent years, an abundance of spatial data providing information on soils, topography, and vegetation condition has become available from both proximal and remote sensing platforms. Given the wide range of data costs (between USD 0 and 50 per hectare), it is important to understand where often limited financial resources should be directed to optimize field production. Two key questions arise. First, will these data actually aid in better fine-resolution yield prediction to help optimize crop management and farm economics? Second, what …

Essentials Of Structural Equation Modeling, Mustafa Emre Civelek Mar 2018

Zea E-Books Collection

Structural Equation Modeling is a statistical method increasingly used in scientific studies in the social sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic research. However, since many universities do not include this method in the curriculum of undergraduate and graduate courses, students and scholars try to solve the problems they encounter by using various books and internet resources.

This book aims to guide the researcher who wants to use this method in a way that is free from mathematical expressions. It teaches the steps of a research program using structural equation modeling …

How Do You Interpret A Confidence Interval?, Paul Savory Jan 2008

Industrial and Management Systems Engineering: Instructional Materials

A confidence interval (CI) is an interval estimate of a population parameter. Instead of estimating the parameter by a single value (a point estimate), an interval likely to cover the parameter is developed. Many students incorrectly interpret the meaning of a confidence interval. This paper offers a quick overview of how to correctly interpret a confidence interval.
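
The usual reading of "an interval likely to cover the parameter" can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes normally distributed data with a known standard deviation, and the function name `coverage` and all parameter values are hypothetical choices made for illustration. It builds many 95% intervals from repeated samples and counts how often they contain the true mean.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=10.0, sigma=2.0, n=30, trials=2000, level=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean.

    Assumes normal data with known sigma (an illustrative simplification).
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = fmean(sample)
        # The interval is random; the parameter is fixed.
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"fraction of intervals covering the true mean: {rate:.3f}")
```

The printed fraction should land near the nominal level. The confidence level describes the long-run behavior of the procedure across repeated samples, not the probability that any one computed interval contains the parameter.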

Why Divide By (N-1) For Sample Standard Deviation?, Paul Savory Jan 2008

Industrial and Management Systems Engineering: Instructional Materials

In statistics, the sample standard deviation is a widely used measure of the variability or dispersion of a data set. The standard deviation of a data set is the square root of its variance. In calculating the sample standard deviation, the divisor is the number of samples in the data set minus one (n-1) rather than n. This often confuses students. This paper offers a quick overview of why the divisor is (n-1) for calculating the sample standard deviation.
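
One way to see the effect of the divisor is to average the variance estimate over every possible sample. The sketch below is an illustration under stated assumptions, not the paper's own argument: the population (the faces of a six-sided die), the sample size of 3, and the helper name `mean_variance_estimate` are all hypothetical. It uses exact fractions, so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

def mean_variance_estimate(population, n, divisor_offset):
    """Average the variance estimate over every ordered sample of size n
    drawn with replacement, dividing by (n - divisor_offset)."""
    total = Fraction(0)
    count = 0
    for sample in product(population, repeat=n):
        m = Fraction(sum(sample), n)
        # Sum of squared deviations from the sample mean.
        ss = sum((x - m) ** 2 for x in sample)
        total += ss / (n - divisor_offset)
        count += 1
    return total / count

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: a fair die
mu = Fraction(sum(population), len(population))
true_var = sum((x - mu) ** 2 for x in population) / len(population)

biased = mean_variance_estimate(population, 3, 0)    # divisor n
unbiased = mean_variance_estimate(population, 3, 1)  # divisor n - 1

print("population variance:", true_var)
print("average estimate, divisor n:", biased)
print("average estimate, divisor n-1:", unbiased)
```

With the divisor n, the estimates average to (n-1)/n times the population variance, because deviations are measured from the sample mean rather than the unknown population mean. The divisor n-1 removes that bias for the variance; the square root of that estimate, the sample standard deviation, is still not exactly unbiased.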
