Open Access. Powered by Scholars. Published by Universities.®
Full-Text Articles in Applied Statistics
Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. Mcclure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre
SMU Data Science Review
Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …
Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang
Statistical Science Theses and Dissertations
Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool for generating biological insights. As experimental technologies improve in both capacity and efficiency, there is a growing demand for new analytical methodologies.
One question in SRT data analysis is to identify genes whose expressions exhibit spatially correlated patterns, called spatially variable (SV) genes. Most current methods to identify SV genes are built upon the geostatistical model with Gaussian process, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …
Reu-Deim Classification Of Hispanic Voters In Hispanic Groups Using Name And Zip Code Data In Palm Beach, Florida, Kamila Soto-Ortiz
Beyond: Undergraduate Research Journal
When registering to vote, Hispanic voters can only register as “Hispanic” in the “Race/Ethnicity” category, which makes it difficult to analyze voting trends within the Hispanic community. Given the recent recognition that not all Hispanic groups vote the same way, the goal is to create a model that can identify a voter’s Hispanic group from the information provided in the public Florida voter file. This is accomplished using name and zip code data for all voters in Palm Beach, Florida. This paper explores the model implemented, its findings, and its limitations. Palm Beach, Florida, is met with low confidence …
Sentiment Analysis Before And During The COVID-19 Pandemic, Emily Musgrove
Mathematics Summer Fellows
This study examines the change in connotative language use before and during the Covid-19 pandemic. By analyzing news articles from several major US newspapers, we found that there is a statistically significant correlation between the sentiment of the text and the publication period. Specifically, we document a large, systematic, and statistically significant decline in the overall sentiment of articles published in major news outlets. While our results do not directly gauge the sentiment of the population, our findings have important implications regarding the social responsibility of journalists and media outlets especially in times of crisis.
A Comparison Of Confidence Intervals In State Space Models, Jinyu Du
Statistical Science Theses and Dissertations
This thesis develops general procedures for constructing confidence intervals (CIs) for the error disturbance parameters (standard deviations), and for transformations of those parameters, in time-invariant state space models (SSMs). With only a set of observations, accurately estimating individual error disturbance parameters in the presence of other unknown parameters in an SSM is a very challenging problem. We attempted to construct four types of confidence intervals (Wald, likelihood ratio, score, and higher-order asymptotic intervals) for both the simple local level model and general time-invariant SSMs. We show that for a simple local level model, both the …
Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile
Statistical Science Theses and Dissertations
Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …
Modeling The Probability Of A Successful Stolen Base Attempt In Major League Baseball, Cade Stanley
Senior Theses
In Major League Baseball (MLB), the outcome of a stolen base attempt has important implications. Success moves the runner closer to scoring, while failure records an out and removes the runner from the basepaths altogether. Therefore, it is important that the decision by a coach or player to steal a base is well-informed. In this thesis, I explore a statistical approach to making this decision. I train logistic regression and random forest models, using data about the game situation and about the runner, pitcher, and catcher involved in the stolen base attempt, to estimate the probability that a stolen base …
Dynamic Prediction For Alternating Recurrent Events Using A Semiparametric Joint Frailty Model, Jaehyeon Yun
Statistical Science Theses and Dissertations
Alternating recurrent events data arise commonly in health research; examples include hospital admissions and discharges of diabetes patients; exacerbations and remissions of chronic bronchitis; and quitting and restarting smoking. Recent work has involved formulating and estimating joint models for the recurrent event times considering non-negligible event durations. However, prediction models for transition between recurrent events are lacking. We consider the development and evaluation of methods for predicting future events within these models. Specifically, we propose a tool for dynamically predicting transition between alternating recurrent events in real time. Under a flexible joint frailty model, we derive the predictive probability of …
Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti
Honors Thesis
Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being used to predict outcomes in the social sciences. One such application is predicting student success: machine learning can be applied to predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources, and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …
Understanding And Improving The System: The Effects Of Weighting On The Accuracy Of Political Polling In Arkansas, Beck Williams
Political Science Undergraduate Honors Theses
In an effort to increase the accuracy of statewide political polling in Arkansas, we explore the statistical strategy of weighting with a focus on one yearly opinion poll: The Arkansas Poll. We conduct over 70 weighting experiments on the 2016 and 2020 Arkansas Polls using a variety of variables and opinion questions. From these experiments, we find that while some weighted variables tend to create larger changes, weighting typically results in a single-digit percentage change that does not substantially shift or “flip” the majorities. Due to a greater rate of change through weighting in the 2020 Poll compared to the …
A Monte Carlo Analysis Of Seven Dichotomous Variable Confidence Interval Equations, Morgan Juanita Dubose
Masters Theses & Specialist Projects
Department of Psychological Sciences, Western Kentucky University. There are two options to estimate a range of likely values for the population mean of a continuous variable: one for when the population standard deviation is known and another for when the population standard deviation is unknown. There are seven proposed equations to calculate the confidence interval for the population mean of a dichotomous variable: normal approximation interval, Wilson interval, Jeffreys interval, Clopper-Pearson, Agresti-Coull, arcsine transformation, and logit transformation. In this study, I compared the percent effectiveness of each equation using a Monte Carlo analysis and the interval range over a range …
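As a rough illustration of the kind of comparison this abstract describes, the sketch below implements two of the seven named intervals (the normal approximation and the Wilson interval) and estimates their coverage by simulation. It is not the thesis's code: the function names, the 95% level, and the simulation settings (`p_true`, `n`, `reps`, `seed`) are all assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Normal approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (max(0.0, p_hat - half), min(1.0, p_hat + half))


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return (max(0.0, center - half), min(1.0, center + half))


def coverage(interval_fn, p_true, n, reps=2000, seed=0):
    """Monte Carlo estimate of how often the interval contains p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one binomial sample as a sum of n Bernoulli trials.
        successes = sum(rng.random() < p_true for _ in range(n))
        lo, hi = interval_fn(successes, n)
        hits += lo <= p_true <= hi
    return hits / reps


if __name__ == "__main__":
    # Hypothetical setting: a rare outcome and a small sample.
    for name, fn in [("normal approx.", wald_interval), ("Wilson", wilson_interval)]:
        print(name, coverage(fn, p_true=0.05, n=20))
```

With a small sample and a proportion near 0 or 1, the normal approximation interval collapses to a single point whenever no successes are observed, which is one reason alternatives such as the Wilson interval are usually compared against it.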
Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu
Honors Theses and Capstones
COVID-19 caused state- and nationwide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports slowed. Foot traffic data helps business owners decide how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …
Trade Bait: Season 3, Ben Bagley
WWU Honors College Senior Projects
A 5-episode podcast series dissecting the use of statistics in the NFL and NFL Media.
An Introduction To Calling Bullshit: Learning To Think Outside The Black Box, Jevin D. West, Carl T. Bergstrom
Numeracy
Bergstrom, Carl T. and Jevin D. West. 2020. Calling Bullshit: The Art of Skepticism in a Data-Driven World. (New York: Random House) 336 pp. ISBN 978-0525509202.
While statistical methods receive greater attention, the art of critically evaluating information in everyday life more commonly depends on thinking outside the black box of the algorithm. In this piece we introduce readers to our book and associated online teaching materials—for readers who want to more capably call “bullshit” or to teach their students to do the same.
Data Analysis And Visualization To Dismantle Gender Discrimination In The Field Of Technology, Quinn Bolewicki
Dissertations, Theses, and Capstone Projects
In the United States, a significant population is facing an uphill battle trying to thrive in an industry that has seen exponential growth in recent years. Women, who account for approximately 50.8% of the U.S. population, are statistically underpaid and underrepresented in science, technology, engineering, and mathematics (STEM). Despite women-led technology teams achieving a 21% greater return on investment than teams not led by women, and young women largely outperforming men in math according to a 2015 study, there are only three Fortune 500 companies led by women, and women comprise only 10% of internet entrepreneurs. Research generates hundreds of articles, infographics, …
Guidelines For Regression Analysis In SAS And R: A Case Study, Sarah Milligan
Honors Program Theses and Projects
When a player is a free agent, an individual who is able to sign with any team, one wonders what their best option is. Will signing with Team A or Team B provide them with the largest salary? What factors will affect their salary the most? Do last year’s statistics have a strong impact on next year’s salary? These questions can be answered by performing a regression analysis on previous years’ data. The primary focus of this project is to determine the most important variables related to an NBA salary. Likewise, the statistical programs SAS and R will be compared …
How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, Prashiddhi Pokhrel
Thinking Matters Symposium
Due to the pandemic, people have started relying more on television, social media, and other news outlets for guidance. Moreover, as the amount of news, data, and information increases, so does the amount of misleading statistics. People’s opinions and decisions depend significantly on the data, statistics, and information that they are exposed to, as well as on their sources. For this project, we want to look at how information and its sources are affecting the decisions made by the general public about using the Portland Transit System. It is very important to know why …
Fourth Down Decision Making: Challenging The Conservative Nature Of NFL Coaches, Will Palmquist, Ryan Elmore, Benjamin Williams
DU Undergraduate Research Journal Archive
This thesis analyzes the hypothesis that coaches in the National Football League are often too conservative in their decision making on fourth downs. I used RStudio and NFL play-by-play data to simulate actual football plays and drives according to different fourth down strategies. By measuring expected points per drive over thousands of simulated drives, we are able to evaluate the effectiveness of different fourth down strategies. This research points to a number of conclusions regarding the nature of NFL coaches on fourth downs as well as the complexity of modeling and simulating decision making in a complex sport such …
Ensemble Protein Inference Evaluation, Kyle Lee Lucke
Graduate Student Theses, Dissertations, & Professional Papers
The protein inference problem is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …
Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang
Statistical Science Theses and Dissertations
This dissertation investigates: (1) A Bayesian Semi-supervised Approach to Keyphrase Extraction with Only Positive and Unlabeled Data, (2) Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Events.
In the big data era, people are blessed with a huge amount of information. However, the availability of information may also pose great challenges. One big challenge is how to extract useful yet succinct information in an automated fashion. As one of the first few efforts, keyphrase extraction methods summarize an article by identifying a list of keyphrases. Many existing keyphrase extraction methods focus on the unsupervised setting, …
Improved Statistical Methods For Time-Series And Lifetime Data, Xiaojie Zhu
Statistical Science Theses and Dissertations
In this dissertation, improved statistical methods for time-series and lifetime data are developed. First, an improved trend test for time series data is presented. Then, robust parametric estimation methods based on system lifetime data with known system signatures are developed.
In the first part of this dissertation, we consider a test for the monotonic trend in time series data proposed by Brillinger (1989). It has been shown that when there are highly correlated residuals or short record lengths, Brillinger’s test procedure tends to have a significance level much higher than the nominal level. This could be related to the discrepancy between …
Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman
Access*: Interdisciplinary Journal of Student Research and Scholarship
The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …
Bayesian Topological Machine Learning, Christopher A. Oballe
Doctoral Dissertations
Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) how to use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …
Using Stability To Select A Shrinkage Method, Dean Dustin
Department of Statistics: Dissertations, Theses, and Student Work
Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk plus a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The second …
Analysis Of Gas Mileage Of A Car, Joshua Ballard-Myer
Georgia College Student Research Events
The objective of this work is to analyze a data set, Auto, from the R package ISLR: Introduction to Statistical Learning in R. The data set includes information for 392 observations on 9 variables including gas mileage, horsepower, weight in pounds, and engine displacement in cubic inches. The data set was taken from the StatLib library maintained at Carnegie Mellon University. The primary response variable will be gas mileage in miles per gallon, with all other variables serving as predictors, but other relationships with other response variables such as acceleration will be explored. Results were similar to expected; traits desirable …
The Importance Of Type I Error Rates When Studying Bias In Monte Carlo Studies In Statistics, Michael Harwell
Journal of Modern Applied Statistical Methods
Two common outcomes of Monte Carlo studies in statistics are bias and Type I error rate. Several versions of bias statistics exist but all employ arbitrary cutoffs for deciding when bias is ignorable or non-ignorable. This article argues Type I error rates should be used when assessing bias.
The Author’s Reflections On No B.S. (Bad Stats): Black People Need People Who Believe In Black People Enough Not To Believe Every Bad Thing They Hear About Black People, Ivory A. Toldson
Numeracy
Toldson, Ivory A. 2019. No BS (Bad Stats): Black People Need People Who Believe in Black People Enough Not to Believe Every Bad Thing They Hear About Black People (Boston, MA: Brill-Sense) 194 pp. ISBN 978-9004397026.
This essay provides an introduction to No BS (Bad Stats): Black People Need People Who Believe in Black People Enough Not to Believe Every Bad Thing They Hear About Black People. In the essay, the author discusses how cynical views about the educational potential of Black children motivated him to write a book that challenges negative statistics. The essay also outlines the harmful …
Power Analysis On A Pilot Study Of The Caloric Intake Of Children Helping Prepare Meals Versus Children Not, Danielle Clifford
Student Research Poster Presentations 2020
The purpose of this analysis is to determine the sample size needed for a study that will be used to discover if there is a difference in the caloric intake of children who help with meal preparation and children who do not help with meal preparation.
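The abstract does not say which test or effect size the poster uses, so the following is only a generic sketch of a sample-size calculation for comparing two group means. It assumes a two-sided test, equal group sizes, a common standard deviation, and the normal approximation n = 2((z_{1-α/2} + z_{power})·σ/δ)² per group. The function name `n_per_group` and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).

    delta: smallest difference in means worth detecting (assumed by the user)
    sigma: common standard deviation of the outcome (assumed by the user)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical inputs: detect a 100 kcal difference when the
    # standard deviation of caloric intake is 200 kcal.
    print(n_per_group(delta=100, sigma=200))
    # Demanding more power raises the required sample size.
    print(n_per_group(delta=100, sigma=200, power=0.90))
```

A pilot study typically supplies the estimate of `sigma`; because the normal approximation slightly understates the sample size a t-test needs, the result is best read as a lower bound.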
Predicting Diabetes Diagnoses, Sarah Netchert
Student Research Poster Presentations 2020
This study explored the traits and health state of African Americans in central Virginia in order to determine what traits put people at a higher probability of being diagnosed with diabetes. We also wanted to know which traits generate the highest probability that a person will be diagnosed with diabetes. Traits included in this study were cholesterol, stabilized glucose, high density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). There were 403 individuals included in the study since they were the only ones screened for diabetes out of 1,046 …
The Role Of Topography, Soil, And Remotely Sensed Vegetation Condition Towards Predicting Crop Yield, Trenton E. Franz, Sayli Pokal, Justin P. Gibson, Yuzhen Zhou, Hamed Gholizadeh, Fatima Amor Tenorio, Daran Rudnick, Derek M. Heeren, Matthew F. Mccabe, Matteo Ziliani, Zhenong Jin, Kaiyu Guan, Ming Pan, John Gates, Brian Wardlow
School of Natural Resources: Faculty Publications
Foreknowledge of the spatiotemporal drivers of crop yield would provide a valuable source of information to optimize on-farm inputs and maximize profitability. In recent years, an abundance of spatial data providing information on soils, topography, and vegetation condition have become available from both proximal and remote sensing platforms. Given the wide range of data costs (between USD $0−50/ha), it is important to understand where often limited financial resources should be directed to optimize field production. Two key questions arise. First, will these data actually aid in better fine-resolution yield prediction to help optimize crop management and farm economics? Second, what …