Applications Of The Fractional-Random-Weight Bootstrap, 2020 Virginia Tech
Applications Of The Fractional-Random-Weight Bootstrap, Li Xu, Chris Gotwalt, Yili Hong, Caleb B. King, William Q. Meeker
The bootstrap, based on resampling, has for several decades been a widely used method for computing confidence intervals in applications where no exact method is available and sample sizes are not large enough to rely on easy-to-compute large-sample approximate methods, such as Wald (normal-approximation) confidence intervals. Simulation-based bootstrap intervals have proven useful in that their actual coverage probabilities are close to the nominal confidence level in small samples. Small-sample analytical approximations such as the Wald method, however, tend to have coverage probabilities that differ greatly from the nominal confidence level. There are, however, many ...
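The truncated abstract above does not spell out the weighting scheme, so the following is only a minimal sketch of one common form of random-weight bootstrap: each replicate reweights all observations with continuous weights drawn from a uniform Dirichlet distribution scaled to sum to n, instead of resampling with integer counts. The statistic (a weighted mean), the percentile interval, the function names, and the sample data are all assumptions chosen for illustration, not the authors' method.

```python
import random


def frw_weights(n, rng):
    """Draw n fractional weights as n * Dirichlet(1, ..., 1); they sum to n."""
    # Normalized independent exponentials follow a uniform Dirichlet distribution.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [n * d / total for d in draws]


def weighted_mean(values, weights):
    """Weighted mean; stands in for any estimator that accepts case weights."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def frw_percentile_interval(values, n_boot=2000, level=0.95, seed=0):
    """Percentile interval for the mean built from random-weight replicates."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        weighted_mean(values, frw_weights(n, rng)) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical small sample, used only to exercise the functions.
data = [4.1, 5.3, 2.8, 6.0, 4.7, 3.9, 5.5, 4.4]
lower, upper = frw_percentile_interval(data)
print(lower, upper)
```

Because every observation keeps a positive weight in every replicate, an estimator that would fail on a resample missing some cases can still be computed, which is one reason such weights are attractive in small samples.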
Extracting Agronomic Information From SMOS Vegetation Optical Depth In The US Corn Belt Using A Nonlinear Hierarchical Model, Colin Lewis-Beck, Victoria A. Walker, Jarad Niemi, Petrutza Caragea, Brian K. Hornbuckle
Remote sensing observations that vary in response to plant growth and senescence can be used to monitor crop development within and across growing seasons. Identifying when crops reach specific growth stages can improve harvest yield prediction and quantify climate change. Using the Level 2 vegetation optical depth (VOD) product from the European Space Agency’s Soil Moisture and Ocean Salinity (SMOS) satellite, we retrospectively estimate the timing of a key crop development stage in the United States Corn Belt. We employ nonlinear curves nested within a hierarchical modeling framework to extract the timing of the third reproductive development stage of ...
Knot Selection In Sparse Gaussian Processes With A Variational Objective Function, 2020 Iowa State University
Knot Selection In Sparse Gaussian Processes With A Variational Objective Function, Nathaniel Garton, Jarad Niemi, Alicia Carriquiry
Sparse, knot‐based Gaussian processes have enjoyed considerable success as scalable approximations of full Gaussian processes. Certain sparse models can be derived through specific variational approximations to the true posterior, and knots can be selected to minimize the Kullback‐Leibler divergence between the approximate and true posterior. While this has been a successful approach, simultaneous optimization of knots can be slow due to the number of parameters being optimized. Furthermore, there have been few proposed methods for selecting the number of knots, and no experimental results exist in the literature. We propose using a one‐at‐a‐time knot selection ...
Employing Very High Frequency (VHF) Radio Telemetry To Recreate Monarch Butterfly Flight Paths, 2020 Iowa State University
Employing Very High Frequency (VHF) Radio Telemetry To Recreate Monarch Butterfly Flight Paths, Kelsey E. Fisher, James S. Adelman, Steven P. Bradbury
Natural Resource Ecology and Management Publications
The overwintering population of eastern North American monarch butterflies (Danaus plexippus) has declined significantly. Loss of milkweed (Asclepias sp.), the monarch’s obligate host plant in the Midwest United States, is considered to be a major cause of the decline. Restoring breeding habitat is an actionable step towards population recovery. Monarch butterflies are highly vagile; therefore, the spatial arrangement of milkweed in the landscape influences movement patterns, habitat utilization, and reproductive output. Empirical studies of female movement patterns within and between habitat patches in representative agricultural landscapes support recommendations for habitat restoration. To track monarch movement at distances beyond human ...
K-Means Stock Clustering Analysis Based On Historical Price Movements And Financial Ratios, 2020 Claremont McKenna College
K-Means Stock Clustering Analysis Based On Historical Price Movements And Financial Ratios, Shu Bin
CMC Senior Theses
The 2015 article Creating Diversified Portfolios Using Cluster Analysis proposes an algorithm that uses the Sharpe ratio and results from K-means clustering conducted on companies' historical financial ratios to generate stock market portfolios. This project seeks to evaluate the performance of the portfolio-building algorithm during the beginning period of the COVID-19 recession. S&P 500 companies' historical stock price movement and their historical return on assets and asset turnover ratios are used as dissimilarity metrics for K-means clustering. After clustering, the stock with the highest Sharpe ratio from each cluster is picked to become a part of the portfolio. The economic ...
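The selection step described above (one stock per cluster, chosen by Sharpe ratio) can be sketched as follows. This is an illustrative sketch, not the thesis's code: the cluster labels are assumed to come from a K-means step that is not shown, the tickers and returns are made up, and the risk-free rate is assumed to be zero.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def pick_portfolio(returns_by_ticker, cluster_by_ticker):
    """Keep the ticker with the highest Sharpe ratio in each cluster."""
    best = {}  # cluster label -> (ticker, Sharpe ratio)
    for ticker, returns in returns_by_ticker.items():
        cluster = cluster_by_ticker[ticker]
        score = sharpe_ratio(returns)
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (ticker, score)
    return sorted(ticker for ticker, _ in best.values())


# Hypothetical periodic returns and precomputed (assumed) cluster labels.
returns_by_ticker = {
    "AAA": [0.02, 0.01, 0.03, 0.02],
    "BBB": [0.05, -0.04, 0.06, -0.03],
    "CCC": [0.01, 0.01, 0.02, 0.00],
    "DDD": [0.03, 0.02, 0.04, 0.03],
}
cluster_by_ticker = {"AAA": 0, "BBB": 0, "CCC": 1, "DDD": 1}

portfolio = pick_portfolio(returns_by_ticker, cluster_by_ticker)
print(portfolio)
```

In this toy data the volatile ticker in cluster 0 loses to the steadier one, which illustrates how ranking by Sharpe ratio rewards return per unit of risk.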
Enhancing Models And Measurements Of Traffic-Related Air Pollutants For Health Studies Using Dispersion Modeling And Bayesian Data Fusion, 2020 University of Michigan - Ann Arbor
Enhancing Models And Measurements Of Traffic-Related Air Pollutants For Health Studies Using Dispersion Modeling And Bayesian Data Fusion, Stuart A. Batterman, Veronica J. Berrocal, Chad Milando, Owais Gilani, Saravanan Arunachalam, K. Max Zhang
Faculty Journal Articles
Research Report 202 describes a study led by Dr. Stuart Batterman at the University of Michigan, Ann Arbor and colleagues. The investigators evaluated the ability to predict traffic-related air pollution using a variety of methods and models, including a line source air pollution dispersion model and sophisticated spatiotemporal Bayesian data fusion methods. Exposure assessment for traffic-related air pollution is challenging because the pollutants are a complex mixture and vary greatly over space and time. Because extensive direct monitoring is difficult and expensive, a number of modeling approaches have been developed, but each model has its own limitations and errors.
Predicting Crop Yields And Soil‐Plant Nitrogen Dynamics In The US Corn Belt, 2020 Iowa State University
Predicting Crop Yields And Soil‐Plant Nitrogen Dynamics In The US Corn Belt, Sotirios V. Archontoulis, Michael J. Castellano, Mark A. Licht, Virginia Nichols, Mitch Baum, Isaiah Huber, Rafael Martinez-Feria, Laila Puntel, Raziel A. Ordonez, Javed Iqbal, Emily E. Wright, Ranae N. Dietzel, Matthew Helmers, Andy Vanloocke, Matt Liebman, Jerry L. Hatfield, Daryl Herzmann, S. Carolina Córdova, Patrick Edmonds, Kaitlin Togliatti, Ashlyn Kessler, Gerasimos Danalatos, Heather Pasley, Carl Pederson, Kendall R. Lamkey
We used the Agricultural Production Systems sIMulator (APSIM) to predict and explain maize and soybean yields, phenology, and soil water and nitrogen (N) dynamics during the growing season in Iowa, USA. Historical, current and forecasted weather data were used to drive simulations, which were released in public four weeks after planting. In this paper, we (1) describe the methodology used to perform forecasts; (2) evaluate model prediction accuracy against data collected from 10 locations over four years; and (3) identify inputs that are key in forecasting yields and soil N dynamics. We found that the predicted median yield at planting ...
Estimating Arthropod Survival Probability From Field Counts: A Case Study With Monarch Butterflies, 2020 Iowa State University
Estimating Arthropod Survival Probability From Field Counts: A Case Study With Monarch Butterflies, Tyler J. Grant, D. T. Tyler Flockhart, Teresa R. Blader, Richard L. Hellmich, Grace M. Pitman, Sam Tyner, D. Ryan Norris, Steven P. Bradbury
Survival probability is fundamental for understanding population dynamics. Methods for estimating survival probability from field data typically require marking individuals, but marking methods are not possible for arthropod species that molt their exoskeleton between life stages. We developed a novel Bayesian state‐space model to estimate arthropod larval survival probability from stage‐structured count data. We performed simulation studies to evaluate estimation bias due to detection probability, individual variation in stage duration, and study design (sampling frequency and sample size). Estimation of cumulative survival probability from oviposition to pupation was robust to potential sources of bias. Our simulations also provide ...
Can Climatic Variables Improve Phenological Predictions For Butterfly Species?, 2020 Iowa State University
Can Climatic Variables Improve Phenological Predictions For Butterfly Species?, Bret J. Lang, Mark P. Widrlechner, Philip M. Dixon, Janette Thompson
Changes in butterfly phenology due to climate changes have led to the need for models based on factors other than calendar date to predict butterfly development, allowing those monitoring their populations to increase the effectiveness of field surveys. In this study, we developed two simple climatic models, one using yearly accumulated growing degree days (GDD) and the other using yearly accumulated shortwave radiation flux densities (SRAD), to determine if these variables can predict first emergence of three butterfly species with less error than an approach based on the average ordinal date of first observation at a site. Furthermore, we investigated ...
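As a rough illustration of how accumulated growing degree days can drive an emergence prediction, the sketch below uses the simple daily-average form of GDD and reports the first day on which the running total reaches a threshold. The base temperature of 10 °C, the threshold, and the temperature series are assumptions for the example; the abstract does not state the values or the exact GDD formula the authors used.

```python
def daily_gdd(t_min, t_max, t_base=10.0):
    """Simple-average growing degree days for one day, floored at zero."""
    return max(0.0, (t_min + t_max) / 2.0 - t_base)


def first_day_reaching(daily_temps, threshold, t_base=10.0):
    """Return the 1-based day on which accumulated GDD first meets the threshold."""
    total = 0.0
    for day, (t_min, t_max) in enumerate(daily_temps, start=1):
        total += daily_gdd(t_min, t_max, t_base)
        if total >= threshold:
            return day
    return None  # threshold never reached within the series


# Hypothetical (min, max) daily temperatures in degrees Celsius.
temps = [(2, 8), (6, 16), (8, 20), (10, 22), (12, 26)]
print(first_day_reaching(temps, threshold=10.0))
```

A calendar-date model would predict the same day every year, whereas this accumulation shifts the predicted day earlier in warm years and later in cool ones.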
Modelling Interactions Among Offenders: A Latent Space Approach For Interdependent Ego-Networks, 2020 University College Dublin
Modelling Interactions Among Offenders: A Latent Space Approach For Interdependent Ego-Networks, Isabella Gollini, Alberto Caimo, Paolo Campana
Illegal markets are notoriously difficult to study. Police data offer an increasingly exploited source of evidence. However, their secondary nature poses challenges for researchers. A key issue is that researchers often have to deal with two sets of actors: targeted and non-targeted. This work develops a latent space model for interdependent ego-networks purposely created to deal with the targeted nature of police evidence. By treating targeted offenders as egos and their contacts as alters, the model (a) leverages on the full information available and (b) mirrors the specificity of the data collection strategy. The paper then applies this approach to ...
Projecting Regions Of North Atlantic Right Whale, Eubalaena Glacialis, Habitat Suitability In The Gulf Of Maine In 2050, Camille Ross
North Atlantic right whales (Eubalaena glacialis) are endangered. Understanding the role environmental conditions play in habitat suitability is key to determining the regions in need of protection for conservation of the species, particularly as climate change shifts suitable habitat. This thesis uses three species distribution modeling algorithms, together with historical data on whale abundance (1993 to 2009) and environmental covariates to build monthly ensemble models of past E. glacialis habitat suitability in the Gulf of Maine. Then, the models are projected onto the year 2050 for a range of climate scenarios. Specifically, the distribution of the species was modeled using ...
Modeling Perennial Groundcover Effects On Annual Maize Grain Crop Growth With The Agricultural Production Systems Simulator, C. A. Bartel, Sotirios V. Archontoulis, Andrew W. Lenssen, Kenneth J. Moore, Isaiah L. Huber, D. A. Laird, Shuizhang Fei, Philip M. Dixon
The inclusion of perennial groundcover (PGC) in maize production offers a tenable solution to natural resources-related concerns associated with conventional maize; however, insight into system management and key information gaps is needed to guide future research. We therefore extended the Agricultural Production Systems sIMulator (APSIM) to an annual and perennial intercrop by integrating annual and perennial APSIM modules. These were parameterized for Kentucky bluegrass (KB) (Poa pratensis L.) or creeping red fescue (CF) (Festuca rubra L.) as PGC using a three-year dataset. Our objectives for this intercropping modeling study were to: i) simultaneously model a PGC and annual cash crop ...
Identifying Customer Churn In After-Market Operations Using Machine Learning Algorithms, 2019 Southern Methodist University
Identifying Customer Churn In After-Market Operations Using Machine Learning Algorithms, Vitaly Briker, Richard Farrow, William Trevino, Brent Allen
SMU Data Science Review
This paper presents a comparative study on machine learning methods as they are applied to product associations, future purchase predictions, and predictions of customer churn in aftermarket operations. Association rules are used to help identify patterns across products and find correlations in customer purchase behaviour. Studying customer behaviour as it pertains to Recency, Frequency, and Monetary Value (RFM) helps inform customer segmentation and identifies customers with propensity to churn. Lastly, Flowserve’s customer purchase history enables the establishment of churn thresholds for each customer group and assists in constructing a model to predict future churners. The aim of this model ...
Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, 2019 Southern Methodist University
Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, Jacquelyn Cheun PhD, Luay Dajani, Quentin B. Thomas
SMU Data Science Review
In the age of hyper-connectivity, 24/7 news cycles, and instant news alerts via social media, mental health researchers don't have a way to automatically detect news content which is associated with triggering anxiety or depression in mental health patients. Using the Associated Press news wire, a semantic network was built with 1,056 news articles containing over 500,000 connections across multiple topics to provide a personalized algorithm which detects problematic news content for a given reader. We make use of Semantic Network Analysis to surface the relationship between news article text and anxiety in readers who struggle ...
Ordinal Hyperplane Loss, 2019 Kennesaw State University
Ordinal Hyperplane Loss, Bob Vanderheyden
Analytics and Data Science Dissertations
This research presents the development of a new framework for analyzing ordered class data, commonly called “ordinal class” data. The focus of the work is the development of classifiers (predictive models) that predict classes from available data. Ratings scales, medical classification scales, socio-economic scales, meaningful groupings of continuous data, facial emotional intensity and facial age estimation are examples of ordinal data for which data scientists may be asked to develop predictive classifiers. It is possible to treat ordinal classification like any other classification problem that has more than two classes. Specifying a model with this strategy does not fully utilize ...
Estimation Of The Representative Elementary Volume Of A Fractured Till: A Field And Groundwater Modeling Approach, Nathan L. Young, William W. Simpkins, Jacqueline E. Reber, Martin F. Helmke
Geological and Atmospheric Sciences Publications
Fractured till is often represented as an equivalent porous medium (EPM) in groundwater models. Knowledge of the representative elementary volume (REV) is necessary for proper application of an EPM model. While REV estimation and hydraulic conductivity tensor determinations are common in fractured rock studies, they are rarely applied to materials with a permeable matrix, such as fractured till. This study uses field fracture measurements, model simulations, and the FracKFinder toolbox to estimate the REV and determine hydraulic conductivity tensors for the fractured, late Wisconsinan till of the Dows Formation in central Iowa (USA), at depths of 1.0–1.5 ...
A Transactive Energy Approach To Distribution System Design: Household Formulation, 2019 Iowa State University
A Transactive Energy Approach To Distribution System Design: Household Formulation, Swathi Battula, Leigh Tesfatsion, Zhaoyu Wang
Economics Working Papers
A household model is formulated to facilitate careful development and performance testing of bid-based transactive energy system (TES) designs with voluntary customer participation. The optimal general bid-function form for households with thermostatically controlled loads is derived from dynamic programming principles, based solely on general household thermal dynamic and welfare attributes. Quantitative forms are determined for these optimal bid functions, given quantitative forms for these attributes. These quantitative attributes are used to construct representative household types based on clusterings of correlated parameter values. Bid comparison, peak-load reduction, and load-matching test cases conducted for a 123-bus distribution system operating under a generic ...
Evaluation Of Modern Missing Data Handling Methods For Coefficient Alpha, 2019 University of Nebraska - Lincoln
Evaluation Of Modern Missing Data Handling Methods For Coefficient Alpha, Katerina Matysova
Public Access Theses, Dissertations, and Student Research from the College of Education and Human Sciences
When assessing a certain characteristic or trait using a multiple item measure, quality of that measure can be assessed by examining the reliability. To avoid multiple time points, reliability can be represented by internal consistency, which is most commonly calculated using Cronbach’s coefficient alpha. Almost every time human participants are involved in research, there is missing data involved. Missing data means that even though complete data were expected to be collected, some data are missing. Missing data can follow different patterns as well as be the result of different mechanisms. One traditional way to deal with missing data is ...
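For reference, coefficient alpha on complete data can be computed from the item variances and the variance of the total score, alpha = k/(k-1) * (1 - sum of item variances / variance of totals). The sketch below assumes a fully observed respondents-by-items table with made-up scores; it does not implement any of the missing data methods the thesis evaluates.

```python
import statistics


def cronbach_alpha(rows):
    """Coefficient alpha for a complete respondents-by-items score table."""
    k = len(rows[0])  # number of items
    item_vars = [statistics.variance(col) for col in zip(*rows)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: three respondents, three perfectly consistent items.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(scores)
print(alpha)
```

Any approach to missing responses changes the variances that enter this formula, which is why the choice of method can move the reliability estimate.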
Linking Bedrock Discontinuities To Glacial Quarrying, 2019 University of Wisconsin-Madison
Linking Bedrock Discontinuities To Glacial Quarrying, J. B. Woodard, L. K. Zoet, Neal R. Iverson, C. Helanow
Geological and Atmospheric Sciences Publications
Quarrying and abrasion are the two principal processes responsible for glacial erosion of bedrock. The morphologies of glacier hard beds depend on the relative effectiveness of these two processes, as abrasion tends to smooth bedrock surfaces and quarrying tends to roughen them. Here we analyze concentrations of bedrock discontinuities in the Tsanfleuron forefield, Switzerland, to help determine the geologic conditions that favor glacial quarrying over abrasion. Aerial discontinuity concentrations are measured from scaled drone-based photos where fractures and bedding planes in the bedrock are manually mapped. A Tukey honest significant difference test indicates that aerial concentration of bed-normal bedrock discontinuities ...
A Note On Propensity Score Weighting Method Using Paradata In Survey Sampling, 2019 Dartmouth College
A Note On Propensity Score Weighting Method Using Paradata In Survey Sampling, Seho Park, Jae Kwang Kim, Kimin Kim
Paradata is often collected during the survey process to monitor the quality of the survey response. One such paradata is a respondent behavior, which can be used to construct response models. The propensity score weight using the respondent behavior information can be applied to the final analysis to reduce the nonresponse bias. However, including the surrogate variable in the propensity score weighting does not always guarantee the efficiency gain. We show that the surrogate variable is useful only when it is correlated with the study variable. Results from a limited simulation study confirm the finding. A real data application using ...
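A minimal sketch of propensity score weighting for nonresponse: each respondent is weighted by the inverse of an estimated response propensity, and the weights are normalized to give a weighted mean. The propensities here are assumed to be already estimated (for example, from a response model that uses paradata), and the numbers are invented for illustration; this is not the estimator studied in the paper.

```python
def ipw_mean(values, propensities):
    """Normalized inverse-propensity-weighted mean over respondents."""
    weights = [1.0 / p for p in propensities]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical respondent values and assumed estimated response propensities.
y_respondents = [10.0, 20.0]
estimated_propensities = [0.5, 0.25]

estimate = ipw_mean(y_respondents, estimated_propensities)
print(estimate)
```

The respondent who was less likely to answer receives the larger weight, so the estimate moves toward that respondent's value relative to the unweighted mean of 15.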