Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

2523 Full-Text Articles 3071 Authors 453226 Downloads 81 Institutions

All Articles in Applied Statistics

Penalized Nonparametric Scalar-On-Function Regression Via Principal Coordinates, Philip T. Reiss, David L. Miller, Pei-Shien Wu, Wen-Yu Hua 2016 New York University School of Medicine

Philip T. Reiss

A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. The core idea is to regress the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, the proposed ...


A Multi-Indexed Logistic Model For Time Series, Xiang Liu 2016 East Tennessee State University

Electronic Theses and Dissertations

In this thesis, we explore a multi-indexed logistic regression (MILR) model, with particular emphasis given to its application to time series. MILR includes simple logistic regression (SLR) as a special case, and the hope is that it will in some instances also produce significantly better results. To motivate the development of MILR, we consider its application to the analysis of both simulated sine wave data and stock data. We look at the well-studied SLR and its application in the analysis of time series data. Using a more sophisticated representation of sequential data, we then detail the implementation of MILR. We compare ...


Human Exposure Modeling Using SHEDS, Luther Smith, William Graham Glen 2016 Alion Science & Technology Inc

Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.


Advances In Portmanteau Diagnostic Tests, Jinkun Xiao 2016 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Portmanteau tests serve an important role in model diagnostics for Box-Jenkins modelling procedures. A large number of portmanteau tests based on the autocorrelation function have been proposed as general-purpose goodness-of-fit tests. Since the asymptotic distribution of the statistics has a complicated form, which makes it hard to obtain the p-value directly, the gamma approximation is introduced to obtain the p-value. But the approximation inevitably introduces approximation errors and needs a large number of observations to yield a good approximation. To avoid some pitfalls in the approximation, the Lin-McLeod test is further proposed to obtain a numeric solution to ...
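
As a rough illustration of a portmanteau statistic built from the autocorrelation function, the sketch below computes the classical Ljung-Box statistic, Q = n(n+2) Σ r_k² / (n−k), summed over lags k = 1, ..., m. The choice of Ljung-Box is an assumption made for illustration only; the abstract does not name it, and the gamma approximation and the Lin-McLeod test discussed in the thesis are not reproduced here. The function names are hypothetical.

```python
from typing import List, Sequence


def sample_autocorrelations(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_1, ..., r_max_lag of a series (e.g. model residuals)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)  # n times the lag-0 sample autocovariance
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_statistic(x: Sequence[float], max_lag: int) -> float:
    """Ljung-Box Q: n(n+2) times the sum of r_k^2 / (n - k) over lags 1..max_lag."""
    n = len(x)
    r = sample_autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


# Hypothetical residual series that alternates in sign, so lag-1 correlation is strong.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q_stat = ljung_box_statistic(residuals, max_lag=1)
```

Under the null hypothesis of an adequate ARMA fit, Q is usually compared with a chi-square distribution whose degrees of freedom equal the number of lags minus the number of fitted ARMA parameters.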


Projects In Geospatial Data Analysis: Spring 2016, Robert Crimi, Elanor Hoak, Ishita Srivastava, Jesse Wisniewski, Jonathan Blackstock, Keerthi Chikalbettu Pai, Melissa Bica, Michelle Bray, Mikhail Chowdhury, Monal Narasimhamurthy, Nika Shafranov, Sachin Muralidhara, Satchel Spencer, Saurabh Sood, Caleb Phillips 2016 University of Colorado, Boulder

Computer Science Technical Reports

This document contains semester projects for students in CSCI 4380/7000 Geospatial Data Analysis (GSA). The course explores the technical aspects of programmatic geospatial data analysis with a focus on GIS concepts, custom GIS programming, analytical and statistical methods, and open source tools and frameworks.


Design Optimization Of A Stochastic Multi-Objective Problem: Gaussian Process Regressions For Objective Surrogates, Juan Sebastian Martinez, Piyush Pandita, Rohit K. Tripathy, Ilias Bilionis 2016 Universidad de Los Andes - Colombia

The Summer Undergraduate Research Fellowship (SURF) Symposium

Multi-objective optimization (MOO) problems arise frequently in science and engineering. In an optimization problem, we want to find the set of input parameters that generate the set of optimal outputs, mathematically known as the Pareto frontier (PF). Solving the MOO problem is a challenge when expensive experiments can be performed only a limited number of times and there is therefore a limited set of data to work with, as with a roll-to-roll microwave plasma chemical vapor deposition (MPCVD) reactor for manufacturing high-quality graphene. State-of-the-art techniques, e.g. evolutionary algorithms and particle swarm optimization, require a large number of observations and ...
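
To make the notion of a Pareto frontier concrete, here is a minimal sketch that filters a finite set of objective vectors down to its non-dominated members. It assumes every objective is to be minimized and that the candidate outputs are already known; it does not attempt the Gaussian process surrogate approach described in the abstract. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective outputs, e.g. (cost, defect rate).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
front = pareto_front(candidates)
```

This brute-force filter compares every pair of points, which is adequate for the small data sets that expensive experiments produce.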


Utilizing Computed Tomography Image Features To Advance Prediction Of Radiation Pneumonitis, Shane P. Krafft 2016 The University of Texas Graduate School of Biomedical Sciences at Houston

UT GSBS Dissertations and Theses (Open Access)

Improving outcomes for non-small-cell lung cancer patients treated with radiation therapy (RT) requires optimizing the balance between local tumor control and risk of normal tissue toxicity. In approximately 20% of patients, severe acute symptomatic lung toxicity, termed radiation pneumonitis (RP), still occurs. Identifying the individuals at risk of RP prior to or early during treatment offers tremendous potential to improve RT by providing the physician with information to assist in making clinical decisions that enhance therapy. Our central goal for this work was to demonstrate the potential gain in predictive accuracy of normal tissue complication probability models for RP by ...


Newsvendor Models With Monte Carlo Sampling, Ijeoma W. Ekwegh 2016 East Tennessee State University

Electronic Theses and Dissertations

The newsvendor model is used in solving inventory problems in which demand is random. In this thesis, we will focus on a method of using Monte Carlo sampling to estimate the order quantity that either maximizes revenue or minimizes cost given that demand is uncertain. Given data, the Monte Carlo approach will be used in sampling data over scenarios and also estimating the probability density function. A bootstrapping process yields an empirical distribution for the order quantity that will maximize the expected profit. Finally, this method will be used ...
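
The following sketch illustrates the general idea of estimating a profit-maximizing order quantity from demand data and bootstrapping it to obtain an empirical distribution. It is not the thesis's implementation: the profit function (unit price times units sold, minus unit cost times units ordered, with no salvage value), the function names, and the demand figures are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def expected_profit(q: float, demands: Sequence[float], price: float, cost: float) -> float:
    """Average profit of ordering q units over the sampled demand scenarios."""
    return sum(price * min(q, d) - cost * q for d in demands) / len(demands)


def best_order_quantity(demands: Sequence[float], price: float, cost: float) -> float:
    """Order quantity with the highest average profit.

    Average profit is piecewise linear and concave in q, so it is enough
    to search over the observed demand values.
    """
    return max(sorted(set(demands)), key=lambda q: expected_profit(q, demands, price, cost))


def bootstrap_order_quantities(
    demands: Sequence[float], price: float, cost: float, n_boot: int = 200, seed: int = 0
) -> List[float]:
    """Resample the demand data with replacement and re-optimize each time."""
    rng = random.Random(seed)
    n = len(demands)
    return [
        best_order_quantity([rng.choice(demands) for _ in range(n)], price, cost)
        for _ in range(n_boot)
    ]


# Hypothetical demand observations and prices (assumed values, not from the thesis).
demand_data = [8, 10, 12, 15, 20]
q_hat = best_order_quantity(demand_data, price=10.0, cost=3.0)
q_boot = bootstrap_order_quantities(demand_data, price=10.0, cost=3.0)
```

The spread of `q_boot` gives a rough sense of how sensitive the estimated order quantity is to sampling variation in the demand data.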


Multilevel Models For Longitudinal Data, Aastha Khatiwada 2016 East Tennessee State University

Electronic Theses and Dissertations

Longitudinal data arise when individuals are measured several times during an observation period, and thus the data for each individual are not independent. There are several ways of analyzing longitudinal data when different treatments are compared. Multilevel models are used to analyze data that are clustered in some way. In this work, multilevel models are used to analyze longitudinal data from a case study. Results from other, more commonly used methods are compared to those from multilevel models. Output from two software packages, SAS and R, is also compared. Finally, a method consisting of fitting individual models for each ...


Spatio-Temporal Analysis Of Point Patterns, Abdul-Nasah Soale 2016 East Tennessee State University

Electronic Theses and Dissertations

In this thesis, the basic tools of spatial statistics and time series analysis are applied to the case study of the earthquakes in a certain geographical region and time frame. Then some of the existing methods for joint analysis of time and space are described and applied. Finally, additional research questions about the spatial-temporal distribution of the earthquakes are posed and explored using statistical plots and models. The focus in the last section is on the relationship between number of events per year and maximum magnitude and its effect on how clustered the spatial distribution is and the relationship between ...


The Influence Of The Electric Supply Industry On Economic Growth In Less Developed Countries, Edward Richard Bee 2016 University of Southern Mississippi

Dissertations

This study measures the impact that electrical outages have on manufacturing production in 135 less developed countries using stochastic frontier analysis and data from the World Bank’s Investment Climate surveys. Outages of electricity, for firms with and without backup power sources, are the most frequently cited constraint on manufacturing growth in these surveys.

Outages are shown to reduce output below the production frontier by almost five percent in Africa and by a lower percentage in South Asia, Southeast Asia and the Middle East and North Africa. Production response to outages is quadratic in form. Outages also increase labor cost, reduce ...


Advanced Sequential Monte Carlo Methods And Their Applications To Sparse Sensor Network For Detection And Estimation, Kai Kang 2016 University of Tennessee, Knoxville

Doctoral Dissertations

General state space models provide a flexible framework for modeling dynamic systems and therefore have many applications in disciplines such as engineering, economics, and biology. However, optimal estimation problems for non-linear, non-Gaussian state space models are analytically intractable in general. Sequential Monte Carlo (SMC) methods have become a very popular class of simulation-based methods for solving optimal estimation problems. The advantage of SMC methods over classical filtering methods such as the Kalman filter and the extended Kalman filter is that they are able to handle non-linear, non-Gaussian scenarios without relying on any local linearization techniques. In this ...


Variable Selection Via Penalized Regression And The Genetic Algorithm Using Information Complexity, With Applications For High-Dimensional -Omics Data, Tyler J. Massaro 2016 University of Tennessee, Knoxville

Doctoral Dissertations

This dissertation is a collection of examples, algorithms, and techniques for researchers interested in selecting influential variables from statistical regression models. Chapters 1, 2, and 3 provide background information that will be used throughout the remaining chapters, on topics including but not limited to information complexity, model selection, covariance estimation, stepwise variable selection, penalized regression, and especially the genetic algorithm (GA) approach to variable subsetting.

In chapter 4, we fully develop the framework for performing GA subset selection in logistic regression models. We present advantages of this approach over stepwise and elastic net regularized regression in selecting variables from a ...


Teaching The Quandary Of Statistical Jurisprudence: A Review-Essay On Math On Trial By Schneps And Colmez, Noah Giansiracusa 2016 University of Georgia

Journal of Humanistic Mathematics

This review-essay on the mother-and-daughter collaboration Math on Trial stems from my recent experience using this book as the basis for a college freshman seminar on the interactions between math and law. I discuss the strengths and weaknesses of this book as an accessible introduction to this enigmatic yet deeply important topic. For those considering teaching from this text (a highly recommended endeavor) I offer some curricular suggestions.


Joint Analysis Of Zero-Heavy Longitudinal Outcomes: Models And Comparison Of Study Designs, Erin R. Lundy 2016 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Understanding the patterns and mechanisms of the process of desistance from criminal activity is imperative for the development of effective sanctions and legal policy. Methodological challenges in the analysis of longitudinal criminal behaviour data include the need to develop methods for multivariate longitudinal discrete data, incorporating modulating exposure variables and several possible sources of zero-inflation. We develop new tools for zero-heavy joint outcome analysis which address these challenges and provide novel insights on processes related to offending patterns. Comparisons with existing approaches demonstrate the benefits of utilizing modeling frameworks which incorporate distinct sources of zeros. An additional concern in this ...


Using A Data Quality Framework To Clean Data Extracted From The Electronic Health Record: A Case Study, Oliwier Dziadkowiec, Tiffany Callahan, Mustafa Ozkaynak, Blaine Reeder, John Welton 2016 University of Colorado, College of Nursing, Anschutz Medical Campus

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Objectives: Examine (1) the appropriateness of using a data quality (DQ) framework developed for relational databases as a data-cleaning tool for a dataset extracted from two EPIC databases; and (2) the differences in statistical parameter estimates between a dataset cleaned with the DQ framework and a dataset not cleaned with the DQ framework.

Background: The use of data contained within electronic health records (EHRs) has the potential to open doors for a new wave of innovative research. Without adequate preparation of such large datasets for analysis, the results might be erroneous, which might affect clinical decision making or results of Comparative ...


Well I'll Be Damned - Insights Into Predictive Value Of Pedigree Information In Horse Racing, Timothy Baker Mr, Ming-Chien Sung, Johnnie Johnson Professor, Tiejun Ma 2016 University of Southampton

International Conference on Gambling and Risk Taking

Fundamental form characteristics, such as how fast a horse ran at its last start, are widely used to help predict the outcome of horse racing events. The exception is races in which the horses have not previously competed, such as Maiden races, where there is little or no publicly available past performance information. In these types of events, bettors need only consider a simplified suite of factors; however, this is offset by a higher level of uncertainty. This paper examines the inherent information content embedded within a horse’s ancestry and the extent to which this information is discounted in the United ...


Classification Trees And Rule-Based Modeling Using The C5.0 Algorithm For Self-Image Across Sex And Race In St. Louis, Rohan Shirali 2016 Washington University in St. Louis

Arts & Sciences Electronic Theses and Dissertations

The study population comprised children, adolescents, and adults who were residents of the city of St. Louis at the time of data collection in 2015. The data collected include sex, age, race, measured height and weight, self-reported height and weight, zip code, educational background, exercise and diet habits, and participants' descriptions of and strategies for their weight (e.g. overweight and trying to lose weight, respectively). I use the C5.0 algorithm to create classification trees and rule-based models to analyze this population. Specifically, I model a binary self-image variable as a function of sex, age, race, zip code, and a ratio ...


Failure Of Surface Color Cues Under Natural Changes In Lighting, David H. Foster, Iván Marín-Franch 2016 University of Manchester

MODVIS Workshop

Color allows us to effortlessly discriminate and identify surfaces and objects by their reflected light. Although the reflected spectrum changes with the illumination spectrum, cone photoreceptor signals can be transformed to give useful cues for surface color. But what happens when both the spectrum and the geometry of the illumination change, as with lighting from the sun and sky? Is it possible, as a matter of principle, to obtain reliable cues by processing cone signals alone? This question was addressed here by estimating the information provided by cone signals from time-lapse hyperspectral radiance images of five outdoor scenes under natural ...


Automated Sea State Classification From Parameterization Of Survey Observations And Wave-Generated Displacement Data, Jason A. Teichman 2016 University of New Orleans, New Orleans

University of New Orleans Theses and Dissertations

Sea state is a subjective quantity whose accuracy depends on an observer’s ability to translate local wind waves into numerical scales. It provides an analytical tool for estimating the impact of the sea on data quality and operational safety. Tasks dependent on the characteristics of local sea surface conditions often require accurate and immediate assessment. An attempt to automate sea state classification using eleven years of ship motion and sea state observation data is made using parametric modeling of distribution-based confidence and tolerance intervals and a probabilistic model using sea state frequencies. Models utilizing distribution intervals are not able ...


Digital Commons powered by bepress