Articles 1 - 30 of 40
Full-Text Articles in Statistical Models
Seasonal Resource Selection And Habitat Treatment Use By A Fringe Population Of Greater Sage-Grouse, Rhett Boswell
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
Movement and habitat selection by Greater Sage-grouse (Centrocercus urophasianus) are of great interest to wildlife managers tasked with applying conservation measures for this iconic western species. Current technology has produced small, lightweight GPS (Global Positioning System) transmitters that can be attached to sage-grouse. Using GIS software and statistical programs such as Program R, land managers can analyze GPS location data to assess how sage-grouse are geospatially interacting with their habitats. Within the Panguitch Sage-Grouse Management Area (SGMA) thousands of acres of land have been restored or manipulated to enhance sage-grouse habitat; this usually involves removal of pinyon pine …
Statistical Analysis Of Momentum In Basketball, Mackenzi Stump
Honors Projects
The “hot hand” in sports has been debated for as long as sports have been around. The debate involves whether streaks and slumps in sports are true phenomena or just simply perceptions in the mind of the human viewer. This statistical analysis of momentum in basketball analyzes the distribution of time between scoring events for the BGSU Women’s Basketball team from 2011-2017. We discuss how the distribution of time between scoring events changes with normal game factors such as location of the game, game outcome, and several other factors. If scoring events during a game were always randomly distributed, or …
Making Models With Bayes, Pilar Olid
Electronic Theses, Projects, and Dissertations
Bayesian statistics is an important approach to modern statistical analyses. It allows us to use our prior knowledge of the unknown parameters to construct a model for our data set. The foundation of Bayesian analysis is Bayes' Rule, which in its proportional form indicates that the posterior is proportional to the prior times the likelihood. We will demonstrate how we can apply Bayesian statistical techniques to fit a linear regression model and a hierarchical linear regression model to a data set. We will show how to apply different distributions to Bayesian analyses and how the use of a prior affects …
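The proportional form of Bayes' Rule mentioned in this abstract can be illustrated with a minimal grid-approximation sketch. The data (7 successes in 10 trials), the flat prior, and the function name `grid_posterior` are assumptions made for illustration only; they are not taken from the thesis.

```python
from math import comb

def grid_posterior(successes, trials, prior, grid):
    """Posterior over a grid of candidate probabilities via Bayes' rule."""
    # Binomial likelihood of the observed data at each candidate theta.
    likelihood = [
        comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in grid
    ]
    # Proportional form: posterior is proportional to prior times likelihood.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    # Normalize so the posterior sums to one over the grid.
    return [u / total for u in unnormalized]

# Hypothetical example: 7 successes in 10 trials with a flat prior.
grid = [i / 100 for i in range(101)]
flat_prior = [1.0] * len(grid)
posterior = grid_posterior(7, 10, flat_prior, grid)
posterior_mode = grid[posterior.index(max(posterior))]
```

With a flat prior the posterior mode coincides with the maximum-likelihood estimate; replacing `flat_prior` with an informative prior shifts the posterior toward the prior, which is one way to see how the choice of prior affects the result.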
Novel Statistical Models For Quantitative Shape-Gene Association Selection, Xiaotian Dai
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
Previous research has reported that genetic mechanisms play a major role in the development of biological shapes. The primary goal of this dissertation is to develop novel statistical models to investigate the quantitative relationships between biological shapes and genetic variants. However, these problems can be extremely challenging for traditional statistical models for a number of reasons: 1) the biological phenotypes cannot be effectively represented by single-valued traits, while traditional regression only handles one dependent variable; 2) in real-life genetic data, the number of candidate genes to be investigated is extremely large, and the signal-to-noise ratio of candidate genes is expected …
Bayesian Model For Detection Of Outliers In Linear Regression With Application To Longitudinal Data, Zahraa Al-Sharea
Graduate Theses and Dissertations
Outlier detection is one of the most important challenges in many present-day applications. Outliers can occur due to uncertainty in data-generating mechanisms or due to an error in data recording or processing. Outliers can drastically change a study's results and make predictions less reliable. Detecting outliers in longitudinal studies is quite challenging because this kind of study works with observations that change over time. Therefore, the same subject can produce an outlier at one point in time but produce regular observations at all other time points. Bayesian hierarchical modeling assigns parameters that can quantify whether each observation is an outlier …
Developing Leading And Lagging Indicators To Enhance Equipment Reliability In A Lean System, Dhanush Agara Mallesh
Masters Theses
With increasing complexity in equipment, failure rates are becoming a critical metric because of unplanned maintenance in production environments. Unplanned maintenance in a manufacturing process creates downtime and decreases the reliability of equipment. Equipment failures have resulted in lost revenue for organizations, encouraging maintenance practitioners to look for ways to convert unplanned maintenance into planned maintenance. Efficient failure prediction models are being developed to learn about failures in advance. With this information, predicting failures can reduce downtime in the system and improve throughput.
The goal of this thesis is to predict …
Statistical Modelling, Optimal Strategies And Decisions In Two-Period Economies, Jiang Wu
Electronic Thesis and Dissertation Repository
Motivated by some real problems, our thesis puts forward two general two-period pricing models and explores optimal buying and selling strategies in two states of the two-period decision, when the buyer's or seller's decisions in the two periods are uncertain: commodity valuations may or may not be independent, may or may not follow the same distribution, may be heavily or just lightly influenced by exogenous economic conditions, and so on. For both the example of buying laptops and the example of selling houses, the connections between each example and the two-envelope paradox encourage us to explore optimal strategies based on the works of McDonnell …
Data-Adaptive Kernel Support Vector Machine, Xin Liu
Electronic Thesis and Dissertation Repository
In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges …
On The Estimation Of Penetrance In The Presence Of Competing Risks With Family Data, Daniel Prawira
Electronic Thesis and Dissertation Repository
In family studies, we are interested in estimating the penetrance function of the event of interest in the presence of competing risks. Failure to account for competing risks may lead to bias in the estimation of the penetrance function. In this thesis, three statistical challenges are addressed: clustering, missing data, and competing risks. We propose a cause-specific model with shared frailty and ascertainment correction to account for clustering and competing risks, along with the ascertainment of families into the study. Multiple imputation is used to account for missing data. The simulation study showed good performance of our proposed model in estimating the …
Models And Policies Of Port Carbon Emission Reduction: A Case Study Of The Port Of Dalian, Jiaqiong Zhao
World Maritime University Dissertations
No abstract provided.
Annuity Product Valuation And Risk Measurement Under Correlated Financial And Longevity Risks, Soohong Park
Electronic Thesis and Dissertation Repository
Longevity risk is a non-diversifiable risk and is regarded as a pressing socio-economic challenge of the century. Its accurate assessment and quantification are therefore critical to enable pension-fund companies to provide sustainable old-age security and maintain a resilient global insurance market. Fluctuations and a decreasing trend in mortality rates, which give rise to longevity risk, as well as the uncertainty in interest-rate dynamics, constitute the two fundamental determinants in pricing and risk management of longevity-dependent products. We also note that historical data reveal some evidence of strong correlation between mortality and interest rates, which must be taken into account when modelling their …
Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek
Electronic Theses and Dissertations
By Gregory Frank Malek, Stephen F. Austin State University, Masters in Statistics Program, Nacogdoches, Texas, U.S.A.
This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone call data, and faculty …
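Of the alternatives listed in this abstract, Theil regression is the simplest to show in a few lines. The sketch below is a generic Theil-Sen style estimator (median of pairwise slopes, with the intercept taken as the median residual, which is one common convention); it is not the thesis's implementation, and the data set is made up to show the effect of a single outlier.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Robust line fit: slope is the median of all pairwise slopes."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with identical x to avoid division by zero
    ]
    slope = median(slopes)
    # Intercept convention assumed here: median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

# Hypothetical data: five points on y = 2x + 1 plus one gross outlier.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 5, 7, 9, 100]
slope, intercept = theil_sen(x, y)
```

Because most pairwise slopes are unaffected by the single outlier, the fitted line stays on the bulk of the data, whereas an ordinary least-squares fit to the same points would be pulled strongly toward the outlier.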
Using Mountain Snowpack To Predict Summer Water Availability In Semiarid Mountain Watersheds, Rebecca Dawn Garst
Boise State University Theses and Dissertations
In the mountainous landscapes of the western United States, water resources are dominated by snowpack. As temperatures rise in spring and summer, the melting snow produces an increase in river flow levels. Reservoirs are used during this increase to retain surplus water, which is released to supplement growing season water supply once the peak flows decrease to below water demands. Once there is no longer surplus natural flow of water, the water accounting changes – referred to as the day of allocation (DOA), and water previously retained within the reservoir is used to supplement the lower flow levels. The amount …
Data Analysis Methods Using Persistence Diagrams, Andrew Marchese
Doctoral Dissertations
In recent years, persistent homology techniques have been used to study data and dynamical systems. Using these techniques, information about the shape and geometry of the data and systems leads to important information regarding the periodicity, bistability, and chaos of the underlying systems. In this thesis, we study all aspects of the application of persistent homology to data analysis. In particular, we introduce a new distance on the space of persistence diagrams, and show that it is useful in detecting changes in geometry and topology, which is essential for the supervised learning problem. Moreover, we introduce a clustering framework directly …
Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li
Electronic Thesis and Dissertation Repository
Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, e.g., to predict the profitability of a product or the age group of a user, etc. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered.
1. Sparse data and convergence behavior. How different properties of a dataset, such as …
Visualizing Lab And Phenotype Associations Using PheWAS And Electronic Health Records, Brenda Emerson, Miriam Goldman, Sahiti Kolli
Honors Projects
As the digitization of patient health records is becoming more common, we are given a great opportunity to analyze these records and hopefully make discoveries about diseases or medicines. Given large datasets of Electronic Health Records, two other students and I decided to look for novel phenotype associations with mean lab values, to see whether the presence of a lab had associations with a phenotype, and to create an interactive application to visualize the associations between labs and phenotypes.
Factor Based Statistical Arbitrage In The U.S. Equity Market With A Model Breakdown Detection Process, Seoungbyung Park
Master's Theses (2009 -)
Many researchers have studied different strategies of statistical arbitrage to provide a steady stream of returns that are unrelated to market conditions. Among these strategies, factor-based mean-reverting strategies have been popular and widely covered. This thesis aims to add value by evaluating the generalized pairs trading strategy and suggesting enhancements to improve out-of-sample performance. The enhanced strategy generated a daily Sharpe ratio of 6.07% in the out-of-sample period from January 2013 through October 2016, with a correlation of -0.03 with the S&P 500. During the same period, the S&P 500 generated a Sharpe ratio of 6.03%. This thesis is …
Gridiron-Gurus Final Report: Fantasy Football Performance Prediction, Kyle Tanemura, Michael Li, Erica Dorn, Ryan Mckinney
Computer Science and Software Engineering
Gridiron Gurus is a desktop application that allows for the creation of custom AI profiles that advise users and compete against them in a fantasy football setting. Our AI profiles can perform statistical prediction of players on both a season-long and a week-to-week basis, giving them the ability to both draft and manage a fantasy football team throughout a season.
Mortgage Transition Model Based On Loanperformance Data, Shuyao Yang
Arts & Sciences Electronic Theses and Dissertations
The unexpected increase in loan defaults on the mortgage market is widely considered to be one of the main causes behind the economic crisis. To provide some insight into loan delinquency and default, I analyze the mortgage performance data from the Fannie Mae website and investigate how economic factors and individual loan and borrower information affect the events of default and prepayment. Various delinquency statuses, including default and prepaid, are treated as discrete states of a Markov chain. One-step transition probabilities are estimated via multinomial logistic models. We find that in general current loan-to-value ratio, credit score, unemployment rate, and interest …
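The idea of one-step transition probabilities that depend on covariates through a multinomial logistic model can be sketched as follows. The state names, covariates, and coefficient values are all hypothetical and chosen only to make the example run; the thesis's actual states, covariates, and fitted coefficients are not given in this abstract.

```python
from math import exp

# Hypothetical state labels for the Markov chain.
STATES = ["current", "delinquent", "default", "prepaid"]

def transition_row(covariates, coefs):
    """One row of the transition matrix: P(next state | origin state, covariates).

    The first destination state is the reference category with score 0;
    `coefs` holds one coefficient vector per remaining destination state.
    """
    scores = [0.0] + [sum(b * x for b, x in zip(beta, covariates)) for beta in coefs]
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores)
    weights = [exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up covariates: intercept, current loan-to-value ratio, unemployment rate.
covariates = [1.0, 0.9, 0.06]
# Made-up coefficients for transitions out of the "current" state.
coefs_from_current = [
    [-3.0, 1.0, 5.0],   # current -> delinquent
    [-6.0, 2.0, 8.0],   # current -> default
    [-2.0, -1.0, 0.0],  # current -> prepaid
]
row = transition_row(covariates, coefs_from_current)
```

Each origin state would get its own set of coefficients, so stacking one such row per origin state gives a covariate-dependent transition matrix whose rows each sum to one.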
Multidataset Independent Subspace Analysis: A Framework For Analysis Of Multimodal, Multi-Subject Brain Imaging Data, Rogers F. Silva
Electrical and Computer Engineering ETDs
Mental illnesses are serious disorders of the brain that have devastating effects on individuals and society. In addition to their disabling and impairing effects, mental illnesses have deep social and economic implications, accounting for an estimated loss of 12 billion working days and a care cost surge to $6 trillion a year by 2030. For diseases such as depression and anxiety, enhancing preventive programs and treatment accessibility, in combination with accurate early diagnosis and personalized treatments, is projected to result in a four-fold return on every dollar invested, a strategy that can drastically help curtail those losses. Notably, within the …
Comparison Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines, including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves that are of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available …
A Comparison Of Statistical Methods Relating Pairwise Distance To A Binary Subject-Level Covariate, Rachael Stone
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
A community ecologist provided a motivating data set involving a certain animal species with two behavior groups, along with a pairwise genetic distance matrix among individuals. Many community ecologists have analyzed similar data sets with a method known as the Hopkins method, testing for an association between the subject-level covariate (behavior group) and the pairwise distance. This community ecologist wanted to know whether their results would be meaningful if they used the Hopkins method. Their question inspired this thesis work, where a different data set was used for confidentiality reasons. Multiple methods (Hopkins method, ADONIS, ANOSIM, and Distance Regression) were used …
Statistical Analysis Of Markovian Queueing Models Of Limit Order Books, Yiyao Luo
Arts & Sciences Electronic Theses and Dissertations
The objective of this thesis is to investigate how well some Markovian queueing models describe the dynamical properties of a limit order book. We review and compare the assumptions proposed by Huang et al. [Quantitative Finance, 12, 547-557 (2012)] and Cont et al. [SIAM Journal for Financial Mathematics, 4, 1-25 (2013)], and estimate the intensity parameters in both ways, based on real data of a stock on the Nasdaq Stock Market. By comparing the cumulative distribution functions of the first-passage time to state 0, we will show that the estimators of Cont’s model fit our data better and we put …
Performance Of Imputation Algorithms On Artificially Produced Missing At Random Data, Tobias O. Oketch
Electronic Theses and Dissertations
Missing data is one of the challenges we face today in building valid statistical models. It reduces the representativeness of the data samples. Hence, population estimates and model parameters estimated from such data are likely to be biased.
However, the missing data problem is an area under active study, and better alternative statistical procedures have been presented to mitigate its shortcomings. In this paper, we review causes of missing data and various methods of handling missing data. Our main focus is evaluating various multiple imputation (MI) methods from the Multivariate Imputation by Chained Equations (MICE) package in the statistical software …
Statistical Methods For Two Problems In Cancer Research: Analysis Of RNA-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li
Dissertations & Theses (Open Access)
My dissertation is focused on quantitative methodology development and application for two important topics in translational and clinical cancer research.
The first topic was motivated by the challenge of applying transcriptome sequencing (RNA-seq) to formalin-fixed, paraffin-embedded (FFPE) tumor samples for reliable diagnostic development. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. To comprehensively evaluate the FFPE RNA-seq data quality for expression profiling, we developed multiple computational methods for assessment, such as the uniformity and continuity …
Modelling Cash Crop Growth In TN, Spencer Weston
Chancellor’s Honors Program Projects
No abstract provided.
A Bayesian Variable Selection Method With Applications To Spatial Data, Xiahan Tang
Graduate Theses and Dissertations
This thesis first describes the general idea behind Bayesian inference, various sampling methods based on Bayes' theorem, and many examples. Then a Bayesian approach to model selection, called Stochastic Search Variable Selection (SSVS), is discussed. It was originally proposed by George and McCulloch (1993). In a normal regression model where the number of covariates is large, only a small subset tends to be significant most of the time. This Bayesian procedure specifies a mixture prior for each of the unknown regression coefficients; the mixture prior was originally proposed by Geweke (1996). This mixture prior will be updated as data becomes …
On Post-Selection Confidence Intervals In Linear Regression, Xinwei Zhang
Arts & Sciences Electronic Theses and Dissertations
The general goal of this thesis is to investigate some issues in post-selection inference, which arises in the setting where statistical inference is carried out after a data-driven model selection step. In this setting, the classical inference theory, which requires a fixed a priori model, becomes invalid since the selected model is the result of a random event. Hence, a common practice in applied research, which ignores the model selection and builds confidence intervals, will result in misleading or even false conclusions. In this thesis, specifically, we first discuss some examples to show how the classical inference theory loses …
Network Exploration Of Correlated Multivariate Protein Data For Alzheimer's Disease Association, Matthew J. Lane
Theses
Alzheimer's Disease (AD) is difficult to diagnose using genetic testing or other traditional methods. Unlike diseases with simple genetic risk components, there exists no single marker determining whether someone will develop AD. Furthermore, AD is highly heterogeneous, and different subgroups of individuals develop the disease due to differing factors. Traditional diagnostic methods using perceivable cognitive deficiencies are often too little too late because the brain has suffered damage from decades of disease progression. In order to observe AD at early stages prior to the observation of cognitive deficiencies, biomarkers with greater accuracy are required. By using …
Further Advances For The Sequential Multiple Assignment Randomized Trial (SMART), Tianjiao Dai
Dissertations & Theses (Open Access)
Tianjiao Dai, M.S.
Advisory Professor: Sanjay Shete, Ph.D.
Sequential multiple assignment randomized trial (SMART) designs have been developed in recent years for studying adaptive interventions. In my Ph.D. study, I mainly investigate how to further improve SMART designs and optimize the interventions for each individual in the trial. My dissertation has focused on two topics of SMART designs.
1) Developing a novel SMART design that can reduce the cost and side effects associated with the interventions and proposing the corresponding analytic methods. I have developed a time-varying SMART design in …