Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 31 - 60 of 81
Full-Text Articles in Physical Sciences and Mathematics
Visualizing Lab And Phenotype Associations Using Phewas And Electronic Health Records, Brenda Emerson, Miriam Goldman, Sahiti Kolli
Honors Projects
As the digitization of patient health records becomes more common, we have a great opportunity to analyze these records and, we hope, make discoveries about diseases or medicines. Given large datasets of Electronic Health Records, I and two other students decided to look for novel phenotype associations with mean lab values, to see whether the presence of a lab had associations with a phenotype, and to create an interactive application to visualize the associations between labs and phenotypes.
Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei
Publications and Research
Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.
Information Metrics For Predictive Modeling And Machine Learning, Kostantinos Gourgoulias
Doctoral Dissertations
The ever-increasing complexity of the models used in predictive modeling and data science and their use for prediction and inference has made the development of tools for uncertainty quantification and model selection especially important. In this work, we seek to understand the various trade-offs associated with the simulation of stochastic systems. Some trade-offs are computational, e.g., execution time of an algorithm versus accuracy of simulation. Others are analytical: whether or not we are able to find tractable substitutes for quantities of interest, e.g., distributions, ergodic averages, etc. The first two chapters of this thesis deal with the study of the …
Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu
Doctoral Dissertations
In this thesis, we propose statistical models for addressing commonly encountered data types and study designs in large epidemiologic investigations aimed at understanding the molecular basis of complex disorders. The motivating applications come from diverse disease areas in Women's Health, including the study of type II diabetes in the Women's Health Initiative (WHI), invasive breast cancer in the Nurses' Health Study and the study of the metabolomic underpinnings of cardiovascular disease in the WHI. We have also put significant effort into making the implementation of the proposed methods accessible through freely available, user-friendly software packages in R. The first chapter …
Factor Based Statistical Arbitrage In The U.S. Equity Market With A Model Breakdown Detection Process, Seoungbyung Park
Master's Theses (2009 -)
Many researchers have studied different strategies of statistical arbitrage to provide a steady stream of returns that are unrelated to the market condition. Among different strategies, factor-based mean-reverting strategies have been popular and covered by many. This thesis aims to add value by evaluating the generalized pairs trading strategy and suggesting enhancements to improve out-of-sample performance. The enhanced strategy generated a daily Sharpe ratio of 6.07% in the out-of-sample period from January 2013 through October 2016, with a correlation of -0.03 versus the S&P 500. During the same period, the S&P 500 generated a Sharpe ratio of 6.03%. This thesis is …
Mixture Models For Undiagnosed Prevalent Disease And Interval-Censored Incident Disease: Applications To A Cohort Assembled From Electronic Health Records., Li C Cheung, Qing Pan, Noorie Hyun, Mark Schiffman, Barbara Fetterman, Philip E Castle, Thomas Lorey, Hormuzd A Katki
Epidemiology Faculty Publications
For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some diseases diagnosed at later visits are actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early …
Gridiron-Gurus Final Report: Fantasy Football Performance Prediction, Kyle Tanemura, Michael Li, Erica Dorn, Ryan Mckinney
Computer Science and Software Engineering
Gridiron Gurus is a desktop application that allows for the creation of custom AI profiles to take advice from and compete against in a Fantasy Football setting. Our AIs are capable of performing statistical prediction of player performance on both a season-long and week-to-week basis, giving them the ability to both draft and manage a fantasy football team throughout a season.
Mechanistic Mathematical Models: An Underused Platform For Hpv Research, Marc Ryser, Patti Gravitt, Evan R. Myers
Global Health Faculty Publications
Health economic modeling has become an invaluable methodology for the design and evaluation of clinical and public health interventions against the human papillomavirus (HPV) and associated diseases. At the same time, relatively little attention has been paid to a different yet complementary class of models, namely that of mechanistic mathematical models. The primary focus of mechanistic mathematical models is to better understand the intricate biologic mechanisms and dynamics of disease. Inspired by a long and successful history of mechanistic modeling in other biomedical fields, we highlight several areas of HPV research where mechanistic models have the potential to advance the …
Firing Rate Heterogeneity And Consequences For Coding In Feedforward Circuits, Cheng Ly, Gary Marsat
Biology and Medicine Through Mathematics Conference
No abstract provided.
Methods For Parameter Estimation Of A Stochastic Seir Model, Kaitlyn Martinez
Biology and Medicine Through Mathematics Conference
No abstract provided.
Shape Features Underlying The Perception Of Liquids, Jan Jaap R. Van Assen, Pascal Barla, Roland W. Fleming
MODVIS Workshop
No abstract provided.
Mortgage Transition Model Based On Loanperformance Data, Shuyao Yang
Arts & Sciences Electronic Theses and Dissertations
The unexpected increase in loan defaults in the mortgage market is widely considered to be one of the main causes of the economic crisis. To provide some insight into loan delinquency and default, I analyze the mortgage performance data from the Fannie Mae website and investigate how economic factors and individual loan and borrower information affect the events of default and prepayment. Various delinquency statuses, including default and prepaid, are treated as discrete states of a Markov chain. One-step transition probabilities are estimated via multinomial logistic models. We find that in general current loan-to-value ratio, credit score, unemployment rate, and interest …
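As an illustration of the modeling idea in this abstract, the sketch below shows how a multinomial logit can turn loan covariates into one row of a Markov transition matrix. It is only a minimal sketch: the state names, the covariates, and every coefficient value are hypothetical and are not taken from the thesis.

```python
import math

def transition_probs(coefs, x):
    """Return one row of one-step transition probabilities from a multinomial logit.

    coefs: dict mapping next-state name -> coefficient list (hypothetical values)
    x:     covariate list of the same length (first entry is the intercept term)
    """
    # Linear score for each candidate next state
    scores = {s: sum(b * v for b, v in zip(beta, x)) for s, beta in coefs.items()}
    # Softmax with the maximum subtracted for numerical stability
    m = max(scores.values())
    exps = {s: math.exp(v - m) for s, v in scores.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}

# Hypothetical coefficients for a loan whose present state is "current".
# The reference category ("current") has all coefficients fixed at zero.
coefs_from_current = {
    "current":    [0.0, 0.0, 0.0],
    "delinquent": [-3.0, 2.0, -1.0],
    "default":    [-6.0, 3.0, -1.5],
    "prepaid":    [-4.0, -1.0, 1.0],
}

# Hypothetical covariates: intercept, loan-to-value ratio, scaled credit score
x = [1.0, 0.8, 0.7]
row = transition_probs(coefs_from_current, x)
for state, p in row.items():
    print(f"{state}: {p:.4f}")
```

Fitting one such model per starting state yields a full transition matrix whose rows each sum to one and vary with the covariates.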
A Multifactorial Obesity Model Developed From Nationwide Public Health Exposome Data And Modern Computational Analyses, Lisaann S. Gittner, Barbara J. Kilbourne, Ravi Vadapalli, Hafiz M.K. Khan, Michael A. Langston
Sociology Faculty Research
Summary
Statement of the problem
Obesity is both multifactorial and multimodal, making it difficult to identify, unravel and distinguish causative and contributing factors. The lack of a clear model of aetiology hampers the design and evaluation of interventions to prevent and reduce obesity.
Methods
Using modern graph-theoretical algorithms, we are able to coalesce and analyse thousands of inter-dependent variables and interpret their putative relationships to obesity. Our modelling is different from traditional approaches; we make no a priori assumptions about the population, and model instead based on the actual characteristics of a population. Paracliques, noise-resistant collections of highly-correlated variables, are …
Multidataset Independent Subspace Analysis: A Framework For Analysis Of Multimodal, Multi-Subject Brain Imaging Data, Rogers F. Silva
Electrical and Computer Engineering ETDs
Mental illnesses are serious disorders of the brain that have devastating effects on individuals and society. In addition to their disabling and impairing effects, mental illnesses have deep social and economic implications, accounting for an estimated loss of 12 billion working days and a care cost surge to $6 trillion a year by 2030. For diseases such as depression and anxiety, enhancing preventive programs and treatment accessibility, in combination with accurate early diagnosis and personalized treatments, is projected to result in a four-fold return on every dollar invested, a strategy that can drastically help curtail those losses. Notably, within the …
Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons
Student Scholar Symposium Abstracts and Posters
After going on the Warner Brothers Tour in December of 2015, I created a Gilmore Girls Instagram account. This account, which started off as a way for me to create edits of the show and post my photos from the tour turned into something bigger than I ever could have imagined. In just over a year I have over 55,000 followers. I post content including revival news, merchandise, and edits of the show that have been featured in Entertainment Weekly, Bustle, E! News, People Magazine, Yahoo News, & GilmoreNews.
I created a dataset of qualitative and quantitative outcomes from my …
Comparison Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves that are of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available …
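The "same basic shape" remark can be made concrete with the standard textbook form of the Cox model. The notation below ($h_0$, $S_0$, $\beta$, $x$) is the usual convention and is an assumption here, not quoted from the report:

```latex
h(t \mid x) = h_0(t)\, \exp\!\left(x^{\top}\beta\right)
\quad\Longrightarrow\quad
S(t \mid x) = \exp\!\left(-\int_0^t h(u \mid x)\, du\right)
            = S_0(t)^{\exp\left(x^{\top}\beta\right)}
```

Every subject's survivor curve is therefore the baseline curve $S_0(t)$ raised to a positive, subject-specific power, so curves for different covariate values cannot cross.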
A Comparison Of Statistical Methods Relating Pairwise Distance To A Binary Subject-Level Covariate, Rachael Stone
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
A community ecologist provided a motivating data set involving a certain animal species with two behavior groups, along with a pairwise genetic distance matrix among individuals. Many community ecologists have analyzed similar data sets with a method known as the Hopkins method, testing for an association between the subject-level covariate (behavior group) and the pairwise distance. This community ecologist wanted to know whether their results would be meaningful if they used the Hopkins method. Their question inspired this thesis work, where a different data set was used for confidentiality reasons. Multiple methods (Hopkins method, ADONIS, ANOSIM, and Distance Regression) were used …
Inference On The Stress-Strength Model From Weibull Gamma Distribution, Mahmoud Mansour, Rashad El-Sagheer, M. A. W. Mahmoud Prof.
Basic Science Engineering
No abstract provided.
Performance Of Imputation Algorithms On Artificially Produced Missing At Random Data, Tobias O. Oketch
Electronic Theses and Dissertations
Missing data is one of the challenges we face today in building valid statistical models. It reduces the representativeness of the data samples. Hence, population estimates and model parameters estimated from such data are likely to be biased.
However, the missing data problem is an area of active study, and better alternative statistical procedures have been presented to mitigate its shortcomings. In this paper, we review causes of missing data and various methods of handling missing data. Our main focus is evaluating various multiple imputation (MI) methods from the multiple imputation by chained equations (MICE) package in the statistical software …
A Bayesian Variable Selection Method With Applications To Spatial Data, Xiahan Tang
Graduate Theses and Dissertations
This thesis first describes the general idea behind Bayesian inference and various sampling methods based on Bayes' theorem, with many examples. Then a Bayesian approach to model selection, called Stochastic Search Variable Selection (SSVS), is discussed. It was originally proposed by George and McCulloch (1993). In a normal regression model where the number of covariates is large, only a small subset tends to be significant most of the time. This Bayesian procedure specifies a mixture prior for each of the unknown regression coefficients; the mixture prior was originally proposed by Geweke (1996). This mixture prior will be updated as data becomes …
Statistical Methods For Two Problems In Cancer Research: Analysis Of Rna-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li
Dissertations & Theses (Open Access)
My dissertation is focused on quantitative methodology development and application for two important topics in translational and clinical cancer research.
The first topic was motivated by the challenge of applying transcriptome sequencing (RNA-seq) to formalin-fixed, paraffin-embedded (FFPE) tumor samples for reliable diagnostic development. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. To comprehensively evaluate the FFPE RNA-seq data quality for expression profiling, we developed multiple computational methods for assessment, such as the uniformity and continuity …
Modelling Cash Crop Growth In Tn, Spencer Weston
Chancellor’s Honors Program Projects
No abstract provided.
On Post-Selection Confidence Intervals In Linear Regression, Xinwei Zhang
Arts & Sciences Electronic Theses and Dissertations
The general goal of this thesis is to investigate and examine some issues in post-selection inference, which arises in the setting where statistical inference is carried out after a data-driven model selection step. In this setting, the classical inference theory, which requires a model fixed a priori, becomes invalid since the selected model is the result of a random event. Hence, the common practice in applied research of ignoring the model selection when building confidence intervals will result in misleading or even false conclusions. In this thesis, specifically, we first discuss some examples to show how the classical inference theory loses …
Statistical Analysis Of Markovian Queueing Models Of Limit Order Books, Yiyao Luo
Arts & Sciences Electronic Theses and Dissertations
The objective of this thesis is to investigate the suitability of some Markovian queueing models for effectively describing the dynamical properties of a limit order book. We review and compare the assumptions proposed by Huang et al. [Quantitative Finance, 12, 547-557 (2012)] and Cont et al. [SIAM Journal on Financial Mathematics, 4, 1-25 (2013)], and estimate the intensity parameters in both ways, based on real data of a stock on the Nasdaq Stock Market. By comparing cumulative distribution functions of the first-passage time to state 0, we will show that the estimators of Cont’s model fit our data better and we put …
Network Exploration Of Correlated Multivariate Protein Data For Alzheimer's Disease Association, Matthew J. Lane
Theses
Alzheimer's disease (AD) is difficult to diagnose by using genetic testing or other traditional methods. Unlike diseases with simple genetic risk components, there exists no single marker determining whether someone will develop AD. Furthermore, AD is highly heterogeneous, and different subgroups of individuals develop the disease due to differing factors. Traditional diagnostic methods using perceivable cognitive deficiencies are often too little, too late because the brain has already suffered damage from decades of disease progression. In order to observe AD at early stages prior to the observation of cognitive deficiencies, biomarkers with greater accuracy are required. By using …
A General Approach For Predicting The Behavior Of The Supreme Court Of The United States, Daniel Katz
All Faculty Scholarship
Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier that leverages unique feature engineering to predict more than 240,000 justice votes and 28,000 case outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve …
Implementing Propensity Score Matching With Network Data: The Effect Of Gatt On Bilateral Trade, Luca De Benedictis, Bruno Arpino, Alessandra Mattei
Luca De Benedictis
Statistically Analyzing Assembly Line Processing Times Through Incorporation Of Product Variation, Kyle Rehr, Matthew Farr
Scholars Week
Timing methods and performance metrics are important in the heavily industrialized world we live in. Industrial plants use metrics to measure quality of production, help make decisions, and drive the strategy of the organization. However, many factors must be considered when measuring performance based on a metric; of these, we will analyze the importance of product variation. We will analyze assembly line timings, whilst controlling for product variance, to show the importance of differences between products for one’s ability to predict performance. In addition, we will be analyzing the current “statistical” methods used by an industrial …
Maximum Likelihood Estimation Of Parameters In Exponential Power Distribution With Upper Record Values, Tianchen Zhi
FIU Electronic Theses and Dissertations
The exponential power (EP) distribution is a very important distribution that is used in survival analysis and is related to the asymmetrical EP distribution. Many researchers have discussed statistical inference about the parameters of the EP distribution using i.i.d. random samples. However, sometimes the available data might contain only record values, or it is more convenient for researchers to collect record values. We aim to resolve this problem. We estimated two parameters of the EP distribution by maximum likelihood estimation (MLE) using upper record values. In a simulation study, we used the bias and MSE of the estimators to study the efficiency of the proposed estimation method. …
Shining A Light On A University Special Collection With Data Visualization, Lisa Deluca, Katie M. Wissel