Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons

Categorical Data Analysis

Articles 31 - 60 of 78

Full-Text Articles in Statistical Methodology

Identifying Key Factors Associated With High Risk Asthma Patients To Reduce The Cost Of Health Resources Utilization, Amani Ahmad Oct 2018

LSU Master's Theses

Asthma is associated with frequent use of primary health services and places a burden on the United States economy. Identifying key factors associated with increased cost of asthma is an essential step to improve practices of asthma management.

The aim of this study was to identify factors associated with overutilization of primary health services and increased cost via claims data and to explore the effectiveness of a case management program in reducing overall asthma-related cost.

Claims data analysis for Medicaid insured asthma patients in Louisiana was conducted. Asthma patients were identified using their ICD-9 and ICD-10 codes, forward variable …


Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John Aug 2018

SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all-subsets regression was used to determine the best predictors of financial burden. Age, …


Australian Herring And West Australian Salmon Scientific Workshop Report, October 2017, Department Of Primary Industries And Regional Development, Western Australia Jul 2018

Fisheries research reports

No abstract provided.


Satellite Communications In The V And W Band: Tropospheric Effects, Bertus A. Shelters Mar 2018

Theses and Dissertations

An investigation into the use of Weather Cubes compiled by the atmospheric characterization package, Laser Environmental Effects Definition and Reference (LEEDR), to develop accurate, long-term attenuation statistics for link-budget analysis is presented. A Weather Cube is a three-dimensional mesh of numerical weather prediction (NWP) data plus LEEDR calculations that allows for the quantification of rain, cloud, aerosol, and molecular effects at any UV to RF wavelength on any path contained within the cube. The development of this methodology is motivated by the potential use of V (40-75 GHz) and W (75-110 GHz) band frequencies for the satellite communication application, as …


Building A Better Risk Prevention Model, Steven Hornyak Mar 2018

National Youth Advocacy and Resilience Conference

This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying those students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.


Campus Climate Sexual Assault Survey (2015) Analysis, Felicia Rosin Jan 2018

Williams Honors College, Honors Research Projects

The issue of sexual assault has garnered widespread attention in recent years, as is evident from the growing number of high-profile cases and mainstream social movements. With this increasingly bright spotlight, it is no surprise that The University of Akron has an interest in improving the sexual violence education programs offered to students. In 2015, the university conducted a survey to gather information on the campus climate surrounding sexual assault. This project analyzes the gathered data in greater depth in an attempt to pinpoint areas that require the university’s attention. The analysis covers topics identified by Dean of Students …


Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett Jan 2018

Williams Honors College, Honors Research Projects

The purpose of this Senior Honors Project is to research, study, and demonstrate newfound knowledge of various machine learning statistical techniques that are not covered in the University of Akron’s statistics major curriculum. This report will be an overview of three machine-learning methods that were used to predict NCAA Basketball results, specifically, the March Madness tournament. The variables used for these methods, models, and tests will include numerous variables kept throughout the season for each team, along with a couple variables that are used by the selection committee when tournament teams are being picked. The end goal is to find …


Data-Adaptive Kernel Support Vector Machine, Xin Liu Nov 2017

Electronic Thesis and Dissertation Repository

In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges …


On The Three Dimensional Interaction Between Flexible Fibers And Fluid Flow, Bogdan Nita, Ryan Allaire Jan 2017

Department of Mathematics Faculty Scholarship and Creative Works

In this paper we discuss the deformation of a flexible fiber clamped to a spherical body and immersed in a flow of fluid moving with a speed ranging between 0 and 50 cm/s by means of three-dimensional numerical simulation developed in COMSOL. The effects of flow speed and initial configuration angle of the fiber relative to the flow are analyzed. A rigorous analysis of the numerical procedure is performed and our code is benchmarked against well-established cases. The flow velocity and pressure are used to compute drag forces upon the fiber. Of particular interest is the behavior …


What’s Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg Jan 2017

Publications and Research

We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students …


Testing Homogeneity In Semiparametric Mixture Case-Control Models, C Z. Di, G Kc Chan, C Zheng, Ky Liang Jun 2016

Chongzhi Di

Recently, Qin and Liang (Biometrics, 2011) considered a semiparametric mixture case-control model and proposed a score test for homogeneity. The mixture model is semiparametric in the sense that the density ratio of two distributions is assumed to be of exponential form, while the baseline density is unspecified. In a family of parametric admixture models, Di and Liang (Biometrics, 2011) showed that the likelihood ratio test statistic, which is equivalent to a supremum statistic, could improve power over score tests. We generalize the likelihood ratio or supremum statistic to the semiparametric mixture model and demonstrate the power gain over the score …


Combined Computational-Experimental Design Of High-Temperature, High-Intensity Permanent Magnetic Alloys With Minimal Addition Of Rare-Earth Elements, Rajesh Jha May 2016

FIU Electronic Theses and Dissertations

AlNiCo magnets are known for high-temperature stability and superior corrosion resistance and have been widely used for various applications. The reported magnetic energy density ((BH)max) for these magnets is around 10 MGOe. Theoretical calculations show that a (BH)max of 20 MGOe is achievable, which would help close the gap between AlNiCo and Rare-Earth Elements (REE) based magnets. An extended family of AlNiCo alloys consisting of eight elements was studied in this dissertation, and hence it is important to determine the composition-property relationship between each of the alloying elements and their influence on the bulk properties.

In …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Evaluating The Effects Of Standardized Patient Care Pathways On Clinical Outcomes, Anna V. Romanova Aug 2015

Doctoral Dissertations

The main focus of this study is to create a standardized approach to evaluating the impact of the patient care pathways across all major disease categories and key outcome measures in a hospital setting when randomized clinical trials are not feasible. Toward this goal I identify statistical methods, control factors, and adjustments that can correct for potential confounding in observational studies. I investigate the efficiency of existing bias correction methods under varying conditions of imbalanced samples through a Monte Carlo simulation. The simulation results are then utilized in a case study for one of the largest primary diagnosis areas, chronic …


Scientific Awareness At Ursinus College, Frank G. Devone Apr 2015

Mathematics Honors Papers

Ursinus College prides itself on creating well-rounded students, and recent initiatives, such as the Fellowships in the Ursinus Transition to the Undergraduate Research Experience Program and the Center for Science and the Common Good suggest that science is a vital part of the Ursinus liberal arts mission. A scientific awareness pilot survey was administered to a sample of Ursinus students drawn from the Class of 2014 and students residing at Ursinus during summer 2014. Experience and data collected from this pilot were used to create a final survey which was made available to all students at Ursinus College. The survey …


A Review Of Frequentist Tests For The 2x2 Binomial Trial, Chris Lloyd Dec 2014

Chris J. Lloyd

The 2x2 binomial trial is the simplest of data structures, yet its statistical analysis and the issues it raises have been debated and revisited for over 70 years. Which analysis should biomedical researchers use in applications? In this review, we consider frequentist tests only, specifically tests that control size either exactly or very close to exactly. These procedures can be classified as conditional and unconditional. Amongst tests motivated by a conditional model, Lancaster’s mid-p and Liebermeister’s test are less conservative than Fisher’s classical test, but do not control type 1 error. Within the conditional framework, only Fisher’s test can be …
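
To make the comparison between Fisher's test and Lancaster's mid-p concrete, here is a minimal Python sketch of the two one-sided conditional p-values for a 2x2 binomial trial. It is an illustration only, not code from the review: the function names and the example counts are hypothetical, and the alternative is assumed to be that arm 1 has the higher success probability.

```python
from math import comb


def hypergeom_pmf(k, n1, n2, m):
    """P(X = k): k successes in arm 1, conditional on m successes in total
    across arms of size n1 and n2 (the conditional null distribution)."""
    return comb(n1, k) * comb(n2, m - k) / comb(n1 + n2, m)


def one_sided_p_values(x1, n1, x2, n2):
    """Return (Fisher p-value, mid-p) against the assumed alternative
    that the success probability in arm 1 exceeds that in arm 2."""
    m = x1 + x2
    k_max = min(n1, m)  # largest possible count in arm 1 given the margins
    p_obs = hypergeom_pmf(x1, n1, n2, m)
    p_more_extreme = sum(
        hypergeom_pmf(k, n1, n2, m) for k in range(x1 + 1, k_max + 1)
    )
    fisher = p_obs + p_more_extreme        # counts the observed table in full
    mid_p = 0.5 * p_obs + p_more_extreme   # counts only half of it
    return fisher, mid_p


# Hypothetical data: 7/10 successes in arm 1, 2/10 in arm 2.
fisher, mid_p = one_sided_p_values(x1=7, n1=10, x2=2, n2=10)
print(f"Fisher: {fisher:.4f}, mid-p: {mid_p:.4f}")
```

Because the mid-p counts only half the probability of the observed table, it is never larger than Fisher's p-value, which is why it is less conservative and, as the abstract notes, no longer guaranteed to control type 1 error.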


On Likelihood Ratio Tests When Nuisance Parameters Are Present Only Under The Alternative, Cz Di, K-Y Liang Jan 2014

Chongzhi Di

In parametric models, when one or more parameters disappear under the null hypothesis, the likelihood ratio test statistic does not converge to chi-square distributions. Rather, its limiting distribution is shown to be equivalent to that of the supremum of a squared Gaussian process. However, the limiting distribution is analytically intractable for most examples, and approximation or simulation-based methods must be used to calculate the p-values. In this article, we investigate conditions under which the asymptotic distributions have analytically tractable forms, based on the principal component decomposition of Gaussian processes. When these conditions are not satisfied, the principal …


Group Testing Regression Models, Boan Zhang Nov 2012

Department of Statistics: Dissertations, Theses, and Student Work

Group testing, where groups of individual specimens are composited to test for the presence or absence of a disease (or some other binary characteristic), is a procedure commonly used to reduce the costs of screening a large number of individuals. Statistical research in group testing has traditionally focused on a homogeneous population, where individuals are assumed to have the same probability of having a disease. However, individuals often have different risks of positivity, so recent research has examined regression models that allow for heterogeneity among individuals within the population. This dissertation focuses on two problems involving group testing regression models. …
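
As a rough illustration of how a regression model can allow for heterogeneous individual risks in group testing, the Python sketch below computes the log-likelihood of pooled test results. It rests on assumptions that are not taken from the dissertation: a logistic model with a single covariate, independent individuals, and a perfect assay (a pool tests positive exactly when at least one member is positive). All names and data values are hypothetical.

```python
from math import exp, log


def individual_risk(x, beta0, beta1):
    """Assumed logistic model for one individual's probability of positivity."""
    return 1.0 / (1.0 + exp(-(beta0 + beta1 * x)))


def group_positive_prob(xs, beta0, beta1):
    """P(pool tests positive) = 1 - P(every member is negative),
    assuming independent individuals and an error-free test."""
    prob_all_negative = 1.0
    for x in xs:
        prob_all_negative *= 1.0 - individual_risk(x, beta0, beta1)
    return 1.0 - prob_all_negative


def log_likelihood(groups, results, beta0, beta1):
    """Bernoulli log-likelihood of the observed pool results (1 = positive)."""
    total = 0.0
    for xs, z in zip(groups, results):
        theta = group_positive_prob(xs, beta0, beta1)
        total += log(theta) if z == 1 else log(1.0 - theta)
    return total


# Hypothetical data: three pools of three individuals, one covariate each.
groups = [[0.2, 1.5, 0.7], [2.1, 1.9, 2.4], [0.1, 0.3, 0.0]]
results = [0, 1, 0]
ll = log_likelihood(groups, results, beta0=-3.0, beta1=1.0)
print(f"log-likelihood: {ll:.4f}")
```

Only the pool-level outcomes enter the likelihood, yet the coefficients describe individual-level risk; maximizing this function over `beta0` and `beta1` would give estimates under the stated assumptions.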


Adventures In Library Salary Surveys, Scott L. Schaffer Aug 2012

UVM Libraries Conference Day

Salary surveys are an important tool for the library community and the administrators and boards responsible for the oversight of libraries. However, such assessments must be constructed and analyzed with great care. The Vermont Library Association Personnel Committee has conducted three salary surveys over the past several years, one focusing on academic libraries and two on public libraries. Significant issues have included confidentiality, participation rate, definitions, length and difficulty of questions, collection of data, and representativeness. Suggestions and lessons learned will be shared.


Investigation Of Trends And Predictive Effectiveness Of Crash Severity Models, James E. Mooradian Jun 2012

Master's Theses

This thesis describes analysis using ordinal logistic regression to uncover temporal patterns in the severity level (fatal, serious injury, minor injury, slight injury or no injury) for persons involved in highway crashes in Connecticut, focusing on the demographic split between senior travelers (65 years and over) and non-senior travelers. Existing state sources provide data describing the time and weather conditions for each crash and the vehicles and persons involved over the time period from 1995 to 2009 as well as the traffic volumes and the characteristics of the roads on which these crashes occurred. Findings indicate an overall increase in …


Using The R Library Rpanel For Gui-Based Simulations In Introductory Statistics Courses, Ryan M. Allison May 2012

Statistics

As a student, I noticed that the statistical package R (http://www.r-project.org) would offer several benefits in the classroom. One benefit of the package is its free and open-source nature. This would be a great benefit for instructors and students alike, since it costs nothing to use, unlike other statistical packages. As a result, students could continue using the program after their statistical courses and into their professional careers. It would be good to expose students while they are in school to a tool that professionals use in industry. R also has powerful …


R Code: A Non-Iterative Implementation Of Tango's Score Confidence Interval For A Paired Difference Of Proportions, Zhao Yang Jan 2012

Zhao (Tony) Yang, Ph.D.

For matched-pair binary data, a variety of approaches have been proposed for the construction of a confidence interval (CI) for the difference of marginal probabilities between two procedures. The score-based approximate CI has been shown to outperform other asymptotic CIs. Tango’s method provides a score CI by inverting a score test statistic using an iterative procedure. In the developed R code, we propose an efficient non-iterative method with closed-form expression to calculate Tango’s CIs. Examples illustrate the practical application of the new approach.


The Bivariate Rank-Based Concordance Index For Ordinal And Tied Data, Emanuela Raffinetti, Pier Alda Ferrari Jan 2012

Emanuela Raffinetti

No abstract provided.


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di Jan 2012

Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcomes, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


A Unified Approach To Non-Negative Matrix Factorization And Probabilistic Latent Semantic Indexing, Karthik Devarajan, Guoli Wang, Nader Ebrahimi Jul 2011

COBRA Preprint Series

Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two matrices, W and H, each with nonnegative entries, V ~ WH. NMF has been shown to have a unique parts-based, sparse representation of the data. The nonnegativity constraints in NMF allow only additive combinations of the data which enables it to learn parts that have distinct physical representations in reality. In the last few years, NMF has been successfully applied in a variety of areas such as natural language processing, information retrieval, image processing, speech recognition …
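
The factorization V ~ WH by multiplicative updates can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: it uses the squared-error (Frobenius) form of the multiplicative updates, which is one common choice and may differ from the objective studied in the paper, and the helper names, the small constant `eps`, and the example matrix are all hypothetical.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Sum of squared entries of V - WH."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, rank, iterations=200, eps=1e-12, seed=0):
    """Multiplicative updates for the squared-error objective.
    Every update multiplies by a nonnegative ratio, so W and H stay nonnegative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        # H <- H * (W^T V) / (W^T W H), elementwise; eps avoids division by zero
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        # W <- W * (V H^T) / (W H H^T), elementwise
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Hypothetical nonnegative matrix that has an exact rank-2 nonnegative factorization.
V = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]]
W, H = nmf(V, rank=2)
print(f"squared error after fitting: {frobenius_error(V, W, H):.6f}")
```

Since no subtraction is involved in the updates, the fitted factors can only combine parts additively, which is the property the abstract credits for the parts-based representation.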


Determinants Of Health Care Use Among Rural, Low-Income Mothers And Children: A Simultaneous Systems Approach To Negative Binomial Regression Modeling, Swetha Valluri Jan 2011

Masters Theses 1911 - February 2014

The determinants of health care use among rural, low-income mothers and their children were assessed using a multi-state, longitudinal data set, Rural Families Speak. The results indicate that rural mothers’ decisions regarding health care utilization for themselves and for their child can be best modeled using a simultaneous systems approach to negative binomial regression. Mothers’ visits to a health care provider increased with higher self-assessed depression scores, increased number of child’s doctor visits, greater numbers of total children in the household, greater numbers of chronic conditions, need for prenatal or post-partum care, development of a new medical condition, and …


Multilevel Latent Class Models With Dirichlet Mixing Distribution, Chong-Zhi Di, Karen Bandeen-Roche Jan 2011

Chongzhi Di

Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social sciences and biomedical studies. Standard analyses assume data of different respondents to be mutually independent, excluding application of the methods to familial and other designs in which participants are clustered. In this paper, we consider multilevel latent class models, in which sub-population mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the Expectation-Maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when …


Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang Jan 2011

Chongzhi Di

We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a special case of a two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. It has been widely used in genetic linkage analysis under heterogeneity, in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard and the LRT statistic does not converge to a conventional χ² distribution. In this paper, we investigate the asymptotic behavior of the LRT for …


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Software Internationalization: A Framework Validated Against Industry Requirements For Computer Science And Software Engineering Programs, John Huân Vũ Mar 2010

Master's Theses

View John Huân Vũ's thesis presentation at http://youtu.be/y3bzNmkTr-c.

In 2001, the ACM and IEEE Computing Curriculum stated that it was necessary to address "the need to develop implementation models that are international in scope and could be practiced in universities around the world." With increasing connectivity through the internet, the move towards a global economy and the growing use of technology make software internationalization a more important concern for developers. However, there has been a "clear shortage in terms of numbers of trained persons applying for entry-level positions" in this area. Eric Brechner, Director of Microsoft Development Training, suggested …