Open Access. Powered by Scholars. Published by Universities.®

Categorical Data Analysis Commons

268 Full-Text Articles 369 Authors 67,053 Downloads 60 Institutions

All Articles in Categorical Data Analysis


Application Of Bradford’s Law Of Scattering On Research Publication In Astronomy & Astrophysics Of India, Satish Kumar, Senthilkumar R. 2018 Bharathiar University, Coimbatore & IIT(ISM) Dhanbad

Library Philosophy and Practice (e-journal)

The present study examines the application of Bradford’s law of scattering to research articles published in the field of Astronomy & Astrophysics by Indian scientists during 1988-2017. The bibliographic data were retrieved from the Web of Science (WoS) bibliographic database for different periods of time. A total of 18,877 journal articles were published by Indian scientists in the field of Astronomy & Astrophysics during 1988-2017; these were retrieved and analyzed separately for blocks of 10 years as well as for the consolidated 30 years. The core journal of the field was identified. The Bradford law ...


Role Of Misclassification Estimates In Estimating Disease Prevalence And A Non-Linear Approach To Study Synchrony Using Heart Rate Variability In Chickens, Dola Pathak 2018 University of Nebraska-Lincoln

Dissertations and Theses in Statistics

Infectious disease assays can be imperfect. When estimating disease prevalence, these imperfections are accounted for by incorporating assay sensitivity and specificity into point and variance estimates. Unfortunately, these accuracy measures are often treated as fixed constants, rather than acknowledging that they are estimates from an assay validation process. The purpose of this study is to show the detrimental effect of not taking into account this sampling variability when samples are obtained through group testing (aka, pooled testing). We show that confidence interval coverage can dramatically decline as the sample size increases for the main sample of interest. As a remedy ...


Seasonal Warranty Prediction Based On Recurrent Event Data, Qianqian Shan, Yili Hong, William Q. Meeker Jr. 2018 Iowa State University

Statistics Preprints

Warranty return data from repairable systems, such as vehicles, usually result in recurrent event data. The non-homogeneous Poisson process (NHPP) model is used widely to describe such data. Seasonality in the repair frequencies and other variabilities, however, complicate the modeling of recurrent event data. Not much work has been done to address the seasonality, and this paper provides a general approach for the application of NHPP models with dynamic covariates to predict seasonal warranty returns. A hierarchical clustering method is used to stratify the population into groups that are more homogeneous than the overall population. The stratification facilitates ...


Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age ...


Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over $160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings on both daily and long-term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...


EFVS Effects On Pilot Performance, Michael Campbell, Nsikak Udo-Imeh, Steven J. Landry 2018 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Flight tests have been conducted at Purdue University using a computer-based flying simulator in an attempt to determine and measure the effects of Enhanced Flight Vision Systems (EFVS) on the performance of pilots during landing. Knowledge of these effects could help guide future design and implementation of EFVS in modern commercial aircraft, and further increase pilots’ ability to control the aircraft in low-visibility conditions. The problem that has faced researchers in the past has revolved around the difficulty in interpreting the data which is generated by these tests. The difficulty in making a generalized conclusion based on the large amount ...


Generalized Non-Inferential Approach To Modeling Restricted Discrete Choice For The Case Of The Spatial Random Utility, Elena Labzina 2018 Washington University in St Louis

Arts & Sciences Electronic Theses and Dissertations

The multinomial logistic regression model (MNL) is a powerful and easily tractable way of measuring the probabilistic impact of input variables on individual categorical choices. Crucially, the standard MNL assumes that all subjects of the study have the same choice sets. However, especially in political science and economics, this condition is frequently violated. Perhaps the most vivid example of varying choice sets (VCS) is partially contested elections. Furthermore, the MNL implicitly imposes the Independence of Irrelevant Alternatives (IIA) assumption by requiring i.i.d. errors, which contrasts the MNL with the multinomial probit (MNP) and mixed logit (MXL ...


Pretrial Release And Failure-To-Appear In McLean County, IL, Jonathan Monsma 2018 Illinois State University

Stevenson Center for Community and Economic Development to Stevenson Center for Community and Economic Development—Student Research

Actuarial risk assessment tools have increasingly been employed in jurisdictions across the U.S. to assist courts in deciding whether someone charged with a crime should be detained or released prior to trial. These tools should be continually monitored and researched by independent third parties to ensure that they are administered properly and used in the most proficient way so as to provide socially optimal results. McLean County, Illinois began using the Public Safety Assessment-Court™ (PSA-Court or simply PSA) risk assessment tool in 2016. This study culls data from the McLean County ...


Data Center Application Security: Lateral Movement Detection Of Malware Using Behavioral Models, Harinder Pal Singh Bhasin, Elizabeth Ramsdell, Albert Alva, Rajiv Sreedhar, Medha Bhadkamkar 2018 Southern Methodist University, Dallas, Texas

SMU Data Science Review

Data center security traditionally is implemented at the external network access points, i.e., the perimeter of the data center network, and focuses on preventing malicious software from entering the data center. However, these defenses do not cover all possible entry points for malicious software, and they are not 100% effective at preventing infiltration through the connection points. Therefore, security is required within the data center to detect malicious software activity including its lateral movement within the data center. In this paper, we present a machine learning-based network traffic analysis approach to detect the lateral movement of malicious software within ...


Predicting Game Day Outcomes In National Football League Games, Josh Klein, Anna Frowein, Chris Irwin 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a model for predicting the game day outcomes of National Football League games. Three of the most popular sources for game day predictions are analyzed for comparison. Player data and outcomes from previous games are used, but we also incorporate several weather factors into our models. Over 1,700 games were incorporated and three separate models were created using simple regression, principal component analysis, and a recursive model. We also discuss the ethicality of using data science techniques by individuals with the knowledge in order to gain an advantage over a population lacking this specialized ...


Examining Multimorbidities Using Association Rule Learning, Kaylee Dudley 2018 Brigham Young University

Undergraduate Honors Theses

All insurance companies, regardless of the kind of insurance they offer, do their best to predict the future by comparing current to historical information. Any statistically significant correlation, regardless of expectations and hidden factors, can help to actuarially model future behavior. Using deidentified data from over 6 million health insurance policies over one year, we looked for any significant groupings of medical issues. The medical issues are defined based on the commercial “Episode Treatment Groups” (ETGs) classification, and our claims contain 347 different ETGs. We performed different kinds of analysis, including Bayesian posterior cluster analysis, k-means cluster analysis, and association ...


Text Analytics Approach To Extract Course Improvement Suggestions From Students’ Feedback, Swapna Gottipati, Venky Shankararaman, Jeff Rongsheng Lin 2018 Singapore Management University

Research Collection School Of Information Systems

In academic institutions, it is normal practice that at the end of each term, students are required to complete a questionnaire that is designed to gather students’ perceptions of the instructor and their learning experience in the course. Students’ feedback includes numerical answers to Likert scale questions and textual comments to open-ended questions. Within the textual comments given by the students are embedded suggestions. A suggestion can be explicit or implicit. Any suggestion provides useful pointers on how the instructor can further enhance the student learning experience. However, it is tedious to manually go through all the qualitative comments and ...


A Convolutional Neural Network Model For Species Classification Of Camera Trap Images, Annie Casey 2018 Boise State University

Mathematics Undergraduate Theses

The overall purpose of this study was to automate the manual process of tagging species found in camera trap images using machine learning. The basic design of this study was to implement a Convolutional Neural Network model, built in Python with the Keras and TensorFlow modules, that learns to recognize patterns in images in order to classify which species is in a given image and to label it accordingly. Results of the analysis highlight the importance of a large sample size, the degree of accuracy according to various arguments in the model, effectiveness of multiple layers that include Max Pooling, and ...


Under The Influence, Leonardo Cavicchio 2018 Bryant University

Honors Projects in Mathematics

The purpose of this Honors Capstone entitled Under the Influence is to assess the validity of claims concerning the possible influence of roommates on one another, concerning alcohol on college campuses. This will be done by examining data collected in a prior study conducted over a two-year period. This analysis will focus on how alcohol consumption changes in correlation with the personality factors of roommates over an extended period of time. This secondary analysis of de-identified data will focus on primary and secondary subquestions. The primary question that will be addressed with the data set collected from the University of ...


Understanding Natural Keyboard Typing Using Convolutional Neural Networks On Mobile Sensor Data, Travis Siems 2018 Southern Methodist University

Computer Science and Engineering Theses and Dissertations

Mobile phones and other devices with embedded sensors are becoming increasingly ubiquitous. Audio and motion sensor data may be able to detect information that we did not think possible. Some researchers have created models that can predict computer keyboard typing from a nearby mobile device; however, certain limitations to their experiment setup and methods compelled us to be skeptical of the models’ realistic prediction capability. We investigate the possibility of understanding natural keyboard typing from mobile phones by performing a well-designed data collection experiment that encourages natural typing and interactions. This data collection helps capture realistic vulnerabilities of the security ...


Satellite Communications In The V And W Band: Tropospheric Effects, Bertus A. Shelters 2018 Air Force Institute of Technology

Theses and Dissertations

An investigation into the use of Weather Cubes compiled by the atmospheric characterization package, Laser Environmental Effects Definition and Reference (LEEDR), to develop accurate, long-term attenuation statistics for link-budget analysis is presented. A Weather Cube is a three-dimensional mesh of numerical weather prediction (NWP) data plus LEEDR calculations that allows for the quantification of rain, cloud, aerosol, and molecular effects at any UV to RF wavelength on any path contained within the cube. The development of this methodology is motivated by the potential use of V (40-75 GHz) and W (75-110 GHz) band frequencies for the satellite communication application, as ...


Default Priors For The Intercept Parameter In Logistic Regressions, Philip S. Boonstra, Ryan P. Barbaro, Ananda Sen 2018 The University Of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

In logistic regression, separation refers to the situation in which a linear combination of predictors perfectly discriminates the binary outcome. Because finite-valued maximum likelihood parameter estimates do not exist under separation, Bayesian regressions with informative shrinkage of the regression coefficients offer a suitable alternative. Little attention has been given to whether and how to shrink the intercept parameter. Based upon classical studies of separation, we argue that efficiency in estimating regression coefficients may vary with the intercept prior. We adapt alternative prior distributions for the intercept that downweight implausibly extreme regions of the parameter space rendering less sensitivity to separation ...


Building A Better Risk Prevention Model, Steven Hornyak 2018 Houston County Schools

National Youth-At-Risk Conference Savannah

This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying those students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.


Exploring Quantitative Timed Up And Go Sensor Data With Statistical Learning Techniques, Anthony Wright 2018 University of Windsor

Major Papers

Injuries and hospitalizations due to accidental falls among seniors represent a major expense for the Canadian public health system. It is highly desirable to be able to predict risk of falls for senior individuals in order to place them in prevention programs. Recently, sensor technologies have been used to predict risk of falls and levels of frailty of individuals. A commonly used test for assessing risk of falls is known as QTUG (Quantitative 'Timed Up and Go'). The QTUG data often consist of a small set of survey answers about the individuals' historic variables (e.g., number of falls in ...


Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett 2018 The University of Akron

Honors Research Projects

The purpose of this Senior Honors Project is to research, study, and demonstrate newfound knowledge of various machine learning statistical techniques that are not covered in the University of Akron’s statistics major curriculum. This report will be an overview of three machine-learning methods that were used to predict NCAA Basketball results, specifically, the March Madness tournament. The inputs to these methods, models, and tests will include numerous variables kept throughout the season for each team, along with a couple of variables that are used by the selection committee when tournament teams are being picked. The end goal is to ...

