Open Access. Powered by Scholars. Published by Universities.®

Categorical Data Analysis Commons

248 Full-Text Articles, 299 Authors, 53,938 Downloads, 45 Institutions

All Articles in Categorical Data Analysis

Statistically Analyzing Assembly Line Processing Times Through Incorporation Of Product Variation, Kyle Rehr, Matthew Farr 2017 Murray State University

Scholars Week

Timing methods and performance metrics are important in the heavily industrialized world we live in. Industrial plants use metrics to measure quality of production, help make decisions, and drive the strategy of the organization. However, many factors must be considered when measuring performance based on a metric; of these, we analyze the importance of product variation. We analyze assembly line timings, while controlling for product variance, to show how much differences between products matter to one’s ability to predict performance. In addition, we will be analyzing the current “statistical” methods used by an ...


Efficient Motif Discovery In Spatial Trajectories Using Discrete Fréchet Distance, Bo TANG, Man Lung YIU, Kyriakos MOURATIDIS, Kai WANG 2017 Singapore Management University

Research Collection School Of Information Systems

The discrete Fréchet distance (DFD) captures perceptual and geographical similarity between discrete trajectories. It has been successfully adopted in a multitude of applications, such as signature and handwriting recognition, computer graphics, as well as geographic applications. Spatial applications, e.g., sports analysis, traffic analysis, etc. require discovering the pair of most similar subtrajectories, be they parts of the same or of different input trajectories. The identified pair of subtrajectories is called a motif. The adoption of DFD as the similarity measure in motif discovery, although semantically ideal, is hindered by the high computational complexity of DFD calculation. In this paper, we propose a suite of novel lower ...
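
For readers unfamiliar with the measure, the following is a minimal sketch of the textbook dynamic-programming computation of the discrete Fréchet distance between two 2-D point sequences. It is not the method proposed in the paper (whose details are elided above); the function name `discrete_frechet` and the use of Euclidean distance between points are assumptions made for illustration. The quadratic table it fills is one way to see why DFD calculation is costly when repeated over many subtrajectory pairs.

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Fréchet distance between point sequences p and q.

    Illustrative sketch only: p and q are non-empty lists of (x, y)
    tuples, and the ground distance is assumed to be Euclidean.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("trajectories must be non-empty")

    def dist(a, b):
        return hypot(a[0] - b[0], a[1] - b[1])

    # ca[i][j] holds the DFD between the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = cost
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], cost)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], cost)
            else:
                # Best of the three ways to reach (i, j), then the current pair.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, cost)
    return ca[n - 1][m - 1]
```

The table has one cell per pair of points, so a single evaluation takes time proportional to the product of the two trajectory lengths.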


2014 Reporting Of Sexual Assault: Institutional Comparisons, M. E. Karns 2017 Cornell University

Research Studies and Reports

Institutions of higher education are required to submit annual reports of sexual assault crimes to the Department of Education under the Clery Act. The Department of Education makes this data publicly available. Two primary measures are used to assess reporting of assault on campus: the Assault Reporting Ratio (ARR) and the Reporting Rate per 10,000 students (R10K). These measures are easily calculated and can be used to assess practices and policies that impact the reporting of sexual assault on campus.

The ARR and R10K are rate comparisons, a method widely used in public health. These rate comparisons measure how ...


A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz 2016 Washington University in St. Louis

Doctor of Business Administration Dissertations

At heart every trader loves volatility; this is where return on investment comes from, this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices with reliability, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with ...


Effects Of Prescribed Fire On The Forest Structure And Composition At Land Between The Lakes National Recreation Area, Ky, Miranda Thompson 2016 Murray State University

Honors College Theses

With a regular fire regime present on the landscape, open canopies and herbaceous understories characterize oak forests in western Kentucky. However, a long period of fire suppression has changed the structure and composition of many forests in the Southeast. Forest managers at Land Between the Lakes have started using prescribed fire in an attempt to replicate aspects of a natural fire regime and increase the amount of open oak woodlands and savannas in the area. The prescribed fires in our study area were conducted during the dormant season and were very low-intensity ground fires.

To understand how prescribed fire ...


Hilbe Mcd E-Book2016 Errata 03nov2016, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Errata, clarifications and additions for the newly corrected e-book version of Modeling Count Data.


Calculating Odds Ratios From Probabillities, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Method demonstrated for calculating logistic model odds ratios from model probabilities. Details shown for models with binary, categorical and continuous predictors, and multiple predictors.
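
As a rough illustration of the idea, an odds ratio can be computed from two probabilities using the standard definitions of odds, p / (1 - p), and of the odds ratio as the ratio of two odds. The sketch below uses those definitions only; it is not taken from the listed document, and the function names `odds` and `odds_ratio` are hypothetical.

```python
def odds(p):
    """Convert a probability strictly between 0 and 1 to odds, p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p1, p0):
    """Odds ratio comparing probability p1 (e.g. predictor = 1)
    with probability p0 (e.g. predictor = 0)."""
    return odds(p1) / odds(p0)
```

For example, probabilities of 0.5 and 0.2 correspond to odds of 1 and 0.25, so `odds_ratio(0.5, 0.2)` is approximately 4.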


Cost Sensitive Online Multiple Kernel Classification, Doyen SAHOO, Peilin ZHAO, Steven C. H. HOI 2016 Singapore Management University

Research Collection School Of Information Systems

Learning from data streams has been an important open research problem in the era of big data analytics. This paper investigates supervised machine learning techniques for mining data streams with application to online anomaly detection. Unlike conventional machine learning tasks, machine learning from data streams for online anomaly detection has several challenges: (i) data arriving sequentially and increasing rapidly; (ii) highly class-imbalanced distributions; and (iii) complex anomaly patterns that could evolve dynamically. To tackle these challenges, we propose a novel Cost-Sensitive Online Multiple Kernel Classification (CSOMKC) scheme for comprehensively mining data streams and demonstrate its application to online anomaly detection. Specifically, CSOMKC learns a kernel-based cost-sensitive prediction model ...


Converting A Logistic Model Odds Ratio To A Risk Ratio, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Demonstrates how a logistic model odds ratio for a single predictor can be converted to a risk ratio.
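
One widely used conversion, given here only as a sketch, is RR = OR / (1 - p0 + p0 * OR), where p0 is the outcome probability in the reference (unexposed) group. Whether the listed document uses this particular formula is an assumption, and the function name `odds_ratio_to_risk_ratio` and its parameter names are hypothetical.

```python
def odds_ratio_to_risk_ratio(odds_ratio, baseline_risk):
    """Approximate a risk ratio from an odds ratio.

    Sketch under stated assumptions: baseline_risk is the outcome
    probability in the reference group, strictly between 0 and 1,
    and odds_ratio is positive.
    """
    if odds_ratio <= 0.0:
        raise ValueError("odds ratio must be positive")
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline risk must lie strictly between 0 and 1")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)
```

With an odds ratio of 4 and a baseline risk of 0.2, the result is approximately 2.5, which matches the direct ratio of the two underlying risks, 0.5 / 0.2. When the baseline risk is small, the odds ratio and the risk ratio are close.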


Hilbe Mcd Errata 03nov2016 Update, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Errata and additions for Modeling Count Data (2014), Cambridge University Press.


Modeling Count Data Rcode Update 15sep2016, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Updated R code for Modeling Count Data. Code amended for the corrected printing 2016.


Hilbe-Mcd-Stata-Code-Revised-15sep2016.Pdf, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Stata commands (code) used in Modeling Count Data for the corrected printing, 2016.


Addition To Pglr Chap 6, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Addition to Chapter 6 in Practical Guide to Logistic Regression. Added section on Bayesian logistic regression using Stata.


Testing Homogeneity In Semiparametric Mixture Case-Control Models, C Z. Di, G KC Chan, C Zheng, KY Liang 2016 Fred Hutchinson Cancer Research Center

Chongzhi Di

Recently, Qin and Liang (Biometrics, 2011) considered a semiparametric mixture case-control model and proposed a score test for homogeneity. The mixture model is semiparametric in the sense that the density ratio of two distributions is assumed to be of exponential form, while the baseline density is unspecified. In a family of parametric admixture models, Di and Liang (Biometrics, 2011) showed that the likelihood ratio test statistic, which is equivalent to a supremum statistic, could improve power over score tests. We generalize the likelihood ratio or supremum statistic to the semiparametric mixture model and demonstrate the power gain over the score ...


Combined Computational-Experimental Design Of High-Temperature, High-Intensity Permanent Magnetic Alloys With Minimal Addition Of Rare-Earth Elements, Rajesh Jha 2016 Florida International University

FIU Electronic Theses and Dissertations

AlNiCo magnets are known for high-temperature stability and superior corrosion resistance and have been widely used for various applications. The reported magnetic energy density ((BH)max) for these magnets is around 10 MGOe. Theoretical calculations show that a (BH)max of 20 MGOe is achievable, which would help cover the gap between AlNiCo and Rare-Earth Elements (REE) based magnets. An extended family of AlNiCo alloys consisting of eight elements was studied in this dissertation, and hence it is important to determine the composition-property relationship between each of the alloying elements and their influence on the bulk properties.

In the present ...


Walking To Recovery - The Effects Of Postsurgical Ambulation On Patient Recovery Times, Trent William Stethen 2016 University of Tennessee, Knoxville

University of Tennessee Honors Thesis Projects

No abstract provided.


Does Academic Performance Predict Workplace Productivity?, Jodie-Gaye Hunter 2016 Bryant University

Honors Projects in Economics

This research examines whether college GPA affects productivity and compensation in the workplace. It uses data collected from a survey of approximately 23,000 Bryant University graduates in different stages of their career. About 10 percent of the alumni surveyed completed the survey. The econometric model used in this study allows estimating the effect of GPA on income after controlling for various demographic and socioeconomic variables, including education, major, occupation, and gender, among others. The empirical work provides evidence that GPA has a positive and statistically significant impact on workplace productivity for females, but GPA seems to be a weaker predictor ...


Hilbe-Pglr-Errata-And-Comments, Joseph M. Hilbe 2016 Arizona State University

Joseph M Hilbe

Errata and Comments for Practical Guide to Logistic Regression


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang 2016 Fox Chase Cancer Center

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for ...


Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret 2016 University of Washington - Seattle Campus

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the ...

