Open Access. Powered by Scholars. Published by Universities.®

Categorical Data Analysis Commons

366 Full-Text Articles 602 Authors 148,231 Downloads 81 Institutions

All Articles in Categorical Data Analysis

A Comparison Of The Localized Aviation MOS Program (LAMP) And Terminal Aerodrome Forecast (TAF) Accuracy For General Aviation, Douglas D. Boyd, Thomas A. Guinn 2021 Embry Riddle Aeronautical University

Journal of Aviation Technology and Engineering

Background. For general aviation (GA) pilots, operations in instrument meteorological conditions (IMC) carry an elevated risk of a fatal accident. Aerodrome-specific forecasts (TAF, LAMP) provide guidance on whether a general aviation flight can be safely undertaken. Although LAMP forecasts are more common for GA-frequented aerodromes, the FAA recommends that, for such aerodromes for which a TAF is not issued, the airman use the TAF generated for the geographically closest airport for pre-flight weather evaluation. Herein, for non-TAF-issuing airports, the LAMP (sLAMP) predictive accuracy for visual (VFR) and instrument (IFR) flight rules flight category was determined.

Method. sLAMP ...


How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, Prashiddhi Pokhrel 2021 University of Southern Maine

Thinking Matters Symposium

Due to the pandemic, people have started relying more on television, social media, and other news outlets for guidance. Moreover, as the volume of news, data, and information grows, so does the amount of misleading statistics. People’s opinions and decisions depend significantly on the data, statistics, and information that they are exposed to, as well as on their sources. For this project, we want to look at how information and its sources affect the decisions the general public makes about using the Portland Transit System. It is very important to know ...


Node Classification On Relational Graphs Using Deep-RGCNs, Nagasai Chandra 2021 California Polytechnic State University, San Luis Obispo

Master's Theses

Knowledge Graphs are fascinating concepts in machine learning as they can hold usefully structured information in the form of entities and their relations. Despite the valuable applications of such graphs, most knowledge bases remain incomplete. This missing information harms downstream applications such as information retrieval and opens a window for research in statistical relational learning tasks such as node classification and link prediction. This work proposes a deep learning framework based on existing relational convolutional (R-GCN) layers to learn on highly multi-relational data characteristic of realistic knowledge graphs for node property classification tasks. We propose a deep and improved variant ...


Can Statcast Variables Explain The Variation In Weighted Runs Created Plus?, Ryan Kupiec 2020 DePauw University

Student Research

The release of Statcast data in 2015 was revolutionary for data analysis in the game of baseball. Many analysts have begun using this data regularly, but none have used it exclusively. Often older, less reliable statistics (on-base percentage) are still used in favor of the newer statistics (weighted runs created plus). In this paper, we attempt to explain the variation in weighted runs created plus (wRC+) using Statcast variables such as exit velocity and launch angle. We find that exit velocity, along with other Statcast variables, can explain as much as 70% of the variation in wRC+. Launch angle can ...


Developing A Tourism Opportunity Index Regarding The Prospective Of Overtourism In Nepal, Susan Phuyal 2020 Missouri State University

MSU Graduate Theses

This research explores Nepal's overtourism scenario based on the capacity of a locality to manage sustainable tourism practices. Environmental degradation, local infrastructure degradation, negative tourist experience, and local resident responses regarding visitors are the four main variables used in this study to analyze overtourism. For the case study, we select Nepal's three top tourist cities (Kathmandu, Pokhara, and Chitwan) based on the number of annual visitors. Nepal's case analysis of overtourism conditions reviews the overall threat of overtourism and establishes a metric by which tourism can be viewed as potentially detrimental ...


Direct Questioning Of Sensitive Topics In Public Health Studies: A Simulation Study, Jessica K. Fox, Evrim Oral 2020 LSU Health Sciences Center, School of Public Health, Biostatistics Program

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman 2020 University of Washington, Tacoma

Access*: Interdisciplinary Journal of Student Research and Scholarship

The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model ...


Predicting Postoperative Delirium Risk For Intracranial Surgery: A Statistical Machine Learning Approach, Juliet Aygun, Alaina Bartfeld, Sahana Rayan 2020 Purdue University

The Journal of Purdue Undergraduate Research

No abstract provided.


“Playing The Whole Game”: A Data Collection And Analysis Exercise With Google Calendar, Albert Y. Kim, Johanna Hardin 2020 Smith College

Statistical and Data Sciences: Faculty Publications

We provide a computational exercise suitable for early introduction in an undergraduate statistics or data science course that allows students to “play the whole game” of data science: performing both data collection and data analysis. While many teaching resources exist for data analysis, such resources are not as abundant for data collection given the inherent difficulty of the task. Our proposed exercise centers around student use of Google Calendar to collect data with the goal of answering the question “How do I spend my time?” On the one hand, the exercise involves answering a question with near universal appeal, but ...


Analyzing The Fractal Dimension Of Various Musical Pieces, Nathan Clark 2020 University of Arkansas, Fayetteville

Industrial Engineering Undergraduate Honors Theses

One of the most common tools for evaluating data is regression. This technique, widely used by industrial engineers, explores linear relationships between predictors and the response. Each observation of the response is a fixed linear combination of the predictors with an added error element. The method is built on the assumption that this error is normally distributed across all observations and has a mean of zero. In some cases, it has been found that the inherent variation is not the result of a random variable, but is instead the result of self-symmetric properties of the observations. For data with these ...
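For reference, the regression model this abstract describes in words can be written in its standard form. The notation below ($y_i$ for the response, $x_{ij}$ for the predictors, $\beta_j$ for the fixed coefficients, $\varepsilon_i$ for the error, $\sigma^2$ for its variance) is conventional and assumed here; it is not taken from the thesis itself.

```latex
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
\quad i = 1, \dots, n
```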


Integrating Data Science Ethics Into An Undergraduate Major, Benjamin Baumer, Randi L. Garcia, Albert Y. Kim, Katherine M. Kinnaird, Miles Q. Ott 2020 Smith College

Statistical and Data Sciences: Faculty Publications

We present a programmatic approach to incorporating ethics into an undergraduate major in statistical and data sciences. We discuss departmental-level initiatives designed to meet the National Academy of Sciences recommendation for weaving ethics into the curriculum from top-to-bottom as our majors progress from our introductory courses to our senior capstone course, as well as from side-to-side through co-curricular programming. We also provide six examples of data science ethics modules used in five different courses at our liberal arts college, each focusing on a different ethical consideration. The modules are designed to be portable such that they can be flexibly incorporated ...


Improving The Quality And Design Of Retrospective Clinical Outcome Studies That Utilize Electronic Health Records, Oliwier Dziadkowiec, Jeffery Durbin, Vignesh Jayaraman Muralidharan, Megan Novak, Brendon Cornett 2020 HCA Healthcare Mountain MidAmerica and Continental Divisions

HCA Healthcare Journal of Medicine

Electronic health records (EHRs) are an excellent source for secondary data analysis. Studies based on EHR-derived data, if designed properly, can answer previously unanswerable clinical research questions. In this paper we highlight the benefits of large retrospective studies from secondary sources such as EHRs, examine retrospective cohort and case-control study design challenges, and discuss methodological and statistical adjustments that can be made to overcome some of the inherent design limitations, in order to increase the generalizability, validity, and reliability of the results obtained from these studies.


Learning Networks With Categorical Data Using Distance Correlation, And A Novel Graph-Based Multivariate Test, Jian Tinker 2020 University of Arkansas, Fayetteville

Theses and Dissertations

We study the use of distance correlation for statistical inference on categorical data, especially the induction of probability networks. Szekely et al. first defined distance correlation for continuous variables in [42], and Zhang translated the concept into the categorical setting in [57] by defining dCor(X,Y) for categorical variables X = (x1,...,xI) and Y = (y1,...,yJ) where P(X=xi)=[pi]i and P(Y=yi)=[pi]j with the formula [Please open the document]

Part I of the dissertation covers the background we need to understand this formula, and prepares us to analyze the properties and performance of ...


Next-Term Grade Prediction: A Machine Learning Approach, Audrey Tedja Widjaja, Lei Wang, Nghia Truong Trong, Aldy Gunawan, Ee-Peng Lim 2020 Singapore Management University

Research Collection School Of Computing and Information Systems

As students progress in their university programs, they have to face many course choices. It is important for them to receive guidance based not only on their interests, but also on their "predicted" course performance, so as to improve learning experience and optimise academic performance. In this paper, we propose the next-term grade prediction task as useful course selection guidance. We propose a machine learning framework to predict course grades in a specific program term using the historical student-course data. In this framework, we develop the prediction model using Factorization Machine (FM) and Long Short Term Memory combined with FM ...


Analysis Of Gameplay Strategies In Hearthstone: A Data Science Approach, Connor W. Watson 2020 New Jersey Institute of Technology

Theses

In recent years, games have been a popular test bed for AI research, and the presence of Collectible Card Games (CCGs) in that space is still increasing. One such CCG for both competitive/casual play and AI research is Hearthstone, a two-player adversarial game where players seek to implement one of several gameplay strategies to defeat their opponent by reducing the opponent's Health points to zero. Although some open source simulators exist, some of their methodologies for simulated agents create opponents with a relatively low skill level. Using evolutionary algorithms, this thesis seeks to evolve agents with a higher ...


Metabolomic Profiling Of Nicotiana Spp. Nectars Indicate That Pollinator Feeding Preference Is A Stronger Determinant Than Plant Phylogenetics In Shaping Nectar Diversity, Fredy A. Silva, Elizabeth C. Chatt, Siti-Nabilla Mahalim, Adel Guirgis, Xingche Guo, Dan S. Nettleton, Basil J. Nikolau, Robert W. Thornburg 2020 Iowa State University

Statistics Publications

Floral nectar is a rich secretion produced by the nectary gland and is offered as a reward to attract pollinators, leading to improved seed set. Nectars are composed of a complex mixture of sugars, amino acids, proteins, vitamins, lipids, and organic and inorganic acids. This composition is influenced by several factors, including floral morphology, mechanism of nectar secretion, time of flowering, and visitation by pollinators. The objective of this study was to determine the contributions of flowering time, plant phylogeny, and pollinator selection to nectar composition in Nicotiana. The main classes of nectar metabolites (sugars and amino acids) were quantified using gas ...


Decision Tree For Predicting The Party Of Legislators, Afsana Mimi 2020 CUNY New York City College of Technology

Publications and Research

The motivation of the project is to identify the legislators who voted frequently against their party in terms of their roll call votes, using Office of Clerk U.S. House of Representatives data sets collected in 2018 and 2019. We construct a model to predict the parties of legislators based on their votes. The method we use is the decision tree, a data mining technique. Python was used to collect raw data from the internet, SAS was used to clean the data, and all other calculations and graphical presentations were performed using the R software.
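As a rough illustration of the kind of model this abstract describes, the sketch below finds the single roll call vote that best separates two parties, which is the first split a decision tree would make under the Gini impurity criterion. It is not the author's code (the abstract reports using R for the modeling); the function names `gini` and `best_vote_split` are hypothetical, and the six-legislator data set is invented purely for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_vote_split(votes, parties):
    """Return (vote_index, weighted_gini) for the roll call that best
    separates the parties when legislators are split into yea (1) and nay (0)."""
    n = len(parties)
    best = None
    for j in range(len(votes[0])):
        yea = [p for row, p in zip(votes, parties) if row[j] == 1]
        nay = [p for row, p in zip(votes, parties) if row[j] == 0]
        score = (len(yea) * gini(yea) + len(nay) * gini(nay)) / n
        if best is None or score < best[1]:
            best = (j, score)
    return best


# Invented toy data: one row per legislator, one column per roll call vote.
votes = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]
parties = ["D", "D", "D", "R", "R", "R"]

split_index, impurity = best_vote_split(votes, parties)
print(split_index, impurity)
```

In a full tree the same search would be repeated within each resulting group; legislators whose predicted party differs from their actual party would then be candidates for having voted against their party.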


First-Year Computer Science Students: Pathways And Perceptions In Introductory Computer Science Courses, Christina A. LeBlanc 2020 University of Maine

Electronic Theses and Dissertations

This study examined student perceptions and experiences of an introductory Computer Science course at the University of Maine, COS 125: Introduction to Problem Solving Using Computer Programs. It also explored the pathways that students pursue after taking COS 125, depending on their success in the course and their motivation to persist. By characterizing student populations and their performance in their first semester in the Computer Science program, students can be placed into one of three categories that explain their path: a “continuer” (passed COS 125 and decided to stay in the major), a “persister” (did not pass COS 125 and ...


ACT Scores Across Minnesota's Congressional Districts, Katie Moynihan 2020 Concordia University St. Paul

Research and Scholarship Symposium Posters

Data analysis was conducted to test factors which could affect the ACT scores of Minnesota high school students. Average composite scores across the state’s eight congressional districts were evaluated. Factors studied include family income, parental education, diversity, district location, graduation class size, and graduation rate. Methodology and results will be discussed.


Do We Need To Reconsider The CMAM Admission And Discharge Criteria?; An Analysis Of CMAM Data In South Sudan, Eunyong Ahn, Cyprian Ouma, Mesfin Loha, Asrat Dibaba, Wendy Dyment, Jae Kwang Kim, Nam Seon Beck, Taesung Park 2020 Seoul National University

Statistics Publications

Background: Weight-for-height Z-score (WHZ) and Mid Upper Arm Circumference (MUAC) are both commonly used as acute malnutrition screening criteria. However, there is a disparity between the groups they identify as malnourished. Here we aim to investigate the clinical features and linkage with chronicity of the acute malnutrition cases identified by either WHZ or MUAC. In addition, there is evidence that fat restoration is disproportionately rapid compared to muscle gain in hospitalized malnourished children, but related research at the community level is lacking. In this study we suggest a proxy measure to inspect body composition restoration responding to malnutrition management ...

