Open Access. Powered by Scholars. Published by Universities.®

Categorical Data Analysis Commons

448 Full-Text Articles, 668 Authors, 259,886 Downloads, 101 Institutions

All Articles in Categorical Data Analysis

448 full-text articles. Page 5 of 18.

Application Of Cycle-By-Cycle Analysis To EEG Data From Individuals With Phelan-McDermid Syndrome, Naomi Miller 2021 Dartmouth College

ENGS 88 Honors Thesis (AB Students)

This study aimed to analyze a novel method of processing data from electroencephalography (EEG) recordings, which implements time-domain cycle-by-cycle analysis. This "bycycle" method, developed by the Cole & Voytek laboratory, was implemented on an EEG dataset of children with and without Phelan-McDermid Syndrome in the hope of uncovering network-level explanations for the genetic disorder. A supplemental Python pipeline was developed to organize and visualize the data. This led to the discovery of group-level differences in measures of cycle symmetry in alpha band waves over the sensorimotor electrodes. Through the same pipeline, the bycycle tool was validated as a sound EEG …


Analyzing Student Experience On Group Work With The Application Of Different Group Allocation Approaches, An Yee Tan 2021 California Polytechnic State University, San Luis Obispo

Management and HR

Working as a group can be as challenging as working by oneself. Common issues like ineffective group work, unequal work contribution, and poor communication are believed to be the reasons why many students prefer to work individually. The purpose of this study is to understand whether student experience of group work differs between two methods of group formation: intentional group formation and random assignment. Topics around team well-being, team communication, and team effectiveness are the main focus of this study. The second emphasis of this study is students’ opinions on whether or not …


Node Classification On Relational Graphs Using Deep-RGCNs, Nagasai Chandra 2021 California Polytechnic State University, San Luis Obispo

Master's Theses

Knowledge Graphs are fascinating concepts in machine learning as they can hold usefully structured information in the form of entities and their relations. Despite the valuable applications of such graphs, most knowledge bases remain incomplete. This missing information harms downstream applications such as information retrieval and opens a window for research in statistical relational learning tasks such as node classification and link prediction. This work proposes a deep learning framework based on existing relational convolutional (R-GCN) layers to learn on highly multi-relational data characteristic of realistic knowledge graphs for node property classification tasks. We propose a deep and improved variant, …


Behavior Of Lightning In Developing Storms, Erick A. Tello 2021 Air Force Institute of Technology

Theses and Dissertations

Air Force weather squadrons issue a warning when lightning activity is observed within 5 nautical miles (NM) of protected areas. Upon receiving this warning, personnel outdoors are expected to pause work and move inside. Studies sponsored by the 45th Weather Squadron (45 WS) have concluded that the 5 NM warning radius can be safely reduced for well-developed storms. This thesis investigates whether radii for storms in early development can also be reduced. Our research develops algorithms to partition lightning sensor data into storms. Next, storms are filtered to their earliest lightning events, and the study calculates distances between successive early …


Carbon Dioxide And Particulate Matter Concentration On Hampton Roads Air Quality, Gregory Hubbard 2021 Old Dominion University

OUR Journal: ODU Undergraduate Research Journal

Hampton Roads has been a maritime crossroads for the last 400 years. Industrialization has impacted the coastal region for the last 250 years. The expansion of the Port of Virginia in 2019 has created dense traffic in the region, with resulting impacts on air quality. Two waste products that affect humans are particulate matter and carbon dioxide. Both emissions can cause adverse effects in humans, such as asthma, some lung cancers, and other respiratory distress. Scientists and health practitioners are studying the effects of particulate matter on human health. Hampton Roads, in particular, because of its unique location on …


Statistical Approaches For Estimation And Comparison Of Brain Functional Connectivity, Jifang Zhao 2021 Virginia Commonwealth University

Theses and Dissertations

Drug addiction can lead to many health-related problems and social concerns. Functional connectivity obtained from functional magnetic resonance imaging (fMRI) data promotes a variety of fundamental understandings in such association. Due to its complex correlation structure and large dimensionality, the modeling and analysis of the functional connectivity from neuroimage are challenging. By proposing a spatio-temporal model for multi-subject neuroimage data, we incorporate voxel-level spatio-temporal dependencies of whole-brain measurements to improve the accuracy of statistical inference. To tackle large-scale spatio-temporal neuroimage data, we develop a computationally efficient algorithm to estimate the parameters. Our method is used to identify functional connectivity and …


An Evaluation Of The Performance Of PROC ARIMA's Identify Statement: A Data-Driven Approach Using COVID-19 Cases And Deaths In Florida, Fahmida Akter Shahela 2021 University of Central Florida

Electronic Theses and Dissertations, 2020-

Understanding data on the novel coronavirus (COVID-19) pandemic and modeling such data over time are crucial for decision making in managing, fighting, and controlling the spread of this emerging disease. This thesis looks at some aspects of exploratory analysis and modeling of COVID-19 data obtained from the Florida Department of Health (FDOH). In particular, the present work is devoted to data collection, preparation, description, and modeling of COVID-19 cases and deaths reported by FDOH between March 12, 2020, and April 30, 2021. For modeling data on both cases and deaths, this thesis utilized an autoregressive integrated moving average (ARIMA) times …


Neither “Post-War” Nor Post-Pregnancy Paranoia: How America’s War On Drugs Continues To Perpetuate Disparate Incarceration Outcomes For Pregnant, Substance-Involved Offenders, Becca S. Zimmerman 2021 Pitzer College

Pitzer Senior Theses

This thesis investigates the unique interactions between pregnancy, substance involvement, and race as they relate to the War on Drugs and the hyper-incarceration of women. Using ordinary least squares regression analyses and data from the Bureau of Justice Statistics’ 2016 Survey of Prison Inmates, I examine if (and how) pregnancy status, drug use, race, and their interactions influence two length-of-incarceration outcomes: sentence length and amount of time spent in jail between arrest and imprisonment. The results collectively indicate that pregnancy decreases length-of-incarceration outcomes for those offenders who are not substance-involved but not evenhandedly -- benefitting white …


Maternal Proximity To Mountaintop Removal Mining And Birth Defects In Appalachian Kentucky, 1997-2003, Daniel B. Cooper 2021 University of Kentucky

Theses and Dissertations--Public Health (M.P.H. & Dr.P.H.)

Background: Extraction of coal through mountaintop removal mining (MTR) alters many dimensions of the landscape, and explosive blasts, exposed rock, and coal washing have the potential to pollute air and water with substances known to increase risk of developmental and birth anomalies. Previous research suggests that infants born to mothers living in MTR coal mining counties have higher prevalence of most types of birth defects.

Objectives: This study seeks to examine further the relationship between MTR activity and birth defects by employing individual level exposure estimation through precise satellite data of MTR activity in the Appalachian region and maternal residence …


Novel Nonparametric Testing Approaches For Multivariate Growth Curve Data: Finite-Sample, Resampling And Rank-Based Methods, Ting Zeng 2021 University of Kentucky

Theses and Dissertations--Statistics

Multivariate growth curve data naturally arise in various fields, such as biomedical science, public health, agriculture, and social science. For data of this type, the classical approach is to conduct multivariate analysis of variance (MANOVA) based on Wilks' Lambda and other multivariate statistics, which require the assumptions of multivariate normality and homogeneity of within-cell covariance matrices. However, data being analyzed nowadays show marked departures from the multivariate normal distribution and from homoscedasticity. In this dissertation, we investigate nonparametric testing approaches for multivariate growth curve data from three aspects: finite-sample, resampling, and rank-based methods.

The first project proposes an approximate …


Adaptive Ensemble Of Classifiers With Regularization For Imbalanced Data Classification, Chen Wang, Chengyuan Deng, Zhoulu Yu, Dafeng Hui, Xiaofeng Gong, Ruisen Luo 2020 Sichuan University

Biology Faculty Research

The dynamic ensemble selection of classifiers is an effective approach for processing label-imbalanced data classifications. However, such a technique is prone to overfitting, owing to the lack of regularization methods and the dependence on local geometry of data. In this study, focusing on binary imbalanced data classification, a novel dynamic ensemble method, namely adaptive ensemble of classifiers with regularization (AER), is proposed, to overcome the stated limitations. The method solves the overfitting problem through a new perspective of implicit regularization. Specifically, it leverages the properties of stochastic gradient descent to obtain the solution with the minimum norm, thereby achieving regularization; …


Can Statcast Variables Explain The Variation In Weighted Runs Created Plus?, Ryan Kupiec 2020 DePauw University

Student Research

The release of Statcast data in 2015 was revolutionary for data analysis in the game of baseball. Many analysts have begun using this data regularly, but none have used it exclusively. Often older, less reliable statistics (on-base percentage) are still used in favor of the newer statistics (weighted runs created plus). In this paper, we attempt to explain the variation in weighted runs created plus (wRC+) using Statcast variables such as exit velocity and launch angle. We find that exit velocity, along with other Statcast variables, can explain as much as 70% of the variation in wRC+. Launch angle can …


Developing A Tourism Opportunity Index Regarding The Prospective Of Overtourism In Nepal, Susan Phuyal 2020 Missouri State University

MSU Graduate Theses

This research explores Nepal's overtourism scenario based on the capacity of a locality to manage sustainable tourism practices. Environmental degradation, local infrastructure degradation, negative tourist experience, and local resident responses regarding visitors are the four main variables used in this study to analyze overtourism. For the case study, we select Nepal's three top tourist cities (Kathmandu, Pokhara, and Chitwan) based on the number of annual visitors. Nepal's case analysis of overtourism conditions reviews the overall threat of overtourism and establishes a metric by which tourism can be viewed as potentially detrimental to sustainability. …


Direct Questioning Of Sensitive Topics In Public Health Studies: A Simulation Study, Jessica K. Fox, Evrim Oral 2020 LSU Health Sciences Center, School of Public Health, Biostatistics Program

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman 2020 University of Washington, Tacoma

Access*: Interdisciplinary Journal of Student Research and Scholarship

The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …


Predicting Postoperative Delirium Risk For Intracranial Surgery: A Statistical Machine Learning Approach, Juliet Aygun, Alaina Bartfeld, Sahana Rayan 2020 Purdue University

The Journal of Purdue Undergraduate Research

No abstract provided.


Evaluation Of China Shipping Hub-And-Spoke Network Based On Herfindahl-Hirschmann Index (HHI), Wenjin Sun 2020 World Maritime University

World Maritime University Dissertations

No abstract provided.


“Playing The Whole Game”: A Data Collection And Analysis Exercise With Google Calendar, Albert Y. Kim, Johanna Hardin 2020 Smith College

Statistical and Data Sciences: Faculty Publications

We provide a computational exercise suitable for early introduction in an undergraduate statistics or data science course that allows students to “play the whole game” of data science: performing both data collection and data analysis. While many teaching resources exist for data analysis, such resources are not as abundant for data collection given the inherent difficulty of the task. Our proposed exercise centers around student use of Google Calendar to collect data with the goal of answering the question “How do I spend my time?” On the one hand, the exercise involves answering a question with near universal appeal, but …


Analyzing The Fractal Dimension Of Various Musical Pieces, Nathan Clark 2020 University of Arkansas, Fayetteville

Industrial Engineering Undergraduate Honors Theses

One of the most common tools for evaluating data is regression. This technique, widely used by industrial engineers, explores linear relationships between predictors and the response. Each observation of the response is a fixed linear combination of the predictors with an added error element. The method is built on the assumption that this error is normally distributed across all observations and has a mean of zero. In some cases, it has been found that the inherent variation is not the result of a random variable, but is instead the result of self-symmetric properties of the observations. For data with these …


Integrating Data Science Ethics Into An Undergraduate Major, Benjamin Baumer, Randi L. Garcia, Albert Y. Kim, Katherine M. Kinnaird, Miles Q. Ott 2020 Smith College

Statistical and Data Sciences: Faculty Publications

We present a programmatic approach to incorporating ethics into an undergraduate major in statistical and data sciences. We discuss departmental-level initiatives designed to meet the National Academy of Sciences recommendation for weaving ethics into the curriculum from top-to-bottom as our majors progress from our introductory courses to our senior capstone course, as well as from side-to-side through co-curricular programming. We also provide six examples of data science ethics modules used in five different courses at our liberal arts college, each focusing on a different ethical consideration. The modules are designed to be portable such that they can be flexibly incorporated …

