Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Categorical Data Analysis

2021

Articles 1 - 30 of 31

Full-Text Articles in Physical Sciences and Mathematics

Smoking, Alcohol Consumption, And Depression In Association With Incidence Of Type 2 Diabetes Among Mexican Americans In Starr County, Texas, Gabriela Rubannelsonkumar Dec 2021

Honors Program Theses and Research Projects

Previous studies on conditions like obesity, hypertension, and type 2 diabetes mellitus (T2DM) have explored the correlations between them and various other human conditions, including aortic stiffness, left ventricular hypertrophy, and sleep apnea, as they predict possibilities of developing certain diseases in Mexican Americans. This study aims to observe the correlation between lifestyle decisions that could relate to the onset of depression in normal, prediabetic, and diabetic individuals. These include smoking habits and alcohol consumption. Many papers have previously conducted research on these lifestyle habits as they relate to obesity, hypertension, and diabetes; however, they have done so in a singular …


Aspect-Based Sentiment Analysis Of Movie Reviews, Samuel Onalaja, Eric Romero, Bosang Yun Dec 2021

SMU Data Science Review

This study investigates a comparison of classification models used to determine aspect-based separated text sentiment and predict binary sentiments of movie reviews with genre- and aspect-specific driving factors. To gain a broader classification analysis, five machine and deep learning algorithms were compared: Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), and Recurrent Neural Network Long-Short-Term Memory (RNN LSTM). The various movie aspects that are utilized to separate the sentences are determined through aggregating aspect words from lexicon-based, supervised, and unsupervised learning. The driving factors are randomly assigned to various movie aspects and their impact tied to …


Identification And Characterization Of Forest Fire Risk Zones Leveraging Machine Learning Methods, Joshua Balson, Matt Chinchilla, Cam Lu, Jeff Washburn, Nibhrat Lohia Dec 2021

SMU Data Science Review

Across the United States, record numbers of wildfires are observed costing billions of dollars in property damage, polluting the environment, and putting lives at risk. The ability of emergency management professionals, city planners, and private entities such as insurance companies to determine if an area is at higher risk of a fire breaking out has never been greater. This paper proposes a novel methodology for identifying and characterizing zones with increased risks of forest fires. Methods involving machine learning techniques use the widely available and recorded data, thus making it possible to implement the tool quickly.


The Development Of Authentic Virtual Reality Scenarios To Measure Individuals’ Level Of Systems Thinking Skills And Learning Abilities, Vidanelage L. Dayarathna Dec 2021

Theses and Dissertations

This dissertation develops virtual reality modules to capture individuals’ learning abilities and systems thinking skills in dynamic environments. In the first chapter, an immersive queuing theory teaching module is developed using virtual reality technology. The objective of the study is to present systems engineering concepts in a more sophisticated environment and measure students’ learning abilities. Furthermore, the study explores the performance gaps between male and female students in manufacturing systems concepts. To investigate the gender biases toward the performance of the developed VR module, three efficacy measures (simulation sickness questionnaire, systems usability scale, and presence questionnaire) and two effectiveness measures (NASA …


Integration Of Blockchain Technology Into Automobiles To Prevent And Study The Causes Of Accidents, John Kim Dec 2021

Electronic Theses, Projects, and Dissertations

Automobile collisions occur daily. We now live in an information-driven world, one where technology is quickly evolving. Blockchain technology can change the automotive industry, the safety of the motoring public and its surrounding environment by incorporating this vast array of information. It can place safety and efficiency at the forefront for pedestrians and public establishments, and provide public agencies with pertinent information securely and efficiently. Other industries in which Blockchain technology has been effective include supply chain management, logistics, and banking. This paper reviews some statistical information regarding automobile collisions, Blockchain technology, Smart Contracts, and Smart Cities; assesses the feasibility …


Ecological Risk Assessment For The Temperate Demersal Elasmobranch Resource, Department Of Primary Industries And Regional Development, Western Australia Oct 2021

Fisheries research reports

No abstract provided.


Otoliths Of South-Western Australian Fish: A Photographic Catalogue, Chris Dowling, Kim Smith, Elain Lek, Joshua Brown Sep 2021

Fisheries research reports

No abstract provided.


Squid And Cuttlefish Resources Of Western Australia, Daniel Yeoh, Danielle J. Johnston PhD, David C. Harris Sep 2021

Fisheries research reports

No abstract provided.


Why Does An Ex-Offender Reoffend?, Jacob Rybak Aug 2021

Symposium of Student Scholars

What leads an offender to go back to prison? Iowa has collected data tracking recidivism to evaluate the effectiveness of its programs for released offenders. This data set includes the following for all of the offenders: age group, type of release (parole vs. being discharged at the end of their sentence), race, sex, year of release, supervising district, original offense, and whether they recidivated. For the offenders who return to prison, the data set includes measures on days to return, type of recidivism (technicality or new crime), and the specific offense that caused their return.

In the …


Bayesian Variable Selection Strategies In Longitudinal Mixture Models And Categorical Regression Problems., Md Nazir Uddin Aug 2021

Electronic Theses and Dissertations

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by University of Michigan. Considering yearly out-of-pocket expenditures as the longitudinal response variable, we consider a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC …


Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang Jul 2021

Doctoral Dissertations

In the process of statistical modeling, the descriptive modeling plays an essential role in accelerating the formulation of plausible hypotheses in the subsequent explanatory modeling and facilitating the selection of potential variables in the subsequent predictive modeling. Especially, for multivariate categorical data analysis, it is desirable to use the descriptive modeling methods for uncovering and summarizing the potential association structure among multiple categorical variables in a compact manner. However, many classical methods in this case either rely on strong assumptions for parametric models or become infeasible when the data dimension is higher. To this end, we propose a model-free method …


Statistical Modeling For High-Dimensional Compositional Data With Applications To The Human Microbiome, Thy Dao Jul 2021

Graduate Theses and Dissertations

Compositional data refer to the data that lie on a simplex, which are common in many scientific domains such as genomics, geology, and economics. As the components in a composition must sum to one, traditional tests based on unconstrained data become inappropriate, and new statistical methods are needed to analyze this special type of data. This dissertation is motivated by some statistical problems arising in the analysis of compositional data. In particular, we focus on the high-dimensional and over-dispersed setting, where the dimensionality of compositions is greater than the sample size and the dispersion parameter is moderate or large. In …
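The sum-to-one constraint described in this abstract can be illustrated with a minimal sketch. It assumes strictly positive raw counts and uses the centered log-ratio (CLR) transform, a common device in compositional data analysis that is not necessarily the method used in the dissertation. The function names and the example counts are hypothetical.

```python
import math

def closure(parts):
    """Rescale positive parts so they sum to one (a point on the simplex)."""
    total = sum(parts)
    if total <= 0:
        raise ValueError("parts must have a positive total")
    return [p / total for p in parts]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log.

    Assumes every part is strictly positive; zero parts need separate handling.
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical counts for three taxa in one sample
counts = [120, 30, 50]
comp = closure(counts)   # proportions that sum to one
z = clr(comp)            # unconstrained coordinates that sum to zero
```

Because the proportions are tied together by the constraint, raising one part necessarily lowers the others, which is why tests built for unconstrained data can mislead here.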


Privacy-Preserving Cloud-Assisted Data Analytics, Wei Bao Jul 2021

Graduate Theses and Dissertations

Nowadays industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our life. Data analytics (e.g., via the use of machine learning) has been extensively applied to make important decisions in various real world applications. However, it is challenging for resource-limited clients to analyze their data in an efficient way when its scale is large. Additionally, the data resources are increasingly distributed among different owners. Nonetheless, users' data may contain private information that needs to be protected.

Cloud computing has become more and more popular in …


Knowledge Discovery From Complex Event Time Data With Covariates, Samira Karimi Jul 2021

Graduate Theses and Dissertations

In particular engineering applications, such as reliability engineering, complex types of data are encountered which require novel methods of statistical analysis. Handling covariates properly while managing the missing values is a challenging task. These types of issues happen frequently in reliability data analysis. Specifically, accelerated life testing (ALT) is usually conducted by exposing test units of a product to severer-than-normal conditions to expedite the failure process. The resulting lifetime and/or censoring data are often modeled by a probability distribution along with a life-stress relationship. However, if the probability distribution and life-stress relationship selected cannot adequately describe the underlying failure …


Grizzly Bears Mortalities And The Survival Of The Species, Courtney Swanson Jun 2021

Senior Seminars and Capstones

In this paper we aim to understand patterns in grizzly bear mortalities from 2010 to 2020. We apply Classification and Regression Tree (CART) methods and Correspondence Analysis to data provided by the U.S. Geological Survey (USGS). We found certain variables in the data set to be important through CART methods. Correspondence Analysis then allowed us to compare these variables to determine their relationships and association to one another. Most of the grizzly bear deaths are human-caused and mainly over land and resources such as food and habitat. This aligns with some of …


Data Analysis And Visualization To Dismantle Gender Discrimination In The Field Of Technology, Quinn Bolewicki Jun 2021

Dissertations, Theses, and Capstone Projects

In the United States, a significant population is facing an uphill battle trying to thrive in an industry that has seen exponential growth in recent years. Women, who account for approximately 50.8% of the U.S. population, are statistically underpaid and underrepresented in science, technology, engineering, and mathematics (STEM). Despite women-led technology teams establishing a 21% greater return on investment than teams that are not women-led, and young women largely outperforming men in math according to a 2015 study, there are only three Fortune 500 companies led by women, and women comprise only 10% of internet entrepreneurs. Research generates hundreds of articles, infographics, …


Why Does An Ex-Offender Reoffend?, Jacob Rybak May 2021

Symposium of Student Scholars

What leads an offender to go back to prison? This researcher has lived in the Georgia State prison system for 3.5 years. Using personal insights as well as analytics, this researcher analyzes Iowa state’s six-year data set tracking recidivism of released offenders and recommends changes to the prison system to address the analytical findings.

The Iowa recidivism data set includes the following information for all offenders: age group, type of release (parole vs different discharges), release year, original offense, and whether they recidivated. For the recidivating offenders, the data set includes the days to return to prison, the type of …


Access To Higher Education: Do Schools “Grant” Success?, Nathaniel Jones May 2021

Symposium of Student Scholars

University education can lead to upward income mobility for low-income students. Being exposed to other students’ life experiences that are different from their own may highlight activities and actions that they may want to consider to aid their success. According to the U.S. Bureau of Labor Statistics, the median weekly earnings in 2019 for all workers in the U.S. was $969. Of those, U.S. workers who held bachelor’s degrees earned $1,248. In 2016, the Brookings Institution found that Pell Grant recipients and first-generation student loan borrowers attended universities that had lower graduation rates and higher loan default rates in comparison to …


Use Of Linear Discriminant Analysis In Song Classification: Modeling Based On Wilco Albums, Caroline Pollard May 2021

Honors Theses

The study of music recommender algorithms is a relatively new area of study. Although these algorithms serve a variety of functions, they primarily help advertise and suggest music to users on music streaming services. This thesis explores the use of linear discriminant analysis in music categorization for the purpose of serving as a cheaper and simpler content-based recommender algorithm. The use of linear discriminant analysis was tested by creating linear discriminant functions that classify Wilco’s songs into their respective albums, specifically A.M., Yankee Hotel Foxtrot, and Sky Blue Sky. Four sample songs were chosen from each album, and song data was …


Association Between Stream Impairment By Mercury And Superfund Sites In The Conterminous USA, Karessa L. Manning May 2021

Masters Theses

Mercury is a natural element that can cause harm to the brain, heart, kidneys, lungs, and immune system, especially to fetuses developing in the womb. Many natural and anthropogenic factors contribute to mercury in the environment, such as geologic deposits, landfills, gold and silver mining operations, cement production, and atmospheric deposition. Mercury has been identified as a contaminant of concern at many National Priority List (NPL) sites; however, studies on contamination at NPL sites are often conducted only on a local level. This study analyzes the potential connection between mercury-contaminated NPL sites and the presence of mercury impaired …


A Comparison Of The Localized Aviation MOS Program (LAMP) And Terminal Aerodrome Forecast (TAF) Accuracy For General Aviation, Douglas D. Boyd, Thomas A. Guinn Apr 2021

Journal of Aviation Technology and Engineering

Background. For general aviation (GA) pilots, operations in instrument meteorological conditions (IMC) carry an elevated risk of a fatal accident. Aerodrome-specific forecasts (TAF, LAMP) provide guidance on whether a general aviation flight can be safely undertaken. Although LAMP forecasts are more common for GA-frequented aerodromes, the FAA recommends that for such aerodromes for which a TAF is not issued, the airman use the TAF generated for the geographically closest airport for pre-flight weather evaluation. Herein, for non-TAF-issuing airports, the LAMP (sLAMP) predictive accuracy for visual (VFR) and instrument (IFR) flight rules flight category was determined.

Method. sLAMP …


Application Of Cycle-By-Cycle Analysis To EEG Data From Individuals With Phelan-McDermid Syndrome, Naomi Miller Apr 2021

ENGS 88 Honors Thesis (AB Students)

This study aimed to analyze a novel method of processing data from electroencephalography (EEG) recordings, which implements time-domain cycle-by-cycle analysis. This "bycycle" method, developed by the Cole & Voytek laboratory, was implemented on an EEG dataset of children with and without Phelan-McDermid Syndrome in the hopes of uncovering network-level explanations for the genetic disorder. A supplemental Python pipeline was developed to organize and visualize the data. This led to the discovery of group-level differences in measures of cycle symmetry in alpha band waves over the sensorimotor electrodes. Through the same pipeline, the bycycle tool was validated as a sound EEG …


Analyzing Student Experience On Group Work With The Application Of Different Group Allocation Approaches, An Yee Tan Mar 2021

Management and HR

Working as a group can be as challenging as working by oneself. Common issues like ineffective group work, unequal work contribution, and poor communication are believed to be the reasons why many students prefer to work individually. The purpose of this study is to understand whether there is a disparity in student experience of group work under two different methods of group formation, namely intentional group formation and random assignment. Topics around team well-being, team communication, and team effectiveness are the main focus of this study. The second emphasis of this study is students’ opinions on whether or not …


Behavior Of Lightning In Developing Storms, Erick A. Tello Mar 2021

Theses and Dissertations

Air Force weather squadrons issue a warning when lightning activity is observed within 5 nautical miles (NM) of protected areas. Upon receiving this warning, personnel outdoors are expected to pause work and move inside. Studies sponsored by the 45th Weather Squadron (45 WS) have concluded that the 5 NM warning radius can be safely reduced for well-developed storms. This thesis investigates whether radii for storms in early development can also be reduced. Our research develops algorithms to partition lightning sensor data into storms. Next, storms are filtered to their earliest lightning events, and the study calculates distances between successive early …
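The 5 NM warning criterion mentioned in this abstract amounts to a distance check, which can be sketched as follows. This is only an illustration, assuming latitude/longitude coordinates in degrees and a spherical Earth (haversine formula); the thesis's own storm-partitioning algorithms are not shown in the abstract and are not reproduced here. The function names and coordinates are hypothetical.

```python
import math

# Mean Earth radius in nautical miles (6371 km / 1.852 km per NM); an assumption
EARTH_RADIUS_NM = 3440.065

def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))

def within_warning_radius(strike, site, radius_nm=5.0):
    """True if a lightning strike (lat, lon) lies within radius_nm of a site (lat, lon)."""
    return haversine_nm(strike[0], strike[1], site[0], site[1]) <= radius_nm

# Hypothetical protected site and two hypothetical strikes
site = (28.50, -80.60)
near_strike = (28.55, -80.60)   # about 3 NM north of the site
far_strike = (28.70, -80.60)    # about 12 NM north of the site
```

Passing a smaller `radius_nm` shows how a reduced warning radius would change which strikes trigger a warning.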


Node Classification On Relational Graphs Using Deep-RGCNs, Nagasai Chandra Mar 2021

Master's Theses

Knowledge Graphs are fascinating concepts in machine learning as they can hold usefully structured information in the form of entities and their relations. Despite the valuable applications of such graphs, most knowledge bases remain incomplete. This missing information harms downstream applications such as information retrieval and opens a window for research in statistical relational learning tasks such as node classification and link prediction. This work proposes a deep learning framework based on existing relational convolutional (R-GCN) layers to learn on highly multi-relational data characteristic of realistic knowledge graphs for node property classification tasks. We propose a deep and improved variant, …


Carbon Dioxide And Particulate Matter Concentration On Hampton Roads Air Quality, Gregory Hubbard Jan 2021

OUR Journal: ODU Undergraduate Research Journal

Hampton Roads has been a maritime crossroads for the last 400 years. Industrialization has impacted the coastal region for the last 250 years. The expansion of the Port of Virginia in 2019 has created dense traffic in the region resulting in impacts to air quality. Two waste products that affect humans are particulate matter and carbon dioxide. Both respective emissions can cause adverse effects on humans, such as asthma, some lung cancers, and other respiratory distress. Scientists and health practitioners are studying the effects of particulate matter on human health. Hampton Roads, in particular, because of its unique location on …


An Evaluation Of The Performance Of PROC ARIMA's Identify Statement: A Data-Driven Approach Using COVID-19 Cases And Deaths In Florida, Fahmida Akter Shahela Jan 2021

Electronic Theses and Dissertations, 2020-

Understanding data on the novel coronavirus (COVID-19) pandemic and modeling such data over time are crucial for decision making in managing, fighting, and controlling the spread of this emerging disease. This thesis work looks at some aspects of exploratory analysis and modeling of COVID-19 data obtained from the Florida Department of Health (FDOH). In particular, the present work is devoted to data collection, preparation, description, and modeling of COVID-19 cases and deaths reported by FDOH between March 12, 2020, and April 30, 2021. For modeling data on both cases and deaths, this thesis utilized an autoregressive integrated moving average (ARIMA) times …


Maternal Proximity To Mountaintop Removal Mining And Birth Defects In Appalachian Kentucky, 1997-2003, Daniel B. Cooper Jan 2021

Theses and Dissertations--Public Health (M.P.H. & Dr.P.H.)

Background: Extraction of coal through mountaintop removal mining (MTR) alters many dimensions of the landscape, and explosive blasts, exposed rock, and coal washing have the potential to pollute air and water with substances known to increase risk of developmental and birth anomalies. Previous research suggests that infants born to mothers living in MTR coal mining counties have higher prevalence of most types of birth defects.

Objectives: This study seeks to examine further the relationship between MTR activity and birth defects by employing individual level exposure estimation through precise satellite data of MTR activity in the Appalachian region and maternal residence …


Statistical Approaches For Estimation And Comparison Of Brain Functional Connectivity, Jifang Zhao Jan 2021

Theses and Dissertations

Drug addiction can lead to many health-related problems and social concerns. Functional connectivity obtained from functional magnetic resonance imaging (fMRI) data promotes a variety of fundamental understandings in such association. Due to its complex correlation structure and large dimensionality, the modeling and analysis of the functional connectivity from neuroimage are challenging. By proposing a spatio-temporal model for multi-subject neuroimage data, we incorporate voxel-level spatio-temporal dependencies of whole-brain measurements to improve the accuracy of statistical inference. To tackle large-scale spatio-temporal neuroimage data, we develop a computationally efficient algorithm to estimate the parameters. Our method is used to identify functional connectivity and …


Novel Nonparametric Testing Approaches For Multivariate Growth Curve Data: Finite-Sample, Resampling And Rank-Based Methods, Ting Zeng Jan 2021

Theses and Dissertations--Statistics

Multivariate growth curve data naturally arise in various fields, for example, biomedical science, public health, agriculture, social science and so on. For data of this type, the classical approach is to conduct multivariate analysis of variance (MANOVA) based on Wilks' Lambda and other multivariate statistics, which require the assumptions of multivariate normality and homogeneity of within-cell covariance matrices. However, data being analyzed nowadays show marked departure from multivariate normal distribution and homoscedasticity. In this dissertation, we investigate nonparametric testing approaches for multivariate growth curve data from three aspects, i.e., finite-sample, resampling and rank-based methods.

The first project proposes an approximate …