Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 9 of 9

Full-Text Articles in Physical Sciences and Mathematics

Logistic Regression Under Sparse Data Conditions, David A. Walker, Thomas J. Smith Sep 2020

Journal of Modern Applied Statistical Methods

This study examined the impact of sparse data conditions among one or more predictor variables in logistic regression and assessed the effectiveness of the Firth (1993) procedure in reducing potential parameter estimation bias. Results indicated that sparseness in binary predictors introduces bias that is substantial with small sample sizes, and that the Firth procedure can effectively correct this bias.
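For readers unfamiliar with the Firth (1993) procedure named in this abstract, the following is a minimal sketch of Firth-penalized logistic regression, not the authors' code. It assumes a model with an intercept and a single predictor, uses plain Fisher-scoring steps on the modified score with no step-halving safeguard, and the function name `firth_logistic` and the toy data are hypothetical.

```python
import math

def firth_logistic(x, y, max_iter=200, tol=1e-10):
    """Firth-penalized logistic fit for an intercept plus one predictor x.

    Returns (b0, b1). Illustrative only: no step-halving, two parameters.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for design rows (1, x_i).
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        # Inverse of the 2x2 information matrix.
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det
        # Hat-matrix diagonal: h_i = w_i * (1, x_i) V (1, x_i)'.
        h = [wi * (v00 + 2.0 * v01 * xi + v11 * xi * xi)
             for wi, xi in zip(w, x)]
        # Firth's modified score residual: y_i - p_i + h_i * (0.5 - p_i).
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Scoring step: V * U.
        d0 = v00 * u0 + v01 * u1
        d1 = v01 * u0 + v11 * u1
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical sparse data: every case with x = 1 is an event, so the
# ordinary maximum likelihood slope would be infinite.
x = [0] * 10 + [1] * 4
y = [1] * 3 + [0] * 7 + [1] * 4
b0, b1 = firth_logistic(x, y)
```

With a single binary predictor, the penalized estimates coincide with the log odds computed after adding 0.5 to each cell of the 2x2 table, which gives a convenient check on the iteration.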


Data, Stats, Go: Navigating The Intersections Of Cataloging, E-Resource, And Web Analytics Reporting, Rachel S. Evans, Wendy Moore, Jessica Pasquale, Andre Davison Jul 2020

Presentations

Do you trudge through gathering statistics at fiscal or calendar year-end? Do you wonder why you track certain things, thinking many seem outdated or irrelevant? Many places seem to keep counting certain statistics because "that's what they've always done." For e-resources, how do you integrate those with physical counts and reconcile the variations (updated e-resources versus re-cataloged physical items)? What about repository downloads and other web traffic? The quantity of stats that libraries track is staggering and keeps growing. This program will encourage attendees to stop and evaluate what and why they're gathering data and help identify possible alternatives to …


Exploratory Spatial Data Analysis In Traffic Safety, Amin Azimian, Dimitra Pyrialakou May 2020

International Journal of Geospatial and Environmental Research

This paper presents an exploratory spatial data analysis (ESDA) of road traffic crashes at different severity levels in West Virginia (WV). Although ESDA can support transportation safety decision-making by helping planners understand and summarize crash data, it is underutilized in practice. This paper describes the application of five representative easy-to-use methods to identify crash patterns and high crash-risk counties in WV. Analysis of crash data from 2010 to 2015 indicated that traffic crashes in WV were not spatially correlated. However, crash severities were found to be positively correlated.
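The abstract does not say which five methods were applied. As an illustration of how spatial correlation is commonly measured in ESDA, the sketch below computes global Moran's I, one widely used statistic. The function name `morans_i`, the four-region layout, and the values are hypothetical and are not taken from the paper.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and an n x n spatial weight matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Hypothetical example: four regions in a row, each adjacent to its neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], adjacency)          # similar values cluster
checkerboard = morans_i([1, -1, 1, -1], adjacency)  # neighbors differ
```

Values above the expectation of -1/(n-1) suggest that similar values cluster in space, and values below it suggest that neighbors tend to differ. A significance test, typically by permutation, is needed before drawing conclusions.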


RDC Data Alternatives: Conducting Research During COVID-19, Kristi Thompson, Elizabeth Hill Apr 2020

Western Libraries Presentations

Recent physical distancing protocols pertaining to the COVID-19 pandemic have meant that RDC researchers need to find alternative ways of carrying out their research. The Real Time Remote Access (RTRA) program offers one alternative way to access confidential Statistics Canada data. Other options include using the Statistics Canada public use files and analyzing data from other sources.

The presenters, data librarians from Western Libraries, will discuss the differences between the data that can be accessed through the RTRA and the RDC. RTRA data is a very useful option for some types of questions but also has some important limitations. We will …


Boom Or Bust: Examining The Relationship Between High School Recruiting Rankings And The NFL Draft, Nicholas E. Tice Apr 2020

Senior Theses

The goal of this thesis is to model the probability that a high school football player is drafted, based on information taken from their recruiting profile. The response variable is binary and defined as drafted (1) or undrafted (0). The independent variables were collected by scraping data from the recruiting websites and include height, weight, position, hometown, recruiting grade, and other socioeconomic factors based on the player’s high school. 247Sports and ESPN were the two recruiting services used and compared in this study. Because of the binary nature of the dependent variable, logistic regression and decision trees were chosen …


Complex Systems Analysis In Selected Domains: Animal Biosecurity & Genetic Expression, Luke Trinity Jan 2020

Graduate College Dissertations and Theses

I first broadly define the study of complex systems, identifying language, applicable across disciplines, to describe and characterize the mechanisms of such systems. An overview of methods is provided, including the description of a software development methodology which defines how a combination of computer science, statistics, and mathematics is applied to specified domains. This work describes strategies to facilitate timely completion of robust and adaptable projects which vary in complexity and scope. A biosecurity informatics pipeline is outlined, which is an abstraction useful in organizing the analysis of biological data from cells. This is followed by specific applications of …


Data Rescue & Curation Best Practices Guide, OCUL Data Community (ODC) Data Rescue Group Jan 2020

Western Libraries Publications

The aim of the Data Rescue & Curation Best Practices Guide is to provide an accessible and hands-on approach to handling data rescue and digital curation of at-risk data for use in secondary research. We provide a set of examples and workflows for addressing common challenges with social science survey data that can be applied to other social and behavioural research data. The goal of this guide and set of workflows presented is to improve librarians’ and data curators’ skills in providing access to high-quality, well-documented, and reusable research data. The aspects of data curation that are addressed throughout this …


Data Governance And The Emerging University, Michael J. Madison Jan 2020

Book Chapters

Knowledge and information governance questions are tractable primarily in institutional terms, rather than in terms of abstractions such as knowledge itself or individual or social interests. This chapter offers the modern research university as an example. Practices of data-intensive research by university-based researchers, sometimes reduced to the popular phrase “Big Data,” pose governance challenges for the university. The chapter situates those challenges in the traditional understanding of the university as an institution for understanding forms and flows of knowledge. At a broad level, the chapter argues that the new salience of data exposes emerging shifts in the social, cultural, and …


A Machine Learning Approach To The Perception Of Phrase Boundaries In Music, Evan Matthew Petratos Jan 2020

Senior Projects Fall 2020

Segmentation is a well-studied area of research for speech, but the segmentation of music has typically been treated as a separate domain, even though the same acoustic cues that constitute information in speech (e.g., intensity, timbre, and rhythm) are present in music. This study aims to bridge the gap between research on speech segmentation and research on music segmentation. Musicians can discern where musical phrases are segmented. In this study, these boundaries are predicted using an algorithmic, machine learning approach to audio processing of acoustic features. The acoustic features of musical sounds have localized patterns within sections of the music that create aurally …