Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics

Categorical Data Analysis

Articles 1 - 29 of 29

Full-Text Articles in Physical Sciences and Mathematics

Influence Diagnostics For Generalized Estimating Equations Applied To Correlated Categorical Data, Louis Vazquez Apr 2023

Statistical Science Theses and Dissertations

Influence diagnostics in regression analysis allow analysts to identify observations that have a strong influence on model fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s Distance for linear regression, are based on a deletion approach where the results of a model with and without observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEE) for correlated, or clustered, nominal multinomial responses. The proposed influence diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas for both observation- and …


Split Classification Model For Complex Clustered Data, Katherine Gerot Mar 2022

Honors Theses

Classification in high-dimensional data has generated tremendous interest in a multitude of fields. Data in higher dimensions often tend to reside in non-Euclidean metric spaces. This prevents Euclidean-based classification methodologies, such as regression, from reliably modeling the data. Many proposed models rely on computationally complex embedding to convert the data to a more usable format. Others, namely the Support Vector Machine, rely on kernel manipulation to implicitly describe the "feature space" to arrive at a non-linear decision boundary. The proposed methodology in this paper seeks to classify complex data in a relatively computationally simple and explainable manner.


A Monte Carlo Simulation Of Rat Choice Behavior With Interdependent Outcomes, Michelle A. Frankot Jan 2022

Graduate Theses, Dissertations, and Problem Reports

Preclinical behavioral neuroscience often uses choice paradigms to capture psychiatric symptoms. In particular, the subfield of operant research produces nested datasets with many discrete choices in a session. The standard analytic practice is to aggregate choice into a continuous variable and analyze using ANOVA or linear regression. However, choice data often have multiple interdependent outcomes of interest, violating an assumption of general linear models. The aim of the current study was to quantify the accuracy of linear mixed-effects regression (LMER) for analyzing data from a 4-choice operant task called the Rodent Gambling Task (RGT), which measures decision-making in the context …


Data Analysis And Visualization To Dismantle Gender Discrimination In The Field Of Technology, Quinn Bolewicki Jun 2021

Dissertations, Theses, and Capstone Projects

In the United States, a significant population is facing an uphill battle trying to thrive in an industry that has seen exponential growth in recent years. Women, who account for approximately 50.8% of the U.S. population, are statistically underpaid and underrepresented in science, technology, engineering, and mathematics (STEM). Despite women-led technology teams establishing a 21% greater return on investment than teams not led by women, and young women largely outperforming men in math according to a 2015 study, there are only three Fortune 500 companies led by women, and women comprise only 10% of internet entrepreneurs. Research generates hundreds of articles, infographics, …


How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, Prashiddhi Pokhrel Apr 2021

Thinking Matters Symposium

Due to the pandemic, people have started relying more on television, news, social media, and other news outlets for guidance. Moreover, with the increasing amount of news, data, and information, there is also an increase in the amount of misleading statistics. People’s opinions and decisions depend significantly on the data, statistics, and information that they are exposed to, as well as their sources. For this project, we want to look at how information and its sources are affecting the decisions made by the general public about using the Portland Transit System. It is very important to know why …


Can Statcast Variables Explain The Variation In Weighted Runs Created Plus?, Ryan Kupiec Dec 2020

Student Research

The release of Statcast data in 2015 was revolutionary for data analysis in the game of baseball. Many analysts have begun using this data regularly, but none have used it exclusively. Often older, less reliable statistics (on-base percentage) are still favored over the newer statistics (weighted runs created plus). In this paper, we attempt to explain the variation in weighted runs created plus (wRC+) using Statcast variables such as exit velocity and launch angle. We find that exit velocity, along with other Statcast variables, can explain as much as 70% of the variation in wRC+. Launch angle can …


Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman Nov 2020

Access*: Interdisciplinary Journal of Student Research and Scholarship

The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …


“Playing The Whole Game”: A Data Collection And Analysis Exercise With Google Calendar, Albert Y. Kim, Johanna Hardin Aug 2020

Statistical and Data Sciences: Faculty Publications

We provide a computational exercise suitable for early introduction in an undergraduate statistics or data science course that allows students to “play the whole game” of data science: performing both data collection and data analysis. While many teaching resources exist for data analysis, such resources are not as abundant for data collection given the inherent difficulty of the task. Our proposed exercise centers around student use of Google Calendar to collect data with the goal of answering the question “How do I spend my time?” On the one hand, the exercise involves answering a question with near universal appeal, but …


Inference Of Heterogeneity In Meta-Analysis Of Rare Binary Events And Rss-Structured Cluster Randomized Studies, Chiyu Zhang Dec 2019

Statistical Science Theses and Dissertations

This dissertation contains two topics: (1) A Comparative Study of Statistical Methods for Quantifying and Testing Between-study Heterogeneity in Meta-analysis with Focus on Rare Binary Events; (2) Estimation of Variances in Cluster Randomized Designs Using Ranked Set Sampling.

Meta-analysis, the statistical procedure for combining results from multiple studies, has been widely used in medical research to evaluate intervention efficacy and safety. In many practical situations, the variation of treatment effects among the collected studies, often measured by the heterogeneity parameter, may exist and can greatly affect the inference about effect sizes. Comparative studies have been done for only one or …


Market Research On Student Concert Attendance At Bgsu's College Of Musical Arts, Mary Solomon May 2019

Honors Projects

Bowling Green State University boasts a well-established College of Musical Arts, which holds concerts performed by esteemed faculty, prestigious guest artists, and students. The school hosts these events in Kobacker Hall and Bryan Recital Hall, which can accommodate up to 800 and 250 audience members, respectively. However, performances in Kobacker Hall fill only one-fourth of the 800 seats, on average. Why is this so? This project aims to investigate the factors that influence students’ decisions to attend concerts at the College of Musical Arts (CMA). Using survey research and statistical analysis, this project will look into …


The Evolution Of Data Science: A New Mode Of Knowledge Production, Jennifer Lewis Priestley, Robert J. Mcgrath Apr 2019

Faculty and Research Publications

Is data science a new field of study or simply an extension or specialization of a discipline that already exists, such as statistics, computer science, or mathematics? This article explores the evolution of data science as a potentially new academic discipline, which has evolved as a function of new problem sets that established disciplines have been ill-prepared to address. The authors find that this newly evolved discipline can be viewed through the lens of a new mode of knowledge production and is characterized by transdisciplinarity, collaboration with the private sector, and increased accountability. Lessons from this evolution can inform knowledge production …


Comparative Analysis Of Students’ Performance Between Online And On Campus In An Introductory Statistics Course, Kendal Mcdonald Jan 2019

The Corinthian

In this research, we compare students’ performance in online and on-campus introductory statistics and probability courses at Georgia College. MyStatLab is the learning management system used in both the online and on-campus courses for homework and quizzes. The online data come from five summer courses between Summer 2014 and Summer 2017, and the on-campus data come from nine on-campus courses from Spring 2014, Spring 2016, and Spring 2017. For homework, the research compares the scores of the online and on-campus courses. For quizzes, we test if there is a difference between the scores and the number of attempts …


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane Jan 2019

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …


Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John Aug 2018

SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age, …


A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz Dec 2016

Doctor of Business Administration Dissertations

At heart every trader loves volatility; this is where return on investment comes from, this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices with reliability, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with …


Scientific Awareness At Ursinus College, Frank G. Devone Apr 2015

Mathematics Honors Papers

Ursinus College prides itself on creating well-rounded students, and recent initiatives, such as the Fellowships in the Ursinus Transition to the Undergraduate Research Experience Program and the Center for Science and the Common Good, suggest that science is a vital part of the Ursinus liberal arts mission. A scientific awareness pilot survey was administered to a sample of Ursinus students drawn from the Class of 2014 and students residing at Ursinus during summer 2014. Experience and data collected from this pilot were used to create a final survey which was made available to all students at Ursinus College. The survey …


Revising Common Core Georgia Performance Standards Statistics Lesson Plans To Better Align With Statistical Practice, Rachel Bonilla Jan 2013

Electronic Theses and Dissertations

In this thesis, lesson plans provided by the Georgia Department of Education are revised to give students better exposure and practice working with real-life data. Three learning tasks and a performance task are presented covering a unit lesson on statistical regression. The development of Georgia statistics curriculum standards is reviewed and presented.


An Analysis Of Risk Reduction Choices In Dcis Breast Cancer Patients, Lauren Soltesz Dec 2012

Statistics

The main focus of this paper was to evaluate possible demographic and clinical characteristics associated with a woman’s choice of breast conserving surgery (BCS), unilateral mastectomy (ULM), or bilateral risk reduction mastectomy (BRRM). The cohort consisted of patients presenting to the City of Hope National Medical Center with ductal carcinoma in situ breast cancer who elected to have cancer directed surgery (N=305). Analyses to examine associations of patient characteristics with type of surgery were conducted using a multinomial logistic regression. Results showed that older women were more likely to choose breast conserving surgery over bilateral risk reduction mastectomy than younger …


Using The R Library Rpanel For Gui-Based Simulations In Introductory Statistics Courses, Ryan M. Allison May 2012

Statistics

As a student, I noticed that the statistical package R (http://www.r-project.org) would offer several benefits in the classroom. One benefit of the package is its free and open-source nature. This would be a great benefit for instructors and students alike since it would cost nothing to use, unlike other statistical packages. Because of this, students could continue using the program after their statistics courses and into their professional careers. It would be good to expose students while they are in school to a tool that professionals use in industry. R also has powerful …


Software Internationalization: A Framework Validated Against Industry Requirements For Computer Science And Software Engineering Programs, John Huân Vũ Mar 2010

Master's Theses

View John Huân Vũ's thesis presentation at http://youtu.be/y3bzNmkTr-c.

In 2001, the ACM and IEEE Computing Curriculum stated that it was necessary to address "the need to develop implementation models that are international in scope and could be practiced in universities around the world." With increasing connectivity through the internet, the move towards a global economy and the growing use of technology make software internationalization a more important concern for developers. However, there has been a "clear shortage in terms of numbers of trained persons applying for entry-level positions" in this area. Eric Brechner, Director of Microsoft Development Training, suggested …


Statistical Section (June 1984), Central Bank Of Nigeria Cbn Jun 1984

Economic and Financial Review

Statistical tables on central banking, commercial banking, merchant banking, international liquidity, international trade, the flow of funds, money supply, national savings, production, consumer prices, public debt and public finance.


Statistical Section (September 1983), Central Bank Of Nigeria Cbn Sep 1983

Economic and Financial Review

The statistical tables comprise the following: Central Bank of Nigeria statement of assets & liabilities, Central Bank monthly rediscounts gross, Commercial Banks' statement of assets and liabilities, Analysis of Commercial Banks' loans and advances, Selected predominant interest rates, Ratio of loans and advances to deposits, Net external assets of Commercial Banks, and Liquidity ratios of Commercial and Merchant Banks.


The Maine Coast, A Statistical Source, Maine Coastal Program Sep 1978

Maine Collection

Maine Coastal Program, Natural Resource Planning Division, Maine State Planning Office, Augusta, Maine

First Printing June 1978 - Second Printing September 1978

Contents: Preface / Introduction / Chapter 1 - Demography / Chapter 2 - Land Use and Taxation / Chapter 3 - Economy / Chapter 4 - Housing / Chapter 5 - Transportation / Chapter 6 - Education / Chapter 7 - Recreation / Chapter 8 - Social Services / Chapter 9 - Natural Resources / References / Index


Statistical Section (June 1972), Central Bank Of Nigeria Cbn Jun 1972

Economic and Financial Review

The dataset contains data on central banking, commercial banking, currency in circulation, external assets, international trade, money and capital, money supply, national savings, production, and public debt, covering 1966-1971.


Statistical Section (June 1970), Central Bank Of Nigeria Cbn Jun 1970

Economic and Financial Review

The dataset includes data for central bank and commercial banks' statements of assets and liabilities, currency in circulation, external assets, international trade, money and capital markets, money supply, national savings, production, public debt, and public finance.


Statistical Section (June 1969), Central Bank Of Nigeria Cbn Jun 1969

Economic and Financial Review

The dataset for the June 1969 issue of the Economic and Financial Review contains data for central banking, commercial banking, money supply, money and capital markets, public debt, national savings, external assets, international trade, production, fuel and power, and public finance, covering 1962 to 1969.


Statistical Section (December 1968), Central Bank Of Nigeria Cbn Dec 1968

Economic and Financial Review

The published statistics include analysis of commercial bank loans and advances, finance agricultural exports, money and capital markets, national savings, international trade, and fuel and power, along with some newly included headings such as transportation and communications, while some headings have been merged into others.


Statistical Section December 1966, Central Bank Of Nigeria Cbn Dec 1966

Economic and Financial Review

The statistics included in this publication for 1966 are the central bank statement of assets and liabilities, currency in circulation, commercial banking, money and capital markets, public finance, international trade, production, and fuel and power, among others.


Statistical Section (June 1966), Central Bank Of Nigeria Cbn Jun 1966

Economic and Financial Review

This section of the EFR reports the statistics of the Central Bank of Nigeria (CBN) from 1960 to the first half of 1966. This includes the Bank’s statement of assets and liabilities as well as its rediscount operations. Other sections include: currency in circulation, money supply, commercial banking activities, money and capital markets, public finance, national savings, external assets and international trade. The remaining sections include: Production (agricultural produce, mineral production), fuel and power, electricity generation and consumption.