Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

City University of New York (CUNY)

Articles 1 - 23 of 23

Full-Text Articles in Statistical Models

Making Sense Of Making Parole In New York, Alexandra Mcglinchy Feb 2024

Dissertations, Theses, and Capstone Projects

For many individuals incarcerated in New York, the initial step toward freedom begins with an interview with the Board of Parole. This process, however, is frequently a complex and challenging one, characterized by repeated denials and extended incarcerations. The disparity in outcomes – where one individual may receive over 20 denials and another is granted parole on their first attempt – highlights the ambiguity and inconsistency in the parole decision-making process. This project aims to clarify the factors that influence parole decisions by concentrating on measurable variables. These include age, race, duration of sentence served, proportion of sentence served, type …


Modeling Of COVID-19 Clinical Outcomes In Mexico: An Analysis Of Demographic, Clinical, And Chronic Disease Factors, Livia Clarete Feb 2024

Dissertations, Theses, and Capstone Projects

This study explores COVID-19 clinical outcomes in Mexico, focusing on demographic, clinical, and chronic disease variables to develop predictive models. In the binary classification task, the Ada Boost Classifier distinguishes survivors from non-survivors, with age, sex, ethnicity, and chronic medical conditions influencing outcomes. In multiclass classification, the Gradient Boosting Classifier categorizes patients into outcome groups.

Demographic variables, especially age, are crucial for predicting COVID-19 outcomes for both the binary and multiclass classification tasks. Clinical information about previous conditions, including chronic diseases, also holds relevance, especially diabetes, immunocompromise, and cardiovascular diseases. These insights inform public health measures and healthcare strategies, emphasizing …


Analyzing Relationships With Machine Learning, Oscar Ko Feb 2023

Dissertations, Theses, and Capstone Projects

Procedurally, this project aims to take a dataset, analyze it, and offer insights to the audience in an easy-to-digest format. Conceptually, this project will seek to explore questions like: “Do couples that meet through online dating or dating apps have higher or lower quality relationships?”, “Can any features in this dataset help predict how a subject would rate their relationship quality?”, and “What other insights can I derive from using machine learning for exploratory analysis?” The intended audience for this project is anyone interested in romantic relationships or machine learning.

The dataset is from a Stanford University survey, “How Couples …


ABM Simulation Model Of A Pandemic For Optimizing Vaccination Strategy, Gibeom Park Aug 2022

Theses and Dissertations

This study presents a process-oriented hybrid model for individuals' immune responses and interactions involving vaccination to describe the trend of contagious disease and estimate the future societal cost. The model considers "recovery" as a non-absorbing state and incorporates various infection stage states including two symptomatic states. To model contagiousness consistently with the current pandemic and to capture the dependence of disease spread on people's mobility, we developed an Agent-Based Simulator that is fitted to the particular model used in this study and can test various what-if scenarios. We improved the simulator considerably by applying data structures …


A Course In Data Science: R And Prediction Modeling, Adam Kapelner May 2022

Open Educational Resources

This is a self-contained course in data science and machine learning using R. It covers philosophy of modeling with data, prediction via linear models, machine learning including support vector machines and random forests, probability estimation and asymmetric costs using logistic regression and probit regression, underfitting vs. overfitting, model validation, handling missingness and much more. There is formal instruction of data manipulation using dplyr and data.table, visualization using ggplot2 and statistical computing.


Behavioral Predictive Analytics Towards Personalization For Self-Management – A Use Case On Linking Health-Related Social Needs, Bon Sy, Michael Wassil, Helene Connelly, Alisha Hassan Jan 2022

Publications and Research

The objective of this research is to investigate the feasibility of applying behavioral predictive analytics to optimize patient engagement in diabetes self-management, and to gain insights on the potential of infusing a chatbot with NLP technology for discovering health-related social needs. In the U.S., less than 25% of patients actively engage in self-health management even though self-health management has been reported to be associated with improved health outcomes and reduced healthcare costs. The proposed behavioral predictive analytics relies on manifold clustering to identify subpopulations segmented by behavior readiness characteristics that exhibit non-linear properties. For each subpopulation, an individualized auto-regression model and …


Modeling COVID-19 Spread In Small Colleges, Riti Bahl, Nicole Eikmeier, Alexandra Fraser, Matthew Junge, Felicia Keesing, Kukai Nakahata, Lily Reeves Aug 2021

Publications and Research

We develop an agent-based model on a network meant to capture features unique to COVID-19 spread through a small residential college. We find that a safe reopening requires strong policy from administrators combined with cautious behavior from students. Strong policy includes weekly screening tests with quick turnaround and halving the campus population. Cautious behavior from students means wearing facemasks, socializing less, and showing up for COVID-19 testing. We also find that comprehensive testing and facemasks are the most effective single interventions, building closures can lead to infection spikes in other areas depending on student behavior, and faster return of test …


Application Of Randomness In Finance, Jose Sanchez, Daanial Ahmad, Satyanand Singh May 2021

Publications and Research

Brownian motion, which is also considered to be a Wiener process, can be thought of as a random walk. In our project we briefly discussed the fluctuations of financial indices and related them to Brownian motion and the modeling of stock prices.
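
As a rough illustration of the random-walk view described above, the sketch below simulates a price path driven by discretized Brownian increments. It assumes geometric Brownian motion, a common textbook choice that the abstract does not name; the function name `simulate_gbm` and all parameter values are hypothetical.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, horizon, steps, seed=None):
    """Simulate one price path under geometric Brownian motion (assumed model).

    s0      -- starting price
    mu      -- annualized drift
    sigma   -- annualized volatility
    horizon -- length of the simulation in years
    steps   -- number of equal time steps
    """
    rng = random.Random(seed)
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        # Each Brownian increment is Normal(0, dt): one step of a random walk.
        z = rng.gauss(0.0, 1.0)
        log_growth = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_growth))
    return path


# Hypothetical inputs: one trading year of daily steps.
prices = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, steps=252, seed=1)
```

Working with log-growth keeps every simulated price positive, which is one reason this form is often preferred over adding Gaussian increments directly to the price.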


Species In Vernal Pools: ANOVA, Lisa Manne May 2021

Open Educational Resources

A one-way analysis of variance exercise using data on species diversities from vernal pools. Data are from vernal pools in Willowbrook Park (adjacent to College of Staten Island's campus) in spring.

The typical ANOVA gives a straightforward result (significant ANOVA, easily interpreted Tukey-Kramer analysis). This data set requires more nuanced interpretation, as the ANOVA is marginally significant, and Tukey-Kramer yields one significant pairwise comparison between groups. Relative lack of variation within groups explains this apparent enigma.


Decision Tree For Predicting The Party Of Legislators, Afsana Mimi May 2020

Publications and Research

The motivation of the project is to identify the legislators who frequently voted against their party in roll call votes, using data sets from the Office of the Clerk of the U.S. House of Representatives collected in 2018 and 2019. We construct a model to predict the parties of legislators based on their votes. The method we use is the decision tree from data mining. Python was used to collect raw data from the internet, SAS was used to clean the data, and all other calculations and graphical presentations were performed using R.


Semi-Supervised Regression With Generative Adversarial Networks Using Minimal Labeled Data, Greg Olmschenk Sep 2019

Dissertations, Theses, and Capstone Projects

This work studies the generalization of semi-supervised generative adversarial networks (GANs) to regression tasks. A novel feature layer contrasting optimization function, in conjunction with a feature matching optimization, allows the adversarial network to learn from unannotated data and thereby reduce the number of labels required to train a predictive network. An analysis of simulated training conditions is performed to explore the capabilities and limitations of the method. In concert with the semi-supervised regression GANs, an improved label topology and upsampling technique for multi-target regression tasks are shown to reduce data requirements. Improvements are demonstrated on a wide variety of vision …


GARCH Modeling Of Value At Risk And Expected Shortfall Using Bayesian Model Averaging, Ismail Kheir Aug 2019

Theses and Dissertations

This thesis conducts Value at Risk (VaR) and Expected Shortfall (ES) estimation using GARCH modeling and Bayesian Model Averaging (BMA). BMA considers multiple models weighted by some information criterion. Through BMA, this thesis finds that VaR and ES estimates can be improved through enhanced modeling of the data generation process.
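
To make the weighting idea concrete, here is a minimal sketch of averaging per-model risk estimates with weights derived from an information criterion. It assumes the common approximation in which each weight is proportional to exp(-IC/2); the thesis's actual criterion and weighting scheme are not stated in the abstract, and the function names and numbers below are hypothetical.

```python
import math


def information_criterion_weights(ic_values):
    """Turn information-criterion scores (lower is better) into model weights.

    Uses the common approximation weight_i ∝ exp(-IC_i / 2), an assumption
    here rather than the thesis's documented method.
    """
    best = min(ic_values)
    # Subtracting the best score avoids underflow without changing the weights.
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_estimate(estimates, ic_values):
    """Weighted average of per-model estimates (e.g. VaR from each GARCH variant)."""
    weights = information_criterion_weights(ic_values)
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical one-day 99% VaR estimates from three candidate models,
# with made-up information-criterion scores for each.
var_estimates = [-0.031, -0.035, -0.029]
ic_scores = [2510.4, 2508.9, 2515.2]
combined_var = averaged_estimate(var_estimates, ic_scores)
```

The combined estimate always lies between the smallest and largest individual estimates, and models with much worse scores contribute almost nothing.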


An Overview And Evaluation Of SynthETC: A Statistical Model For Extra-Tropical Cyclones, Rafael Uryayev Jan 2019

Dissertations and Theses

Extratropical cyclones (ETCs) are the most common weather phenomena affecting the United States, Canada, and Europe. They can pose serious hazards over large swaths of area. In this thesis, a statistical model of ETCs, called SynthETC, is discussed. The model accounts for the genesis, track path, termination, and intensity of statistically generated ETCs. Genesis is modeled as a Poisson process, whose mean is determined by climate and historical information. Tracks are modeled as a regression-mean determined by climate and historical information plus a stochastic component. Lysis is modeled using logistic regression, with climate states as covariates. Intensity is modeled …


Hydroclimate Drivers And Atmospheric Dynamics Of Floods, Nasser Najibi Jan 2019

Dissertations and Theses

Our preliminary survey showed that most of the recent flood-related studies did not formally explain the physical mechanisms of long-duration and large-peak flood events that can evoke substantial damages to properties and infrastructure systems. These studies also fell short of fully assessing the interactions of coupled ocean-atmosphere and land dynamics which are capable of forcing substantial changes to the flood attributes by governing the exceeding surface flow regimes and moisture source-sink relationships at the spatiotemporal scales important for risk management. This dissertation advances the understanding of the variability in flood duration, peak, volume, and timing at the regional to the …


Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri Oct 2018

Publications and Research

Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years …


Physical Applications Of The Geometric Minimum Action Method, George L. Poppe Jr. May 2018

Dissertations, Theses, and Capstone Projects

This thesis extends the landscape of rare events problems solved on stochastic systems by means of the geometric minimum action method (gMAM). These include partial differential equations (PDEs) such as the real Ginzburg-Landau equation (RGLE), the linear Schroedinger equation, along with various forms of the nonlinear Schroedinger equation (NLSE) including an application towards an ultra-short pulse mode-locked laser system (MLL).

Additionally, we develop analytical tools that can be used alongside numerics to validate those solutions. This includes the use of instanton methods in deriving state transitions for the linear Schroedinger equation and the cubic diffusive NLSE.

These analytical solutions are …


Assessing The Ordinality Of Response Bias With Item Response Models: A Case Study Using The PHQ-9, Venessa N. Singhroy May 2018

Dissertations, Theses, and Capstone Projects

Improper scale usage in psychological and clinical assessment is an important problem. If respondents do not use the scales in a consistent manner, the reliability of a composite is likely to be attenuated. This is particularly problematic when particular items are singled out for special treatment or when subscales are of interest, not just a total score. This study used both non-parametric and parametric item response theory (IRT) methods to gain further insight into the validity of the PHQ-9, a dual purpose instrument that assesses the severity of depressive symptoms using nine Likert-scale items and allows the investigator to establish …


Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei Jul 2017

Publications and Research

Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.


Quantifying Transit Access In New York City: Formulating An Accessibility Index For Analyzing Spatial And Social Patterns Of Public Transportation, Maxwell S. Siegel May 2016

Theses and Dissertations

This paper aims to analyze accessibility within New York City’s transportation system by creating unique accessibility indices. The indices are detailed and implemented using GIS to analyze the distribution of transit need and access. Regression analyses highlight relationships between demographics and accessibility, and recommendations for transit expansion are presented.


An Evolutionary Vaccination Game In The Modified Activity Driven Network By Considering The Closeness, Dun Han, Mei Sun Sep 2015

Publications and Research

In this paper, we explore an evolutionary vaccination game in the modified activity driven network by considering the closeness. We set a closeness parameter p which is used to describe the way of connection between two individuals. The simulation results show that the closeness p may have an active role in weakening both the spreading of epidemic and the vaccination. Besides, when vaccination is not allowed, the final recovered density increases with the value of the ratio of the infection rate to the recovery rate λ/μ. However, when vaccination is allowed, the final density of recovered individuals first increases and …


Time Series Analysis For Psychological Research: Examining And Forecasting Change, Andrew T. Jebb, Louis Tay, Wei Wang, Qiming Huang Jun 2015

Publications and Research

Psychological research has increasingly recognized the importance of integrating temporal dynamics into its theories, and innovations in longitudinal designs and analyses have allowed such theories to be formalized and tested. However, psychological researchers may be relatively unequipped to analyze such data, given its many characteristics and the general complexities involved in longitudinal modeling. The current paper introduces time series analysis to psychological research, an analytic domain that has been essential for understanding and predicting the behavior of variables across many diverse fields. First, the characteristics of time series data are discussed. Second, different time series modeling techniques are surveyed that …


Using Spatiotemporal Methods To Fill Gaps In Energy Usage Interval Data, Kristin K. Graves May 2015

Theses and Dissertations

Researchers analyzing spatiotemporal or panel data, which varies both in location and over time, often find that their data has holes or gaps. This thesis explores alternative methods for filling those gaps and also suggests a set of techniques for evaluating those gap-filling methods to determine which works best.


Stochastic DEA With A Perfect Object And Its Application To Analysis Of Environmental Efficiency, Alexander Vaninsky Jul 2013

Publications and Research

The paper introduces stochastic DEA with a Perfect Object (SDEA PO). The Perfect Object (PO) is a virtual Decision Making Unit (DMU) that has the smallest inputs and greatest outputs. Including the PO in a collection of actual objects yields an explicit formula of the efficiency index. Given the distributions of DEA inputs and outputs, this formula allows us to derive the probability distribution of the efficiency score, to find its mathematical expectation, and to deliver common (group–related) and partial (object-related) efficiency components. We apply this approach to a prospective analysis of environmental efficiency of the major national and regional …