Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 1 - 30 of 69

Full-Text Articles in Statistical Models

Research On Chinese Data Sovereignty Policy Based On LDA Model And Policy Instruments, Han Qiao, Junru Xu Mar 2024

Bulletin of Chinese Academy of Sciences (Chinese Version)

Data sovereignty has become an important component of national sovereignty in the dual context of the digital economy development and the overall national security concept. Major countries and regions are actively carrying out data sovereignty strategic deployment and engaging in fierce competition in data resources, data technology, and data rules. This work adopts the policy text analysis method to study China’s data sovereignty policy, and employs the LDA model and policy instruments to quantitatively analyze the process evolution and thematic characteristics of China’s data sovereignty policy. Drawing on these findings, this study comprehensively considers the global data sovereignty policy and …


Model Selection Through Cross-Validation For Supervised Learning Tasks With Manifold Data, Derek Brown Jan 2024

The Journal of Purdue Undergraduate Research

No abstract provided.


Sensitivity Analysis Of Prior Distributions In Regression Model Estimation, Ayoade I Adewole, Oluwatoyin K. Bodunwa Jan 2024

Al-Bahir Journal for Engineering and Pure Sciences

Bayesian inference depends solely on the specification and accuracy of the likelihood and the prior distributions for the observed data. This research examines Bayesian estimation of regression models as a way to reduce some of the problems posed by conventional estimation methods, such as handling complex models, small sample sizes, and the inclusion of background information in the estimation procedure. Posterior distributions are based on the prior distributions and the accuracy of the data, which is the fundamental principle Bayesian statistics relies on to produce accurate final model estimates. Sensitivity analysis is an essential part of mathematical model validation in obtaining …


Reducing Food Scarcity: The Benefits Of Urban Farming, S.A. Claudell, Emilio Mejia Dec 2023

Journal of Nonprofit Innovation

Urban farming can enhance the lives of communities and help reduce food scarcity. This paper presents a conceptual prototype of an efficient urban farming community that can be scaled for a single apartment building or an entire community across all global geoeconomic regions, including densely populated cities and rural, developing towns and communities. When deployed in coordination with smart crop choices, local farm support, and efficient transportation, the result is not just sustainability but also increased fresh produce accessibility, optimized nutritional value, elimination of ‘forever chemicals’, reduced transportation costs, and global environmental benefits.

Imagine Doris, who is …


Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. McClure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre Dec 2023

SMU Data Science Review

Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …


The Impacts Of The COVID-19 Pandemic On Mental Health Across Different Genders And Sexualities, Jiale Zhu, Jonas Katona Nov 2023

Undergraduate Research Journal for the Human Sciences

Current studies report an increase in psychological distress as a result of the COVID-19 pandemic. This study is interested in examining mental health disparities and how the COVID-19 pandemic has disproportionately impacted marginalized groups—and more specifically, those identified by sex, gender, and sexuality—compared with the general population. This study also considers the effects and ramifications of different policy measures taken during the course of the pandemic. We perform exploratory data modeling and analysis on several important and publicly available datasets taken during the pandemic on mental health and COVID-19 infection data across various identity groups to look for significant disparities, …


Decentralized Science (DeSci): A New Paradigm For Diverse And Sustainable Scientific Development, Feiyue Wang, Wenwen Ding Oct 2023

Bulletin of Chinese Academy of Sciences (Chinese Version)

The rise of artificial intelligence for science (AI4S) has made it particularly important and urgent to ensure the openness, fairness, impartiality, diversity, and sustainability of scientific systems. This is significant to the discourse power and leadership of countries in global innovation and industrial revolution, and also affects the security, stability, and sustainable development of a community with a shared future for mankind. To address these challenges, AI4S needs to adopt new scientific organizational and operational methods. Decentralized science (DeSci) has emerged to vitalize AI4S and provide strong support, effectively addressing issues such as information silos, biases, unfair distribution, and monopolies …


Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy Aug 2023

SMU Data Science Review

American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …


Public Acceptance Of Guidance And Regulations For Space Flight Participation, Cory Trunkhill, Robert Joslin, Joseph Keebler May 2023

Journal of Aviation Technology and Engineering

Space flight participants are not professional astronauts and not subject to the rules and guidance covering space flight crewmembers. Ordinal logistic regression of survey data was utilized to explore public acceptance of current medical screening recommendations and regulations for safety risk and implied liability for civil space flight participation. Independent variables constituted participant demographic representations while dependent variables represented current Federal Aviation Administration guidance and regulations. Odds ratios were derived based on the demographic categories to interpret likelihood of acceptance for the criteria. Significant likely acceptance of guidance and regulations was found for five of twelve demographic variables influencing public …


Bridging The Chasm Between Fundamental, Momentum, And Quantitative Investing, Allen Hoskins, Jeff Reed, Robert Slater Apr 2023

SMU Data Science Review

A chasm exists between the active public equity investment management industry's fundamental, momentum, and quantitative styles. In this study, the researchers explore ways to bridge this gap by leveraging domain knowledge, fundamental analysis, momentum, crowdsourcing, and data science methods. This research also seeks to test the developed tools and strategies during the volatile time period of 2020 and 2021.


Comparison Of Sampling Methods For Predicting Wine Quality Based On Physicochemical Properties, Robert Burigo, Scott Frazier, Eli Kravez, Nibhrat Lohia Apr 2023

SMU Data Science Review

Numerous studies have used the physicochemical properties of wine to predict quality. Given the nature of these properties, the data is inherently skewed. Previous works have focused on a handful of sampling techniques to balance the data. This research compares multiple sampling techniques in predicting the target with limited data. For this purpose, an ensemble model is used to evaluate the different techniques. This research found no evidence that specific oversampling methods improve a random forest classifier for a multi-class problem.


A New Generalized Gamma-Weibull Distribution And Its Applications, Nihimat Iyebuhola Aleshinloye, Samuel Adewale Aderoju, Alfred Adewole Abiodun, Bako Lukmon Taiwo Apr 2023

Al-Bahir Journal for Engineering and Pure Sciences

In this paper, a New Generalized Gamma-Weibull (NGGW) distribution is developed by compounding Weibull and generalized gamma distribution. Some mathematical properties such as moments, Rényi entropy and order statistics are derived and discussed. The maximum likelihood estimation (MLE) method is used to estimate the model parameters. The proposed model is applied to two real-life datasets to illustrate its performance and flexibility as compared to some other competing distributions. The results obtained show that the new distribution fits each of the data better than the other competing distributions.


Self-Learning Algorithms For Intrusion Detection And Prevention Systems (IDPS), Juan E. Nunez, Roger W. Tchegui Donfack, Rohit Rohit, Hayley Horn Mar 2023

SMU Data Science Review

Today, there is an increased risk to data privacy and information security due to cyberattacks that compromise data reliability and accessibility. New machine learning models are needed to detect and prevent these cyberattacks. One application of these models is cybersecurity threat detection and prevention systems that can create a baseline of a network's traffic patterns to detect anomalies without needing pre-labeled data; thus, enabling the identification of abnormal network events as threats. This research explored algorithms that can help automate anomaly detection on an enterprise network using Canadian Institute for Cybersecurity data. This study demonstrates that Neural Networks with Bayesian …


Biasing Estimator To Mitigate Multicollinearity In Linear Regression Model, Abdulrasheed Bello Badawaire, Issam Dawoud, Adewale Folaranmi Lukman, Victoria Laoye, Arowolo Olatunji Jan 2023

Al-Bahir Journal for Engineering and Pure Sciences

A new two-parameter estimator was developed to combat the threat of multicollinearity for the linear regression model. Some necessary and sufficient conditions for the dominance of the proposed estimator over ordinary least squares (OLS) estimator, ridge regression estimator, Liu estimator, KL estimator, and some two-parameter estimators are obtained in the matrix mean square error sense. Theory and simulation results show that, under some conditions, the proposed two-parameter estimator consistently dominates other estimators considered in this study. The real-life application result follows suit.
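
For orientation, the standard textbook forms of three of the comparison estimators named in this abstract, together with the matrix mean square error (MSEM) criterion, can be sketched as follows. This assumes the usual linear model $y = X\beta + \varepsilon$ with $\operatorname{Cov}(\varepsilon) = \sigma^2 I$; the symbols $k$ and $d$ are the conventional biasing parameters and are assumptions of this sketch. The proposed two-parameter estimator is not stated in the abstract and is not reproduced here.

```latex
\begin{align}
\hat{\beta}_{\mathrm{OLS}} &= (X^{\top}X)^{-1}X^{\top}y, \\
\hat{\beta}_{\mathrm{ridge}}(k) &= (X^{\top}X + kI)^{-1}X^{\top}y, \quad k > 0, \\
\hat{\beta}_{\mathrm{Liu}}(d) &= (X^{\top}X + I)^{-1}\bigl(X^{\top}y + d\,\hat{\beta}_{\mathrm{OLS}}\bigr), \quad 0 < d < 1, \\
\operatorname{MSEM}(\hat{\beta}) &= \operatorname{Cov}(\hat{\beta}) + \operatorname{bias}(\hat{\beta})\,\operatorname{bias}(\hat{\beta})^{\top}.
\end{align}
```

Under this criterion, one estimator is usually said to dominate another when the difference of their MSEM matrices (the other's minus its own) is positive definite.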


Aircraft Damage Classification By Using Machine Learning Methods, Tüzün Tolga İnan Jan 2023

International Journal of Aviation, Aeronautics, and Aerospace

Safety is the most significant factor in the incidents (non-fatal) and accidents (fatal) that have occurred on scheduled flights throughout civil aviation history. The total number of incidents and accidents on scheduled flights up to 2022 is 1988. This study considers the 677 that have occurred since 11 September 2001. Its purpose is to reveal the factors that can classify the type of aircraft damage (none, minor, or substantial) in these incidents and accidents. ML algorithms with different configurations are applied for the classification process. RFE and PCA are used to find the most …


Stochastic Optimization To Reduce Aircraft Taxi-In Time At IGIA, New Delhi, Rajib Das, Saileswar Ghosh, Rajendra Desai, Pijus Kanti Bhuin, Stuti Agarwal Jan 2023

International Journal of Aviation, Aeronautics, and Aerospace

Because flight arrival times are uncertain, pre-scheduled allocation of runways and stands followed by first-come-first-served treatment results in a sub-optimal allocation of runways and stands. This is the prime reason for the unusual delays in taxi-in times at IGIA, New Delhi.

We simulated the arrival pattern of aircraft and utilized stochastic optimization to arrive at the best runway-stands allocation for a day. Optimization is done using a GRG Non-Linear algorithm in the Frontline Systems Analytic Solver platform. We applied this model to eight representative scenarios of two different days. Our results show that without …


Study On Innovation Networks And Its Spillover Effect Of China’s New Energy Automobile Industry, Zhifei Xiong, Wenzhong Zhang Dec 2022

Bulletin of Chinese Academy of Sciences (Chinese Version)

The network spillover effect of knowledge has been playing an increasingly significant role in the development of industrial innovation. The urban cooperation matrix of China’s new energy automobile industry is built based on new energy automobile patent data, and the structure and evolution process of China’s new energy automobile industry are depicted. On this basis, the spatial Durbin model (SDM) is used to calculate the network spillover effect, and its results are compared with the results of spillover effect based on the relationship of spatial contiguity and distance of cities. The results show that the innovation activities of China’s new …


An Attempt To Develop A Measurement Tool For Interpretation Performance Of Tourist Guides, Gizem Capar, Dilek Atci Oct 2022

University of South Florida (USF) M3 Publishing

The search for different experiences in touristic visits makes it necessary for tour guides to differentiate their tours. Interpretation lies at the heart of this differentiation. This research aims to examine empirically the structure of tour guides’ interpretation performance within the framework of the E.R.O.T/T.O.R.E model. For this purpose, in line with the literature, the conceptual structure of interpretation performance and interpretative guiding was first determined, and expert opinion was then sought on an expression pool consisting of draft statements. After the expert review process, the measurement tool was first applied to a sample of 191 participants. For preliminary analysis the performance of …


Classification Of Breast Cancer Histopathological Images Using Semi-Supervised GANs, Balaji Avvaru, Nibhrat Lohia, Sowmya Mani, Vijayasrikanth Kaniti Sep 2022

SMU Data Science Review

Breast cancer is diagnosed more frequently than skin cancer in women in the United States. Most breast cancer cases are diagnosed in women, while children and men are less likely to develop the disease. Various tissues in the breast grow uncontrollably, resulting in breast cancer. Different treatments analyze microscopic histopathology images for diagnosis that help accurately detect cancer cells. Deep learning is one of the evolving techniques to classify images where accuracy depends on the volume and quality of labeled images. This study used various pre-trained models to train the histopathological images and analyze these models to create a new …


Predicting Insulin Pump Therapy Settings, Riccardo L. Ferraro, David Grijalva, Alex Trahan Sep 2022

SMU Data Science Review

Millions of people live with diabetes worldwide [7]. To mitigate some of the many symptoms associated with diabetes, an estimated 350,000 people in the United States rely on insulin pumps [17]. For many of these people, how effectively their insulin pump performs is the difference between sleeping through the night and a life-threatening emergency treatment at a hospital. Three programmed insulin pump therapy settings governing effective insulin pump function are: Basal Rate (BR), Insulin Sensitivity Factor (ISF), and Carbohydrate Ratio (ICR). For many people using insulin pumps, these therapy settings are often not correct, given their physiological needs. While …


Application Of Probabilistic Ranking Systems On Women’s Junior Division Beach Volleyball, Cameron Stewart, Michael Mazel, Bivin Sadler Sep 2022

SMU Data Science Review

Women’s beach volleyball is one of the fastest growing collegiate sports today. The increase in popularity has come with an increase in valuable scholarship opportunities across the country. With thousands of athletes to sort through, college scouts depend on websites that aggregate tournament results and rank players nationally. This project partnered with the company Volleyball Life, which is the current market leader in the ranking space of junior beach volleyball players. Utilizing the tournament information provided by Volleyball Life, this study explored replacements to the current ranking systems, which are designed to aggregate player points from recent tournament placements. Three …


Between “Breaking” And “Building”: The Bridge Theory Of Research Evaluation, Fang Xu, Xiaoxuan Li Aug 2022

Bulletin of Chinese Academy of Sciences (Chinese Version)

How to build "new standards" after breaking "Siwei" is a hot and difficult issue in the current reform of research evaluation, which urgently needs good theoretical and methodological support. In this context, this study puts forward the BRIDGE theory of research evaluation of scientific researchers' achievements, which is to integrate the reasonable elements in the quantitative evaluation based on SCI papers into the "new standard" based on peer review, so as to build a bridge between quantitative analysis and qualitative evaluation. The practical application of BRIDGE theory is expressed as "Six Steps", in which the second step "Recode" and the …


Adjusting Community Survey Data Benchmarks For External Factors, Allen Miller, Nicole M. Norelli, Robert Slater, Mingyang N. Yu Jun 2022

SMU Data Science Review

Using U.S. resident survey data from the National Community Survey in combination with public data from the U.S. Census and additional sources, a Voting Regressor Model was developed to establish fair benchmark values for city performance. These benchmarks were adjusted for characteristics the city cannot easily influence that contribute to confidence in local government, such as population size, demographics, and income. This adjustment allows for a more meaningful comparison and interpretation of survey results among individual cities. Methods explored for the benchmark adjustment included cluster analysis, anomaly detection, and a variety of regression techniques, including random forest, ridge, decision …


Identification And Characterization Of Forest Fire Risk Zones Leveraging Machine Learning Methods, Joshua Balson, Matt Chinchilla, Cam Lu, Jeff Washburn, Nibhrat Lohia Dec 2021

SMU Data Science Review

Across the United States, record numbers of wildfires are observed costing billions of dollars in property damage, polluting the environment, and putting lives at risk. The ability of emergency management professionals, city planners, and private entities such as insurance companies to determine if an area is at higher risk of a fire breaking out has never been greater. This paper proposes a novel methodology for identifying and characterizing zones with increased risks of forest fires. Methods involving machine learning techniques use the widely available and recorded data, thus making it possible to implement the tool quickly.


Empirical Fitting Of Periodically Repeating Environmental Data, Pavel Bělík, Andrew Hotchkiss, Brandon Perez, John Zobitz Aug 2021

Spora: A Journal of Biomathematics

We extend and generalize an approach for fitting models to periodically repeating data. Our method first detrends the data from a baseline function and then fits the data to a periodic (trigonometric, polynomial, or piecewise linear) function. The polynomial and piecewise linear functions are developed from assumptions of continuity and differentiability across each time period. We apply this approach to different datasets in the environmental sciences in addition to a synthetic dataset. Overall, the polynomial and piecewise linear approaches developed here performed as well as (or better than) the trigonometric approach when evaluated using statistical measures (R2 …
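
The two-step idea in this abstract (detrend against a baseline, then fit a periodic function to what remains) can be illustrated with a minimal sketch. It assumes a linear baseline, a single trigonometric harmonic with a known period, and synthetic data; the function names `fit_linear_trend` and `fit_single_harmonic` are hypothetical, and the authors' polynomial and piecewise linear variants are not reproduced.

```python
import math


def fit_linear_trend(t, y):
    """Ordinary least-squares line y ~ intercept + slope * t (assumed baseline)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def fit_single_harmonic(t, r, period):
    """Least-squares fit r ~ a*cos(w*t) + b*sin(w*t), with w = 2*pi/period."""
    w = 2.0 * math.pi / period
    c = [math.cos(w * ti) for ti in t]
    s = [math.sin(w * ti) for ti in t]
    scc = sum(ci * ci for ci in c)
    sss = sum(si * si for si in s)
    scs = sum(ci * si for ci, si in zip(c, s))
    src = sum(ri * ci for ri, ci in zip(r, c))
    srs = sum(ri * si for ri, si in zip(r, s))
    det = scc * sss - scs * scs  # 2x2 normal equations, solved by Cramer's rule
    a = (src * sss - srs * scs) / det
    b = (srs * scc - src * scs) / det
    return a, b


# Synthetic example: linear trend plus a cosine with period 12 (e.g. monthly data).
t = [float(k) for k in range(240)]
y = [2.0 + 0.5 * ti + 3.0 * math.cos(2.0 * math.pi * ti / 12.0) for ti in t]

intercept, slope = fit_linear_trend(t, y)                      # step 1: detrend
residuals = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
a, b = fit_single_harmonic(t, residuals, period=12.0)          # step 2: periodic fit
print(f"slope={slope:.3f}, a={a:.3f}, b={b:.3f}")
```

Because the trend is removed before the harmonic is fitted, the recovered coefficients are close to, but not exactly, the generating values; a joint fit of trend and harmonic would remove that small bias.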


Modeling Reproduction Influencers Of An Endangered Oak, Camila Cortez Aug 2021

DePaul Discoveries

The endemic oak Quercus brandegeei has been labeled as endangered by the IUCN Red List of Endangered Species due to its limited genetic diversity and lack of regeneration. Oaks (Quercus) are keystone species in many parts of the world and have been facing various challenges to their survival (Westwood 2017), making efforts to support and protect endemic oaks all the more ecologically and socially imperative. Identifying threats is challenging because many characteristics of Q. brandegeei’s biology that are essential to carrying out conservation efforts remain unknown. To develop a greater understanding of …


Estimating The Size Of Georgia's Resident Canada Goose Population, Gregory D. Balkcom Feb 2021

Georgia Journal of Science

Canada geese (Branta canadensis) are an important waterfowl species in Georgia, and are hunted across the state. To meet management objectives, managers need to understand the impacts of hunting regulations on the population of interest. Therefore, reliable population estimates are necessary. Population size can be estimated by various methods, including aerial surveys, ground surveys, or population indices such as the Lincoln Estimator. I used annual estimates of resident Canada goose harvest in Georgia from the U.S. Fish and Wildlife Service’s Harvest Information Program along with banding and recovery data from the Bird Banding Laboratory in a bias-adjusted version …
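
As a rough illustration of the Lincoln Estimator mentioned in this abstract, the basic form divides total harvest by the harvest rate estimated from banded birds. The sketch below assumes every recovered band is reported and that banded birds are representative of the population; the function name `lincoln_estimate` and the numbers are hypothetical, and the bias-adjusted version used by the author is not reproduced.

```python
def lincoln_estimate(harvest, banded, recovered):
    """Basic Lincoln estimate N = H / h, where h = recovered / banded.

    harvest   -- estimated total harvest H
    banded    -- number of birds banded before the hunting season
    recovered -- number of those bands recovered from harvested birds
    """
    if banded <= 0 or recovered <= 0:
        raise ValueError("banded and recovered must be positive")
    # Algebraically equal to harvest / (recovered / banded).
    return harvest * banded / recovered


# Hypothetical figures, for illustration only.
print(lincoln_estimate(harvest=10000, banded=500, recovered=50))
```

With these made-up inputs the estimated harvest rate is 10%, so the estimate is ten times the harvest.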


Influence Of Some Climatic Elements On Radon Concentration In Saeva Dupka Cave, Bulgaria, Peter Nojarov, Petar Stefanov, Karel Turek Dec 2020

International Journal of Speleology

This study reveals the influence of some climatic elements on radon concentration in Saeva Dupka Cave, Bulgaria. The research is based mainly on statistical methods. Radon concentration in the cave is determined by two main mechanisms. The first one is through penetration of radon from soil and rocks around the cave (present all year round, but has a leading role during the warm half of the year). The second one is through thermodynamic exchange of air between the inside of the cave and the outside atmosphere (cold half of the year). Climatic factors that affect radon concentration in the cave are temperatures (air, …


Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman Nov 2020

Access*: Interdisciplinary Journal of Student Research and Scholarship

The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …


A Differential Geometry-Based Machine Learning Algorithm For The Brain Age Problem, Justin Asher, Khoa Tan Dang, Maxwell Masters Aug 2020

The Journal of Purdue Undergraduate Research

No abstract provided.