Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

1,358 Full-Text Articles | 2,008 Authors | 853,222 Downloads | 156 Institutions

All Articles in Statistical Models

Self-Learning Algorithms For Intrusion Detection And Prevention Systems (IDPS), Juan E. Nunez, Roger W. Tchegui Donfack, Rohit Rohit, Hayley Horn 2023 Southern Methodist University

SMU Data Science Review

Today, there is an increased risk to data privacy and information security due to cyberattacks that compromise data reliability and accessibility. New machine learning models are needed to detect and prevent these cyberattacks. One application of these models is cybersecurity threat detection and prevention systems that can create a baseline of a network's traffic patterns to detect anomalies without needing pre-labeled data, thus enabling the identification of abnormal network events as threats. This research explored algorithms that can help automate anomaly detection on an enterprise network using Canadian Institute for Cybersecurity data. This study demonstrates that Neural Networks with Bayesian …


A Characterization Of Bias Introduced Into Forensic Source Identification When There Is A Subpopulation Structure In The Relevant Source Population., Dylan Borchert, Semhar Michael, Christopher Saunders 2023 South Dakota State University

SDSU Data Science Symposium

In forensic source identification the forensic expert is responsible for providing a summary of the evidence that allows for a decision maker to make a logical and coherent decision concerning the source of some trace evidence of interest. The academic consensus is usually that this summary should take the form of a likelihood ratio (LR) that summarizes the likelihood of the trace evidence arising under two competing propositions. These competing propositions are usually referred to as the prosecution’s proposition, that the specified source is the actual source of the trace evidence, and the defense’s proposition, that another source in a …


Analyzing Relationships With Machine Learning, Oscar Ko 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Procedurally, this project aims to take a dataset, analyze it, and offer insights to the audience in an easy-to-digest format. Conceptually, this project will seek to explore questions like: “Do couples that meet through online dating or dating apps have higher or lower quality relationships?”, “Can any features in this dataset help predict how a subject would rate their relationship quality?”, and “What other insights can I derive from using machine learning for exploratory analysis?” The intended audience for this project is anyone interested in romantic relationships or machine learning.

The dataset is from a Stanford University survey, “How Couples …


Biasing Estimator To Mitigate Multicollinearity In Linear Regression Model, Abdulrasheed Bello Badawaire, Issam Dawoud, Adewale Folaranmi Lukman, Victoria Laoye, Arowolo Olatunji 2023 Department of Mathematics and Statistics, Federal University Wukari, Wukari, Nigeria

Al-Bahir Journal for Engineering and Pure Sciences

A new two-parameter estimator was developed to combat the threat of multicollinearity in the linear regression model. Necessary and sufficient conditions for the dominance of the proposed estimator over the ordinary least squares (OLS) estimator, the ridge regression estimator, the Liu estimator, the KL estimator, and some two-parameter estimators are obtained in the matrix mean square error sense. Theory and simulation results show that, under some conditions, the proposed two-parameter estimator consistently dominates the other estimators considered in this study. The real-life application result follows suit.


On Partially Observed Tensor Regression, Dinara Miftyakhetdinova 2023 University of Windsor

Major Papers

Tensor data is widely used in modern data science. The interest lies in identifying and characterizing the relationship between tensor datasets and external covariates. These datasets, though, are often incomplete. An efficient nonconvex alternating updating algorithm proposed by J. Zhou et al. in the paper "Partially Observed Dynamic Tensor Response Regression" provides a novel approach. The algorithm handles the problem of unobserved entries by solving an optimization problem of a loss function under the low-rankness, sparsity, and fusion constraints. This analysis aims to understand in detail the proposed algorithms and their theoretical proofs with, potentially, dropping some of the assumptions …


Uniformity Test Based On The Empirical Bernstein Distribution, Ran Sun 2023 University of Windsor

Major Papers

In this paper, we first review the origin of the Bernstein polynomial and its various applications. Then we review the importance of goodness-of-fit tests, especially the uniformity test, and examine many of the test statistics proposed so far. After that, we suggest two new statistics for testing uniformity. These two statistics are based on the Kolmogorov-Smirnov and Cramér-von Mises test types, respectively. We also embed the Bernstein polynomial into those test types to take advantage of its strong approximation performance. Finally, we run a Monte Carlo simulation to compare the performance of our statistics to those …


Finite Mixtures Of Mean-Parameterized Conway-Maxwell-Poisson Models, Dongying Zhan 2023 University of Kentucky

Theses and Dissertations--Statistics

For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize data over- or under-dispersion. While the classic parameterization of the CMP has been well-studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be composed of subpopulations, each possibly having varying degrees of data dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An …
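
As a rough illustration of the dispersion behavior mentioned above, the following Python sketch evaluates the classic (rate-parameterized) CMP probability mass function, where P(X = x) is proportional to lam^x / (x!)^nu. It is not the mean-parameterized version or the finite mixture proposed in the thesis. The function names, the truncation point max_terms, and the example parameter values are assumptions made for illustration only.

```python
import math


def cmp_pmf_table(lam, nu, max_terms=200):
    """Probabilities P(X = 0), ..., P(X = max_terms - 1) for the classic CMP.

    The infinite normalizing sum is truncated at max_terms (an assumption
    that is adequate only when the tail mass beyond that point is negligible).
    """
    # Log of the unnormalized weight lam**k / (k!)**nu, to avoid overflow.
    log_w = [k * math.log(lam) - nu * math.lgamma(k + 1) for k in range(max_terms)]
    shift = max(log_w)
    weights = [math.exp(t - shift) for t in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    # nu = 1 recovers the Poisson; nu < 1 gives over-dispersion,
    # nu > 1 gives under-dispersion relative to the Poisson.
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(cmp_pmf_table(3.0, nu))
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

Comparing the printed mean and variance for each value of nu shows why the classic rate parameter lam is not itself the mean except in the Poisson case, which is the drawback the mean-parameterized form addresses.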


Classification Of Adult Income Using Decision Tree, Roland Fiagbe 2023 University of Central Florida

Data Science and Data Mining

The decision tree is a commonly used data mining methodology for performing classification tasks. It is a tree-based supervised machine learning algorithm that classifies or makes predictions by following a path determined by how previous questions are answered. Generally, the decision tree algorithm categorizes data into branch-like segments that develop into a tree that contains a root, nodes, and leaves. This project seeks to explore the decision tree methodology and apply it to the Adult Income dataset from the UCI Machine Learning Repository, to determine whether a person makes over 50K per year and determine the necessary factors that improve …


Bayesian Structural Time Series Methods For Modeling Cattle Body Temperature In Heat-Stressed Animals, Lacey Quandt 2023 Murray State University

Murray State Theses and Dissertations

Climate change has had devastating effects globally, most commonly discussed in connection with natural disasters and rising temperatures. Notably, climate concern is turning towards agriculture and livestock. With rising temperatures, the prolonged heat stress put on animals, specifically cattle, is becoming more apparent. Heat stress has been linked to a reduction in cattle growing and fattening, feed intake, productivity, reproduction, and fertility; increased heart rates and respiration; changes in behavior; and mortality in severe cases. There are abatement strategies put in place to lower heat stress in cattle, such as improvements in shading and cooling, nutritional management, and …


The Impact Of Subjective Risk Analysis On Real Estate Prices In The Nisqually Region Following The 2001 Nisqually Earthquake, Ryan Espedal 2023 Central Washington University

All Master's Theses

Earthquakes are an environmental hazard that poses great risks to communities almost every day. With earthquakes, the main cause of concern is physical destruction of property; however, there are also psychological effects that are researched and discussed much less. In 2001, the Nisqually area of western Washington experienced a substantial earthquake that produced minimal physical damage but caused a significant decrease in real estate prices. Studying single-family homes from 1986-2012, this research utilizes hedonic property models to measure the change in consumers’ subjective risk calculations with reference to real estate purchases after the Nisqually earthquake, measure the relationship between earthquake …


Forecasting Remission Time Of A Treatment Method For Leukemia As An Application To Statistical Inference Approach, Ahmed Galal Atia, Mahmoud Mansour, Rashad Mohamed El-Sagheer, B. S. El-Desouky 2023 Mansoura University

Basic Science Engineering

In this paper, the Weibull-Linear Exponential distribution (WLED) is investigated to determine whether it fits a real clinical data set well. These data represent the duration of remission achieved by a certain drug used in the treatment of leukemia for a group of patients. The statistical inference approach is used to estimate the parameters of the WLED from the fitted data. The estimated parameters are used to evaluate the survival and hazard functions and hence to assess the treatment method by forecasting the remission times of patients. A two-sample prediction approach has been applied to …


The Birds And The Trees: Quantifying The Drivers Of Whitebark Pine Decline And Clark's Nutcracker Habitat Use In Glacier National Park, Vladimir Kovalenko 2023 The University Of Montana

Graduate Student Theses, Dissertations, & Professional Papers

Whitebark pine (Pinus albicaulis), recently listed as threatened under the Endangered Species Act, is in steep decline in Glacier National Park, Montana, USA due to the non-native pathogen Cronartium ribicola, causal agent of the fatal disease white pine blister rust. A sample of the park’s population suggests that approximately 70 percent of whitebark pines have died, while 65 percent of the remaining trees are infected. Using landscape and climate variables, we show how geographic location, elevation, aspect, solar radiation, relative humidity, and snowpack interact with tree diameter to affect mortality, disease incidence, cone production, and regeneration. We also examine how …


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu 2023 Claremont Colleges

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, this paper finds that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Statistical Methods For Gene Selection And Genetic Association Studies, Xuewei Cao 2023 Michigan Technological University

Dissertations, Master's Theses and Master's Reports

This dissertation includes five chapters. A brief description of each chapter follows.

In Chapter One, we propose a signed bipartite genotype and phenotype network (GPN) by linking phenotypes and genotypes based on their statistical associations. It provides new insight into the genetic architecture among multiple correlated phenotypes and into where phenotypes might be related at a higher level of cellular and organismal organization. We show that multiple-phenotype association studies that consider the proposed network are improved by incorporating the genetic information into the phenotype clustering.

In Chapter Two, we first illustrate the proposed GPN …


Utilizing Markov Chains To Estimate Allele Progression Through Generations, Ronit Gandhi 2023 University of Nebraska - Lincoln

Honors Theses

All populations display patterns in allele frequencies over time. Some alleles cease to exist, while some grow to become the norm. These frequencies can shift or stay constant based on the conditions the population lives in. If in Hardy-Weinberg equilibrium, the allele frequencies stay constant. Most populations, however, have bias from environmental factors, sexual preferences, other organisms, etc. We propose a stochastic Markov chain model to study allele progression across generations. In such a model, the allele frequencies in the next generation depend only on the frequencies in the current one.

We use this model to track a recessive allele …
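
The Markov property described above (next-generation allele frequencies depend only on the current generation) can be sketched with the textbook Wright-Fisher model. This is an assumption chosen purely for illustration: the abstract does not state which transition probabilities the thesis uses, and the function names and the population size below are hypothetical.

```python
import math


def wright_fisher_matrix(n_copies):
    """Transition matrix P[i][j] = P(j copies next generation | i copies now).

    Assumes neutral Wright-Fisher sampling: each of the n_copies gene copies
    in the next generation is drawn independently with probability i / n_copies.
    """
    matrix = []
    for i in range(n_copies + 1):
        p = i / n_copies
        row = [
            math.comb(n_copies, j) * p**j * (1 - p) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def step(dist, matrix):
    """Advance a probability distribution over allele counts by one generation."""
    size = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


if __name__ == "__main__":
    n = 10  # hypothetical number of gene copies in the population
    P = wright_fisher_matrix(n)
    dist = [0.0] * (n + 1)
    dist[4] = 1.0  # start with 4 copies of the allele
    for _ in range(50):
        dist = step(dist, P)
    print("P(allele lost) after 50 generations:", round(dist[0], 3))
    print("P(allele fixed) after 50 generations:", round(dist[n], 3))
```

In this neutral sketch the states with zero copies and with all copies are absorbing, which mirrors the statement that some alleles cease to exist while others become the norm. Selection or other biases would change the sampling probability p used in each row.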


Potential Alzheimer's Disease Plasma Biomarkers, Taylor Estepp 2023 University of Kentucky

Theses and Dissertations--Epidemiology and Biostatistics

In this series of studies, we examined the potential of a variety of blood-based plasma biomarkers for the identification of Alzheimer's disease (AD) progression and cognitive decline. With the end goal of studying these biomarkers via mixture modeling, we began with a literature review of the methodology. An examination of the biomarkers with demographics and other health factors found evidence of minimal risk of confounding along the causal pathway from biomarkers to cognitive performance. Further study examined the usefulness of linear combinations of biomarkers, achieved via partial least squares (PLS) analysis, as predictors of various cognitive assessment scores and clinical …


Network Intrusion Detection Using Deep Reinforcement Learning, Hamed T. Sanusi 2023 Georgia Southern University

Electronic Theses and Dissertations

This thesis delves into cybersecurity by applying Deep Reinforcement Learning (DRL) to network intrusion detection. One advantage of DRL is the ability to adapt to changing network conditions and evolving attack methods, making it a promising solution for addressing the challenges involved in intrusion detection. The thesis will also discuss the obstacles and benefits of using classification methods for network intrusion detection and the need for high-quality training data. To train and test our proposed method, the NSL-KDD dataset was used and then adjusted by converting it from a multi-class classification problem to a binary one, achieved by joining all attacks into one class. …
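
The label adjustment described above (joining all attack types into a single class) can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's preprocessing code: the function name, the label strings, and the choice of 0 for normal traffic and 1 for attacks are assumptions.

```python
def to_binary_labels(labels, normal_label="normal"):
    """Map every non-normal label to 1 (attack) and the normal label to 0."""
    return [0 if label == normal_label else 1 for label in labels]


if __name__ == "__main__":
    # Hypothetical multi-class labels, one per network connection record.
    multi_class = ["normal", "neptune", "smurf", "normal", "satan"]
    print(to_binary_labels(multi_class))
```

Collapsing the classes this way trades away information about the specific attack type in exchange for a simpler detection target.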


The Influence Of Urban Forms And Street Infrastructure On Pedestrian-Motorist Collisions, Taylor J. Foreman 2023 Georgia Southern University

Electronic Theses and Dissertations

Unwalkable cities are afflicted by serious issues such as increasing rates of pedestrian traffic accidents, public health concerns, and the denied right to have an accessible city. This study examines how different types of urban forms and street infrastructure contribute to the prevalence of traffic accidents in two major metropolitan cities in the United States: Atlanta, Georgia, and Boston, Massachusetts. This study utilizes geospatial analysis through the Average Nearest Neighbor and Optimized Hot Spot Analysis tools to determine the spatial distribution of traffic accidents throughout both cities. Additionally, statistical tests were conducted to explore the relationships between the number of …


Aircraft Damage Classification By Using Machine Learning Methods, Tüzün Tolga İnan 2023 Bahcesehir University

International Journal of Aviation, Aeronautics, and Aerospace

Safety is the most significant factor affecting incidents (non-fatal) and accidents (fatal) on scheduled flights in civil aviation history. In the history of scheduled flights, the total number of incidents and accidents up to 2022 is 1988. This study considers the 677 of them that have occurred since 11 September 2001. The purpose of this study is to reveal the factors that can classify the type of aircraft damage (none, minor, or substantial) in incidents and accidents. ML algorithms with different configurations are applied for the classification process. RFE and PCA are used to find the most …


Stochastic Optimization To Reduce Aircraft Taxi-In Time At IGIA, New Delhi, Rajib Das, Saileswar Ghosh, Rajendra Desai, Pijus Kanti Bhuin, Stuti Agarwal 2023 Brainware University, Kolkata

International Journal of Aviation, Aeronautics, and Aerospace

Because flight arrival times are uncertain, pre-scheduled allocation of runways and stands, followed by first-come-first-served treatment, results in a sub-optimal allocation of runways and stands. This is the prime reason for the unusual delays in taxi-in times at IGIA, New Delhi.

We simulated the arrival pattern of aircraft and utilized stochastic optimization to arrive at the best runway-stands allocation for a day. Optimization is done using a GRG Non-Linear algorithm in the Frontline Systems Analytic Solver platform. We applied this model to eight representative scenarios of two different days. Our results show that without …

