Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

12,646 Full-Text Articles 19,929 Authors 6,911,751 Downloads 286 Institutions

All Articles in Statistics and Probability

Identifying Undervalued Players In Fantasy Football, Christopher D. Morgan, Caroll Rodriguez, Korey MacVittie, Robert Slater, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we present a model to predict player performance in fantasy football. In particular, identifying high-performance players can prove to be a difficult problem, as there are on occasion players capable of high performance whose past metrics give no indication of this capacity. These "sleepers" are often undervalued, and the acquisition of such players can have a notable impact on a fantasy football team's overall performance. We constructed a regression model that accounts for players' past performance and athletic metrics to predict their future performance. The model we built performs favorably in predicting athlete performance in relation to other …


Seeing And Understanding Data, Beverly Wood, Charlotte Bolch 2019 Embry-Riddle Aeronautical University

Publications

Visual displays of data are commonly used today in media reports online or in print. For example, data visualizations are sometimes used as a marketing tool to convince people to purchase a certain product, or they are displayed in articles or magazines as a way to graphically display data to emphasize a certain point. In general, it is hard to imagine the majority of disciplines in science and mathematics not using data visualizations. However, before standard data visualization techniques were developed (and accepted by the community), mathematicians and scientists very rarely used graphical displays or pictures to represent empirical data.


Machine Learning Predicts Aperiodic Laboratory Earthquakes, Olha Tanyuk, Daniel Davieau, Charles South, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper we find a pattern of aperiodic seismic signals that precede earthquakes at any time in a laboratory earthquake’s cycle using a small window of time. We use a data set that comes from a classic laboratory experiment having several stick-slip displacements (earthquakes), a type of experiment which has been studied as a simulation of seismologic faults for decades. This data exhibits similar behavior to natural earthquakes, so the same approach may work in predicting their timing. Here we show that by applying a random forest machine learning technique to the acoustic signal emitted by a laboratory …


Longitudinal Analysis With Modes Of Operation For AES, Dana Geislinger, Cory Thigpen, Daniel W. Engels 2019 Southern Methodist University

SMU Data Science Review

In this paper, we present an empirical evaluation of the randomness of the ciphertext blocks generated by the Advanced Encryption Standard (AES) cipher in Counter (CTR) mode and in Cipher Block Chaining (CBC) mode. Vulnerabilities have been found in the AES cipher that may lead to a reduction in the randomness of the generated ciphertext blocks that can result in a practical attack on the cipher. We evaluate the randomness of the AES ciphertext using the standard key length and NIST randomness tests. We evaluate the randomness through a longitudinal analysis on 200 billion ciphertext blocks using logistic regression and …
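As an illustration of the kind of check a NIST randomness test performs, here is a minimal sketch of the frequency (monobit) test. It is not the authors' evaluation pipeline; the function names and the 10-bit example sequence are assumptions made for this sketch.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test p-value for a sequence of 0/1 bits.

    A small p-value suggests the proportions of ones and zeros differ
    more than expected for a random sequence.
    """
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1, then sum the mapped values.
    s_n = sum(2 * b - 1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def bytes_to_bits(data):
    """Expand bytes (for example, a ciphertext block) into bits, MSB first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


# Hypothetical 10-bit sequence, far too short for a real test.
example_bits = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
p_value = monobit_p_value(example_bits)  # about 0.527
```

In practice the test would be run on long bit streams obtained by passing ciphertext blocks through `bytes_to_bits`, and a sequence is usually treated as non-random when the p-value falls below a chosen significance level such as 0.01.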


Texture-Based Deep Neural Network For Histopathology Cancer Whole Slide Image (WSI) Classification, Nelson Zange Tsaku 2019 Kennesaw State University

Master of Science in Computer Science Theses

Automatic histopathological Whole Slide Image (WSI) analysis for cancer classification has been highlighted along with the advancements in microscopic imaging techniques. However, manual examination and diagnosis with WSIs is time-consuming and tiresome. Recently, deep convolutional neural networks have succeeded in histopathological image analysis. In this paper, we propose a novel cancer texture-based deep neural network (CAT-Net) that learns scalable texture features from histopathological WSIs. The innovation of CAT-Net is twofold: (1) capturing invariant spatial patterns by dilated convolutional layers and (2) reducing model complexity while improving performance. Moreover, CAT-Net can provide discriminative texture patterns formed on cancerous regions of histopathological …


Classification With Measurement Error In Covariates Or Response, With Application To Prostate Cancer Imaging Study, Kexin Luo 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

The research is motivated by the prostate cancer imaging study conducted at the University of Western Ontario to classify cancer status using multiple in-vivo images. The prostate cancer histological image and the in-vivo images are subject to misalignment in the co-registration procedure, which can be viewed as measurement error in covariates or response. We investigate methods to correct this problem.

The first proposed method corrects the predicted class probability when the data has misclassified labels. The correction equation is derived from the relationship between the true response and the error-prone response. The probability for the observed class label is adjusted …


An HDG Method For Dirichlet Boundary Control Of Convection Dominated Diffusion PDEs, Gang Chen, John R. Singler, Yangwen Zhang 2019 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

We first propose a hybridizable discontinuous Galerkin (HDG) method to approximate the solution of a convection dominated Dirichlet boundary control problem without constraints. Dirichlet boundary control problems and convection dominated problems are each very challenging numerically due to solutions with low regularity and sharp layers, respectively. Although there are some numerical analysis works in the literature on diffusion dominated convection diffusion Dirichlet boundary control problems, we are not aware of any existing numerical analysis works for convection dominated boundary control problems. Moreover, the existing numerical analysis techniques for convection dominated PDEs are not directly applicable for the Dirichlet boundary control …


Dietary Inflammatory Index And Non-Communicable Disease Risk: A Narrative Review, Catherine M. Phillips, Ling-Wei Chen, Barbara Heude, Jonathan Y. Bernard, Nicholas C. Harvey, Liesbeth Duijts, Sara M. Mensink-Bout, Kinga Polanska, Giulia Mancano, Matthew Suderman, Nitin Shivappa, James R. Hébert 2019 University of South Carolina

Faculty Publications

There are over 1,000,000 publications on diet and health and over 480,000 references on inflammation in the National Library of Medicine database. In addition, there have now been over 30,000 peer-reviewed articles published on the relationship between diet, inflammation, and health outcomes. Based on this voluminous literature, it is now recognized that low-grade, chronic systemic inflammation is associated with most non-communicable diseases (NCDs), including diabetes, obesity, cardiovascular disease, cancers, respiratory and musculoskeletal disorders, as well as impaired neurodevelopment and adverse mental health outcomes. Dietary components modulate inflammatory status. In recent years, the Dietary Inflammatory Index (DII®), a literature-derived …


Increased Dietary Inflammatory Index Is Associated With Schizophrenia: Results Of A Case–Control Study From Bahrain, Haitham Jahrami, Moez Al-Islam Faris, Hadeel Ghazzawi, Zahra Saif, Layla Habib, Nitin Shivappa, James R. Hébert 2019 University of South Carolina

Faculty Publications

Background: Several studies have indicated that chronic low-grade inflammation is associated with the development of schizophrenia. Given the role of diet in modulating inflammatory markers, excessive caloric intake and increased consumption of pro-inflammatory components such as calorie-dense, nutrient-sparse foods may contribute toward increased rates of schizophrenia. This study aimed to examine the association between dietary inflammation, as measured by the dietary inflammatory index (DII®), and schizophrenia. Methods: A total of 120 cases attending the out-patient department in the Psychiatric Hospital/Bahrain were recruited, along with 120 healthy controls matched on age and sex. The energy-adjusted DII (E-DII) was computed …


Identifying Risk Factors Related To Premature Birth Through Binary Logistic And Proportional Odds Ordinal Logistic Regression, Clayton Elwood 2019 Duquesne University

Electronic Theses and Dissertations

Premature birth has been identified as the single greatest cause of death worldwide in children under the age of five. This thesis will implement binary logistic regression and proportional odds ordinal logistic regression to predict different levels of premature birth and identify associated risk factors. The models will be built from the Centers for Disease Control and Prevention's 2014 Vital Statistics Natality Birth Data containing nearly 4 million live births within the United States. Odds ratios and confidence intervals on risk factors were produced utilizing binary logistic regression.
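To make the odds ratios and confidence intervals mentioned above concrete, here is a minimal sketch for a single binary risk factor. For one binary predictor, the exponentiated logistic regression coefficient equals the odds ratio of the 2x2 table, so the table form is used here. The counts are hypothetical and are not taken from the natality data; the function name is an assumption of this sketch.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 70, 15, 85)
```

An interval that excludes 1 is commonly read as evidence of an association between the risk factor and the outcome. A model with several risk factors would need a fitted multivariable logistic regression rather than a single table.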


GARCH Modeling Of Value At Risk And Expected Shortfall Using Bayesian Model Averaging, Ismail Kheir 2019 CUNY Hunter College

Theses and Dissertations

This thesis conducts Value at Risk (VaR) and Expected Shortfall (ES) estimation using GARCH modeling and Bayesian Model Averaging (BMA). BMA considers multiple models weighted by some information criterion. Through BMA, this thesis finds that VaR and ES estimates can be improved through enhanced modeling of the data generation process.
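The weighting idea described above can be sketched as follows. The abstract does not say which information criterion is used, so this sketch assumes BIC and the common approximation in which each model's weight is proportional to exp(-ΔBIC/2). All numbers and function names are hypothetical.

```python
import math


def information_criterion_weights(ic_values):
    """Turn information criterion values (lower is better) into weights.

    Uses weight_i proportional to exp(-0.5 * (IC_i - IC_min)), a common
    approximation to posterior model probabilities when IC is BIC.
    """
    best = min(ic_values)
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_estimate(estimates, weights):
    """Weighted average of per-model estimates (for example, VaR)."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical BIC values and one-day VaR estimates from three candidate models.
hypothetical_bic = [1002.0, 1000.0, 1006.0]
hypothetical_var = [-0.031, -0.028, -0.035]

weights = information_criterion_weights(hypothetical_bic)
bma_var = averaged_estimate(hypothetical_var, weights)
```

Subtracting the smallest criterion value before exponentiating leaves the weights unchanged and avoids numerical underflow when the raw values are large.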


Beta Regression Models For Repeated-Measures Data Analysis, Nicholas A. Hein 2019 University of Nebraska Medical Center

Theses & Dissertations

Bounded data often give rise to uncorrectable skew and heteroscedasticity. Bounded data are a relatively frequent occurrence in clinical and research settings. For example, in neuropsychology, most neurocognitive tests are bounded, and subjects are repeatedly measured over time. The statistician needs to choose a model that accounts for the correlated nature of the repeated measures. The Beta distribution is a natural choice for modeling bounded data. Currently, generalized linear mixed models (GLMM) and generalized estimating equations (GEE) are two methods that can be used to model Beta distributed data with repeated measures. However, GLMMs and GEEs have limitations, i.e., GLMMs …


Sharing Of Injection Drug Preparation Equipment Is Associated With HIV Infection: A Cross-Sectional Study, Laura J. Ball, Klajdi Puka, Mark Speechley, Ryan Wong, Brian Hallam, Joshua C. Weiner, Sharon Koivu, Michael S. Silverman 2019 Western University

Epidemiology and Biostatistics Publications

Background: Sharing needles/syringes and sexual transmission are widely appreciated as means of HIV transmission among persons who inject drugs (PWIDs). London, Canada, is experiencing an outbreak of HIV among PWIDs, despite a large needle/syringe distribution program and low rates of needle/syringe sharing.

Objective: To determine whether sharing of injection drug preparation equipment (IDPE) is associated with HIV infection.

Methods: Between August 2016 and June 2017, individuals with a history of injection drug use and residence in London were recruited to complete a comprehensive questionnaire and HIV testing.

Results: A total of 127 participants were recruited; 8 were excluded because of …


Statistical And Machine Learning Methods Evaluated For Incorporating Soil And Weather Into Corn Nitrogen Recommendations, Curtis J. Ransom, Newell R. Kitchen, James J. Camberato, Paul R. Carter, Richard B. Ferguson, Fabián G. Fernández, David W. Franzen, Carrie A. M. Laboski, D. Brenton Myers, Emerson D. Nafziger, John E. Sawyer, John F. Shanahan 2019 University of Missouri

John E. Sawyer

Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine the success of incorporating this information. The objectives of this research were to evaluate statistical and machine learning (ML) algorithms for utilizing soil and weather information for improving corn N recommendation tools. Eight algorithms [stepwise, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLSR), decision tree, and random forest] were evaluated using a dataset …


Advances In Moment-Based Distributional Methodologies, Yishan Zang 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

This thesis comprises various results that rely on the moments of a distribution or the sample moments associated with a set of observations. Since a sample of size n is uniquely specified by its first n moments, it is pertinent to make use of sample moments for modeling, classification or inference purposes. Three density mixtures are approximated by adjusting in various ways an initial density approximation, referred to as a base density, by means of certain moment-based functions, and the accuracy of the resulting density approximants is compared. A similar study is carried out in the context of density estimation. Moreover, it …
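The statement that a sample of size n is specified (up to ordering) by its first n sample moments can be checked numerically. The sketch below is an illustration rather than a method from the thesis: it converts raw moments to power sums, applies Newton's identities to obtain the elementary symmetric polynomials, and recovers the sample as the roots of the resulting polynomial. The function name and the three-point sample are assumptions.

```python
import numpy as np


def sample_from_moments(moments):
    """Recover a sample (sorted) from its first n raw sample moments.

    moments[k-1] is the k-th raw moment, (1/n) * sum(x_i ** k).
    """
    n = len(moments)
    # Power sums p_k = sum(x_i ** k) = n * m_k.
    p = [n * m for m in moments]
    # Newton's identities: k * e_k = sum_{i=1..k} (-1)**(i-1) * e_{k-i} * p_i.
    e = [1.0]
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)
    # prod(x - x_i) = sum_k (-1)**k * e_k * x**(n - k).
    coeffs = [(-1) ** k * e[k] for k in range(n + 1)]
    return np.sort(np.roots(coeffs).real)


# Hypothetical sample, for illustration only.
sample = np.array([1.0, 2.0, 4.0])
moments = [float(np.mean(sample ** k)) for k in range(1, len(sample) + 1)]
recovered = sample_from_moments(moments)
```

Polynomial root-finding becomes ill-conditioned as n grows, so this reconstruction is a demonstration of the uniqueness property for small samples and not a practical estimation procedure.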


Sample Size Calculation Of Clinical Trials With Correlated Outcomes, Dateng Li 2019 Southern Methodist University

Statistical Science Theses and Dissertations

In this thesis, we investigate sample size calculation for three kinds of clinical trials: (1) randomized controlled trials (RCTs) with longitudinal count outcomes; (2) cluster randomized trials (CRTs) with count outcomes; and (3) CRTs with multiple binary co-primary endpoints.


Effective Statistical Energy Function Based Protein Un/Structure Prediction, Avdesh Mishra 2019 University of New Orleans

University of New Orleans Theses and Dissertations

Proteins are an important component of living organisms, composed of one or more polypeptide chains, each containing hundreds or even thousands of amino acids of 20 standard types. The structure of a protein from the sequence determines crucial functions of proteins such as initiating metabolic reactions, DNA replication, cell signaling, and transporting molecules. In the past, proteins were considered to always have a well-defined stable shape (structured proteins); however, it has recently been shown that there exist intrinsically disordered proteins (IDPs), which lack a fixed or ordered 3D structure, have dynamic characteristics, and therefore exist in multiple states. Based on …


Dietary Inflammatory Index And Its Relationship With Cervical Carcinogenesis Risk In Korean Women: A Case-Control Study, Sundara Raj Sreeja, Hyun Yi Lee, Minji Kwon, Nitin Shivappa, James R. Hébert, Mi Kyung Kim 2019 University of South Carolina

Faculty Publications

Several studies have reported that diet’s inflammatory potential is related to chronic diseases such as cancer, but its relationship with cervical cancer risk has not yet been studied. The aim of this study was to investigate the association between the Dietary Inflammatory Index (DII®) and cervical cancer risk among Korean women. This study consisted of 764 cases with cervical intraepithelial neoplasia (CIN) 1, 2, 3, or cervical cancer, and 729 controls from six gynecologic oncology clinics in South Korea. The DII was computed using a validated semiquantitative Food Frequency Questionnaire (FFQ). Odds ratios and 95% CI were calculated using multinomial …


Exploring The Estimability Of Mark-Recapture Models With Individual, Time-Varying Covariates Using The Scaled Logit Link Function, Jiaqi Mu 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Mark-recapture studies are often used to estimate the survival of individuals in a population and identify factors that affect survival in order to understand how the population might be affected by changing conditions. Factors that vary between individuals and over time, like body mass, present a challenge because they can only be observed when an individual is captured. Several models have been proposed to deal with the missing-covariate problem and commonly impose a logit link function which implies that the survival probability varies between 0 and 1. In this thesis I explore the estimability of four possible models when survival …


Split Credibility: A Two-Dimensional Semi-Linear Credibility Model, Jingbing Qiu 2019 The University of Western Ontario

Electronic Thesis and Dissertation Repository

In this thesis, we introduce a two-dimensional semi-linear credibility model, which is an extension of the classical credibility or split credibility models used by practicing actuaries. Our model predicts the future expected losses of a policyholder by considering its historical primary and excess losses. The optimal split point is derived based on the mean squared error criterion. We show analytically when and why splitting a policyholder’s historical losses into primary and excess parts works. In addition, we derive formulas for estimating our model parameters nonparametrically. Finally, we show the application of our model through three examples.

