Open Access. Powered by Scholars. Published by Universities.^{®}
Other Statistics and Probability Commons^{™}
Articles 1-30 of 230
Full-Text Articles in Other Statistics and Probability
Rfviz: An Interactive Visualization Package For Random Forests In R, Christopher Beckett
All Graduate Plan B and other Reports
Random forests are very popular tools for predictive analysis and data science. They work for both classification (where there is a categorical response variable) and regression (where the response is continuous). Random forests provide proximities, and both local and global measures of variable importance. However, these quantities require special tools to be effectively used to interpret the forest. Rfviz is a sophisticated interactive visualization package and toolkit in R, specially designed for interpreting the results of a random forest in a user-friendly way. Rfviz uses a recently developed R package (loon) from the Comprehensive R Archive Network (CRAN) to create ...
Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri
Publications and Research
Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years ...
Yelp's Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels
SMU Data Science Review
In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...
Secondary Data Analysis Project, Jonathan M. Gallimore
SF 420 PR - Gallimore - Fall 2018
This activity is designed to give students an opportunity to apply what they have learned in statistics to a real dataset.
This activity will help students apply what they have learned in statistics to real world data and answer their own research questions. Students will also practice reporting their results in a paper using APA format.
Quantitative Jeopardy Feud, Jonathan M. Gallimore
MSF 600 PR - Gallimore - Fall 2018
This activity, Quantitative Jeopardy Feud, is a method for using a game as a final exam.
On N/P-Asymptotic Distribution Of Vector Of Weighted Traces Of Powers Of Wishart Matrices, Jolanta Maria Pielaszkiewicz, Dietrich Von Rosen, Martin Singull
Electronic Journal of Linear Algebra
The joint distribution of standardized traces of $\frac{1}{n}XX'$ and of $\Big(\frac{1}{n}XX'\Big)^2$, where the matrix $X:p\times n$ follows a matrix normal distribution, is proved to be asymptotically multivariate normal under the condition $\frac{{n}}{p}\overset{n,p\rightarrow\infty}{\rightarrow}c>0$. The proof relies on calculations of asymptotic moments and cumulants obtained using a recursive formula derived in Pielaszkiewicz et al. (2015). The covariance matrix of the underlying vector is explicitly given as a function of $n$ and $p$.
Goalie Analytics: Statistical Evaluation Of Context-Specific Goalie Performance Measures In The National Hockey League, Marc Naples, Logan Gage, Amy Nussbaum
SMU Data Science Review
In this paper, we attempt to improve upon the classic formulation of save percentage in the NHL by controlling for the context of the shots and by using measures other than save percentage. In particular, we find save percentage to be both a weakly repeatable skill and predictor of future performance, and we seek other goalie performance calculations that are more robust. To do so, we use three primary tests to test intraseason consistency, intraseason predictability, and interseason consistency, and extend the analysis to disentangle team effects on goalie statistics. We find that there are multiple ways to improve upon classic save ...
Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen
Electronic Theses and Dissertations
The bootstrap procedure is widely used in nonparametric statistics to generate an empirical sampling distribution from a given sample data set for a statistic of interest. Generally, the results are good for location parameters such as population mean, median, and even for estimating a population correlation. However, the results for a population variance, which is a spread parameter, are not as good due to the resampling nature of the bootstrap method. Bootstrap samples are constructed using sampling with replacement; consequently, groups of observations with zero variance manifest in these samples. As a result, a bootstrap variance estimator will carry a ...
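The resampling step described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `bootstrap_variances`, its parameters, and the toy sample are assumptions for illustration. It draws resamples with replacement and records the variance of each. One well-known effect it exposes is that, averaged over resamples of size n, the (n-1)-divisor variance equals (n-1)/n times the variance of the original sample.

```python
import random
import statistics

def bootstrap_variances(sample, n_resamples=2000, seed=0):
    """Return the sample variance of each bootstrap resample.

    Hypothetical helper for illustration: resamples are drawn with
    replacement and have the same size as the original sample.
    """
    rng = random.Random(seed)
    n = len(sample)
    variances = []
    for _ in range(n_resamples):
        resample = [rng.choice(sample) for _ in range(n)]
        # statistics.variance uses the (n - 1) divisor; a resample made of
        # identical values simply yields a variance of zero.
        variances.append(statistics.variance(resample))
    return variances

if __name__ == "__main__":
    toy_sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative data only
    boot = bootstrap_variances(toy_sample)
    print("sample variance:", statistics.variance(toy_sample))
    print("mean bootstrap variance:", sum(boot) / len(boot))
```

With such a small sample, some resamples contain only repeated values, which is the zero-variance situation the abstract mentions.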
Standard And Anomalous Wave Transport Inside Random Media, Xujun Ma
All Dissertations, Theses, and Capstone Projects
This thesis is a study of wave transport inside random media using random matrix theory. Anderson localization plays a central role in wave transport in random media. As a consequence of destructive interference in multiple scattering, the wave function decays exponentially inside random systems. Anderson localization is a wave effect that applies to both classical waves and quantum waves. Random matrix theory has been successfully applied to study the statistical properties of transport and localization of waves. Particularly, the solution of the Dorokhov-Mello-Pereyra-Kumar (DMPK) equation gives the distribution of transmission.
For wave transport in standard one-dimensional random systems in ...
Initial Evidence Of Construct Validity Of Data From A Self-Assessment Instrument Of Technological Pedagogical Content Knowledge (TPACK) In 2-Year Public College Faculty In Texas, Kristin C. Scott
Human Resource Development Theses and Dissertations
Technological pedagogical content knowledge (TPACK) has been studied in K-12 faculty in the U.S. and around the world using survey methodology. Very few studies of TPACK in postsecondary faculty have been conducted and no peer-reviewed studies in U.S. postsecondary faculty have been published to date. The present study is the first study of the reliability and validity of data from a TPACK survey to be conducted with a large sample of U.S. postsecondary faculty. The professoriate of 2-year public college faculty in Texas will help their institutions meet the goals of the state’s higher education strategic plan, 60x30TX. In ...
Waste Management By Waste: Removal Of Acid Dyes From Wastewaters Of Textile Coloration Using Fish Scales, S M Fijul Kabir
LSU Master's Theses
Removal of hazardous acid dyes from wool industry wastewaters by an economical process using low-cost biosorbents is a pressing need, since these dyes cause skin and respiratory diseases and disrupt other environmental components. Fish scales (FS), a byproduct of the fish industry and a type of solid waste, are usually discarded carelessly, resulting in pungent odor and environmental burden. In this research, the FS of black drum (Pogonias cromis) were used for the removal of acid dyes (acid red 1 (AR1), acid blue 45 (AB45) and acid yellow 127 (AY126)) from wool industry wastewaters by an absorption process with a view to valorizing fish ...
On Some Ridge Regression Estimators For Logistic Regression Models, Ulyana P. Williams
FIU Electronic Theses and Dissertations
The purpose of this research is to investigate the performance of some ridge regression estimators for the logistic regression model in the presence of moderate to high correlation among the explanatory variables. As performance criteria, we use the mean square error (MSE), the mean absolute percentage error (MAPE), the magnitude of bias, and the percentage of times the ridge regression estimator produces a higher MSE than the maximum likelihood estimator. A Monte Carlo simulation study has been executed to compare the performance of the ridge regression estimators under different experimental conditions. The degree of correlation, sample size, number of ...
On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar
FIU Electronic Theses and Dissertations
Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo ...
Advances In Semi-Nonparametric Density Estimation And Shrinkage Regression, Hossein Zareamoghaddam
Electronic Thesis and Dissertation Repository
This thesis advocates the use of shrinkage and penalty techniques for estimating the parameters of a regression model that comprises both parametric and nonparametric components and develops semi-nonparametric density estimation methodologies that are applicable in a regression context.
First, a moment-based approach whereby a univariate or bivariate density function is approximated by means of a suitable initial density function that is adjusted by a linear combination of orthogonal polynomials is introduced. Such adjustments are shown to be mathematically equivalent to making use of standard polynomials in one or two variables. Once extended to apply to density estimation, in which case ...
Building A Better Risk Prevention Model, Steven Hornyak
National Youth-At-Risk Conference Savannah
This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying those students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.
Predicting The Next US President By Simulating The Electoral College, Boyan Kostadinov
Publications and Research
We develop a simulation model for predicting the outcome of the US Presidential election based on simulating the distribution of the Electoral College. The simulation model has two parts: (a) estimating the probabilities for a given candidate to win each state and DC, based on state polls, and (b) estimating the probability that a given candidate will win at least 270 electoral votes, and thus win the White House. All simulations are coded using the high-level, open-source programming language R. One of the goals of this paper is to promote computational thinking in any STEM field by illustrating how probabilistic ...
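Part (b) of the model can be sketched as a simple Monte Carlo loop. The following Python sketch is not the paper's code (which is written in R): the function `win_probability`, the toy input, and the assumption that state outcomes are independent are all illustrative. Only the 270-vote threshold comes from the abstract.

```python
import random

def win_probability(states, threshold=270, n_sims=10000, seed=1):
    """Estimate the chance of reaching `threshold` electoral votes.

    `states` is a list of (electoral_votes, p_win) pairs. Each state is
    simulated as an independent Bernoulli trial, which is a simplifying
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Sum the votes of every state the candidate wins in this run.
        total = sum(votes for votes, p in states if rng.random() < p)
        if total >= threshold:
            wins += 1
    return wins / n_sims

if __name__ == "__main__":
    # Hypothetical three-region map with made-up votes and probabilities.
    toy_map = [(200, 0.9), (170, 0.5), (168, 0.2)]
    print(win_probability(toy_map))
```

In practice the win probabilities in part (a) would be estimated from state polls, as the abstract describes, rather than supplied by hand.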
Effect Of Neuromodulation Of Short-Term Plasticity On Information Processing In Hippocampal Interneuron Synapses, Elham Bayat Mokhtari
Graduate Student Theses, Dissertations, & Professional Papers
Neurons convey information about the complex dynamic environment in the form of signals. Computational neuroscience provides a theoretical foundation toward enhancing our understanding of the nervous system. The aim of this dissertation is to present techniques to study the brain and how it processes information in particular neurons in the hippocampus.
We begin with a brief review of the history of neuroscience and the biological background of basic neurons. To appreciate the importance of information theory, familiarity with the information theoretic basics is required; these basics are presented in Chapter 2. In Chapter 3, we use information theory to estimate the amount of ...
Queues With Server Utilization Of One, Robert Aidoo
Major Papers
In most queueing systems of type GI/G/1, the stability condition requires that the server utilization be strictly less than 1. The standard exception is a D/D/1 system in which stability still holds for server utilization equal to 1. This paper presents other cases when server utilization can equal 1, and discusses their characteristics.
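The D/D/1 exception mentioned here can be checked numerically with the Lindley recursion for successive waiting times, W[k+1] = max(0, W[k] + S[k] - A[k]), where S[k] is the service time of customer k and A[k] is the gap until the next arrival. The Python sketch below is illustrative only; the function name `waiting_times` and the inputs are assumptions, not material from the paper.

```python
def waiting_times(gaps, services):
    """Waiting times in a single-server FIFO queue via the Lindley recursion.

    gaps[k] is the time between the arrivals of customers k and k + 1,
    services[k] is the service time of customer k. The first customer
    is assumed to arrive at an empty system.
    """
    waits = [0.0]
    for gap, service in zip(gaps, services):
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits

if __name__ == "__main__":
    # D/D/1 with utilization exactly 1: service time equals the arrival gap.
    print(max(waiting_times([1.0] * 50, [1.0] * 50)))
    # Service longer than the gap: the backlog grows with every customer.
    print(waiting_times([1.0] * 3, [1.5] * 3))
```

When service and interarrival times are both deterministic and equal, every customer arrives just as the previous one leaves, so no waiting time accumulates even though the server is never idle.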
Multiclass Classification Using Support Vector Machines, Duleep Prasanna W. Rathgamage Don
Electronic Theses and Dissertations
In this thesis, we discuss different SVM methods for multiclass classification and introduce the Divide and Conquer Support Vector Machine (DCSVM) algorithm which relies on data sparsity in high dimensional space and performs a smart partitioning of the whole training data set into disjoint subsets that are easily separable. A single prediction performed between two partitions eliminates one or more classes in a single partition, leaving only a reduced number of candidate classes for subsequent steps. The algorithm continues recursively, reducing the number of classes at each step until a final binary decision is made between the last two classes ...
Sequential Probing With A Random Start, Joshua Miller
HMC Senior Theses
Processing user requests quickly requires not only fast servers, but also demands methods to quickly locate idle servers to process those requests. Methods of finding idle servers are analogous to open addressing in hash tables, but with the key difference that servers may return to an idle state after having been busy rather than staying busy. Probing sequences for open addressing are well-studied, but algorithms for locating idle servers are less understood. We investigate sequential probing with a random start as a method for finding idle servers, especially in cases of heavy traffic. We present a procedure for finding the ...
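One plausible reading of sequential probing with a random start is sketched below in Python. This is an interpretation for illustration, not the thesis's procedure: the function `find_idle_server`, the boolean `busy` list, and the wrap-around order are all assumptions.

```python
import random

def find_idle_server(busy, rng=random):
    """Probe servers one by one from a random starting index.

    `busy` is a list of booleans (True means the server is busy).
    Returns (index, probes) for the first idle server found, or
    (None, probes) if every server was probed and all were busy.
    """
    n = len(busy)
    if n == 0:
        return None, 0
    start = rng.randrange(n)
    for probes in range(1, n + 1):
        index = (start + probes - 1) % n  # wrap around the server list
        if not busy[index]:
            return index, probes
    return None, n

if __name__ == "__main__":
    servers = [True, True, False, True, False, True]  # made-up state
    print(find_idle_server(servers))
```

The number of probes returned is the natural cost measure here, and it grows as more servers are busy, which is the heavy-traffic regime the abstract highlights.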
Old English Character Recognition Using Neural Networks, Sattajit Sutradhar
Electronic Theses and Dissertations
Character recognition has been capturing the interest of researchers since the beginning of the twentieth century. While the Optical Character Recognition for printed material is very robust and widespread nowadays, the recognition of handwritten materials lags behind. In our digital era more and more historical, handwritten documents are digitized and made available to the general public. However, these digital copies of handwritten materials lack the automatic content recognition feature of their printed materials counterparts. We are proposing a practical, accurate, and computationally efficient method for Old English character recognition from manuscript images. Our method relies on a modern machine learning ...
Existing And Potential Statistical And Computational Approaches For The Analysis Of 3D CT Images Of Plant Roots, Zheng Xu, Camilo Valdes, Jennifer Clarke
Faculty Publications, Department of Statistics
Scanning technologies based on X-ray Computed Tomography (CT) have been widely used in many scientific fields including medicine, nanosciences and materials research. Considerable progress in recent years has been made in agronomic and plant science research thanks to X-ray CT technology. X-ray CT image-based phenotyping methods enable high-throughput and non-destructive measuring and inference of root systems, which makes downstream studies of complex mechanisms of plants during growth feasible. An impressive amount of plant CT scanning data has been collected, but how to analyze these data efficiently and accurately remains a challenge. We review statistical and computational approaches that have been ...
Data Analysis With Small Samples And Non-Normal Data: Nonparametrics And Other Strategies, Carl Siebert, Darcy C. Siebert
Carl Siebert
No abstract provided.
Making Models With Bayes, Pilar Olid
Electronic Theses, Projects, and Dissertations
Bayesian statistics is an important approach to modern statistical analyses. It allows us to use our prior knowledge of the unknown parameters to construct a model for our data set. The foundation of Bayesian analysis is Bayes' Rule, which in its proportional form indicates that the posterior is proportional to the prior times the likelihood. We will demonstrate how we can apply Bayesian statistical techniques to fit a linear regression model and a hierarchical linear regression model to a data set. We will show how to apply different distributions to Bayesian analyses and how the use of a prior affects ...
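The proportional form of Bayes' Rule stated in this abstract, posterior ∝ prior × likelihood, can be illustrated with a grid approximation. The Python sketch below deliberately swaps in a much simpler model than the regression models of the thesis: a single binomial proportion. The function `grid_posterior`, the flat prior, and the data (7 successes in 10 trials) are hypothetical.

```python
def grid_posterior(successes, trials, prior, grid_size=1001):
    """Grid approximation of the posterior for a binomial proportion.

    Evaluates prior(theta) * likelihood(theta) on an evenly spaced grid
    over [0, 1] and normalizes so the weights sum to one.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalized = [
        prior(t) * t**successes * (1 - t) ** (trials - successes)
        for t in grid
    ]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]

if __name__ == "__main__":
    # Flat prior and made-up data: 7 successes in 10 trials.
    grid, posterior = grid_posterior(7, 10, prior=lambda t: 1.0)
    mean = sum(t * p for t, p in zip(grid, posterior))
    print("posterior mean:", mean)
```

Passing a different `prior` function, for example one concentrated near 0.5, shifts the posterior toward that value, which is the kind of prior sensitivity the abstract goes on to discuss.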
Open Source Artificial Intelligence In A Biological/Ecological Context, Trevor Grant
Annual Symposium on Biomathematics and Ecology: Education and Research
No abstract provided.
Discrete Stochastic Modeling For First-Year Biology Students, Dmitry Kondrashov
Annual Symposium on Biomathematics and Ecology: Education and Research
No abstract provided.
Investigating The Student Enrollment Decision At WKU, Alec Brown
Honors College Capstone Experience/Thesis Projects
The purpose of this research is to investigate the relationships between the enrollment decision of first-time, first-year students admitted to Western Kentucky University and the amount of financial aid awarded, as well as demographic information. The Division of Enrollment Management provided a SAS dataset containing various information about all WKU students admitted in 2013, 2014, and 2015. Additionally, information about the 2016 class of admitted students was provided. The data has been analyzed in SAS Enterprise Miner. We performed analysis using decision tree modeling and logistic regression modeling. Results of these two procedures indicated the importance of credit hours earned ...
Imputation For Random Forests, Joshua Young
All Graduate Plan B and other Reports
This project introduces two new methods for imputation of missing data in random forests. The new methods are compared against other frequently used imputation methods, including those used in the randomForest package in R. To test the effectiveness of these methods, missing data are imputed into datasets that contain two missing data mechanisms including missing at random and missing completely at random. After imputation, random forests are run on the data and accuracies for the predictions are obtained. Speed is an important aspect in computing; the speeds for all the tested methods are also compared.
One of the new methods ...
The Importance Of Inhaler Technique In Measuring And Calculating Inhaler Adherence, And Its Clinical Outcomes, Imran Sulaiman
PhD theses
Depending on the population studied, cross-sectional observational studies suggest that between 14% and 90% of patients do not use their pressurized metered dose inhaler correctly, while 50-60% misuse a dry powder inhaler. This means that unless incorrect technique is accounted for, a significant underestimation of how much medication the person actually obtained may be made.
The aim of this thesis was to objectively determine the frequency and importance of inhaler technique errors and to combine these with inhaler use to provide an accurate method of calculating adherence. I then investigated different patterns of inhaler use, determinants of inhaler use and the ...
A Comparison Of Some Confidence Intervals For Estimating The Kurtosis Parameter, Guensley Jerome
FIU Electronic Theses and Dissertations
Several methods have been proposed to estimate the kurtosis of a distribution. The three common estimators are g_{2}, G_{2} and b_{2}. This thesis addressed the performance of these estimators by comparing them under the same simulation environments and conditions. The performance of these estimators is compared through confidence intervals by determining the average width and probabilities of capturing the kurtosis parameter of a distribution. We considered and compared classical and nonparametric methods in constructing these intervals. The classical method assumes normality to construct the confidence intervals while the nonparametric methods rely on bootstrap techniques. The bootstrap techniques used ...
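The abstract names the three estimators without defining them. The Python sketch below assumes the usual definitions in terms of the sample central moments m2 and m4 (all three measure excess kurtosis); whether the thesis uses exactly these forms is an assumption, and the function name `kurtosis_estimators` is hypothetical.

```python
def kurtosis_estimators(data):
    """Return (g2, G2, b2) under commonly used definitions (assumed here).

    g2 = m4 / m2**2 - 3
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    b2 = m4 / s**4 - 3, with s**2 the (n - 1)-divisor sample variance
    """
    n = len(data)
    if n < 4:
        raise ValueError("G2 needs at least four observations")
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    g2 = m4 / m2**2 - 3
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # s**4 = (n / (n - 1))**2 * m2**2, so m4 / s**4 rescales m4 / m2**2.
    b2 = (1 - 1 / n) ** 2 * (m4 / m2**2) - 3
    return g2, G2, b2

if __name__ == "__main__":
    print(kurtosis_estimators([1.0, 2.0, 3.0, 4.0, 5.0]))
```

For small samples the three values can differ noticeably, which is one reason a comparison of interval width and coverage such as the one in this thesis is informative.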