Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 38
Full-Text Articles in Physical Sciences and Mathematics
Personalized Detection Of Anxiety Provoking News Events Using Semantic Network Analysis, Jacquelyn Cheun PhD, Luay Dajani, Quentin B. Thomas
SMU Data Science Review
In the age of hyper-connectivity, 24/7 news cycles, and instant news alerts via social media, mental health researchers don't have a way to automatically detect news content which is associated with triggering anxiety or depression in mental health patients. Using the Associated Press news wire, a semantic network was built with 1,056 news articles containing over 500,000 connections across multiple topics to provide a personalized algorithm which detects problematic news content for a given reader. We make use of Semantic Network Analysis to surface the relationship between news article text and anxiety in readers who struggle with mental health disorders. …
Economic Design Of Acceptance Sampling Plans For Truncated Life Tests Using Three-Parameter Lindley Distribution, Amer Ibrahim Al-Omari, Enrico Ciavolino, Amjad D. Al-Nasser
Journal of Modern Applied Statistical Methods
A single acceptance sampling plan for the three-parameter Lindley distribution under a truncated life test is developed. For various consumer’s confidence levels, acceptance numbers, and values of the ratio of the experimental time to the specified average lifetime, the minimum sample sizes necessary to assert a certain average lifetime are calculated. The operating characteristic (OC) function values as well as the associated producer’s risks are also provided. A numerical example is presented to illustrate the suggested acceptance sampling plans.
The Estimation Of Missing Values In Rectangular Lattice Designs, Emmanuel Ogochukwu Ossai, Abimibola Victoria Oladugba
Journal of Modern Applied Statistical Methods
Algebraic expressions for estimating missing data when one or more observations are missing in rectangular lattice designs with repetition were derived by minimizing the residual sum of squares. Results showed that the estimated values closely approximated the actual values.
Predicting Wind Turbine Blade Erosion Using Machine Learning, Casey Martinez, Festus Asare Yeboah, Scott Herford, Matt Brzezinski, Viswanath Puttagunta
SMU Data Science Review
Using time-series data and turbine blade inspection assessments, we present a classification model in order to predict remaining turbine blade life in wind turbines. Capturing the kinetic energy of wind requires complex mechanical systems, which require sophisticated maintenance and planning strategies. There are many traditional approaches to monitoring the internal gearbox and generator, but the condition of turbine blades can be difficult to measure and access. Accurate and cost-effective estimates of turbine blade life cycles will drive optimal investments in repairs and improve overall performance. These measures will drive down costs as well as provide cheap and clean electricity …
Choose Your Own Adventure: An Analysis Of Interactive Gamebooks Using Graph Theory, D'Andre Adams, Daniela Beckelhymer, Alison Marr
Journal of Humanistic Mathematics
"BEWARE and WARNING! This book is different from other books. You and YOU ALONE are in charge of what happens in this story." This is the captivating introduction to every book in the interactive novel series, Choose Your Own Adventure (CYOA). Our project uses the mathematical field of graph theory to analyze forty books from the CYOA book series for ages 9-12. We first began by drawing the digraphs of each book. Then we analyzed these digraphs by collecting structural data such as longest path length (i.e. longest story length) and number of vertices with outdegree zero (i.e. number …
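The two structural measures named in this abstract, longest path length and number of vertices with outdegree zero, can be illustrated on a small digraph. The sketch below is only an illustration: the `book` dictionary is a hypothetical six-page gamebook, not data from the paper, and the longest-path routine assumes the digraph has no cycles, which real gamebooks may not satisfy.

```python
from functools import lru_cache

# Hypothetical toy gamebook: page -> pages the reader can turn to next.
# Assumed to be acyclic; this is illustrative data, not from the study.
book = {
    1: [2, 3],
    2: [4, 5],
    3: [5],
    4: [],
    5: [6],
    6: [],
}

def count_endings(graph):
    """Number of vertices with outdegree zero (pages where a story ends)."""
    return sum(1 for targets in graph.values() if not targets)

def longest_path_length(graph, start):
    """Longest path from `start`, counted in edges (acyclic digraphs only)."""
    @lru_cache(maxsize=None)
    def depth(v):
        # A vertex with no successors contributes a path of length 0.
        return max((1 + depth(w) for w in graph[v]), default=0)
    return depth(start)

print(count_endings(book))           # endings in the toy book
print(longest_path_length(book, 1))  # longest story starting on page 1
```

Memoizing `depth` keeps the recursion linear in the number of edges; a digraph with loops would need cycle detection before this approach applies.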
Measure Of Departure From Marginal Average Point-Symmetry For Two-Way Contingency Tables, Kiyotaka Iki, Sadao Tomizawa
Journal of Modern Applied Statistical Methods
For the analysis of two-way contingency tables with ordered categories, Yamamoto, Tahata, Suzuki, and Tomizawa (2011) considered a measure to represent the degree of departure from marginal point-symmetry. The maximum value of the measure cannot distinguish two kinds of marginal complete asymmetry with respect to the midpoint. A measure is proposed which can distinguish two kinds of marginal asymmetry with respect to the midpoint. A large-sample confidence interval for the proposed measure is also given.
The Impact Of Equating On Detection Of Treatment Effects, Youn-Jeng Choi, Seohyun Kim, Allan S. Cohen, Zhenqiu Lu
Journal of Modern Applied Statistical Methods
Equating makes it possible to compare performances on different forms of a test. Three different equating methods (baseline selection, subgroup, and subscore equating) using common-item item response theory equating were examined for their impact on detection of treatment effects in multilevel models.
Upper Record Values From Extended Exponential Distribution, Devendra Kumar, Sanku Dey
Journal of Modern Applied Statistical Methods
Some recurrence relations are established for the single and product moments of upper record values for the extended exponential distribution by Nadarajah and Haghighi (2011) as an alternative to the gamma, Weibull, and the exponentiated exponential distributions. Recurrence relations for negative moments and quotient moments of upper record values are also obtained. Using relations of single moments and product moments, means, variances, and covariances of upper record values from samples of sizes up to 10 are tabulated for various values of the shape parameter and scale parameter. A characterization of this distribution based on conditional moments of record …
ASL Reverse Dictionary - ASL Translation Using Deep Learning, Ann Nelson, KJ Price, Rosalie Multari
SMU Data Science Review
The challenges of learning a new language can be reduced with real-time feedback on pronunciation and language usage. Today there are readily available technologies which provide such feedback on spoken languages, by translating the voice of the learner into written text. For someone seeking to learn American Sign Language (ASL), there is however no such feedback application available. A learner of American Sign Language might reference websites or books to obtain an image of a hand sign for a word. This process is like looking up a word in a dictionary, and if the person wanted to know if they …
What Makes A Good Research Consultant?, Justin Harding, Samantha Estrada, Michael Floren
The Qualitative Report
Statistical and research consulting is defined as the collaboration of a statistician or methodologist with another professional for devising solutions to research problems. An in-depth, interview qualitative approach was taken to answer the research question of what makes a good research consultant. The authors interviewed four faculty members in the field of statistics and research methods and two experienced graduate student consultants. In-depth, face-to-face interviews revealed common themes regarding consultancy skills, resourcefulness, communication and interpersonal skills. The participants discussed how to improve consulting sessions and deal with clients with different statistics levels and backgrounds. Participants felt there was no difference …
Leveraging Reviews To Improve User Experience, Anthony Schams, Iram Bakhtiar, Cristina Stanley
SMU Data Science Review
In this paper, we will explore and present a method of finding characteristics of a restaurant using its reviews through machine learning algorithms. We begin by building models to predict the ratings of individual reviews using text and categorical features. This is to examine the efficacy of the algorithms to the task. Both XGBoost and logistic regression will be examined. With these models, our goal is then to identify key phrases in reviews that are correlated with positive and negative experience. Our analysis makes use of review data publicly made available by Yelp. Key bigrams extracted were non-specific to the …
Visualization And Machine Learning Techniques For NASA’s EM-1 Big Data Problem, Antonio P. Garza III, Jose Quinonez, Misael Santana, Nibhrat Lohia
SMU Data Science Review
In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data to determine available launch opportunities (about 90TBs of data). We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory …
Machine Learning Pipeline For Exoplanet Classification, George Clayton Sturrock, Brychan Manry, Sohail Rafiqi
SMU Data Science Review
Planet identification has typically been a task performed exclusively by teams of astronomers and astrophysicists using methods and tools accessible only to those with years of academic education and training. NASA’s Exoplanet Exploration program has introduced modern satellites capable of capturing a vast array of data regarding celestial objects of interest to assist with researching these objects. The availability of satellite data has opened up the task of planet identification to individuals capable of writing and interpreting machine learning models. In this study, several classification models and datasets are utilized to assign a probability of an observation being an exoplanet. …
Tidying And Analysis Of The 2014 Texas English II End-Of-Course Exam, David Churchman, Abigail Morton Garland
SMU Data Science Review
The state of Texas requires all public high school students to take End of Course (EOC) exams. The results of these exams are made nominally public, but in a shape and format that precludes ready analysis. To the extent possible, principles of tidy data will be applied to clean and analyze the publicly released data file for the 2014 English II EOC exam, providing insights into the EOC program and a case for better public data from the Texas Education Agency (TEA).
The Andersen Likelihood Ratio Test With A Random Split Criterion Lacks Power, Georg Krammer
Journal of Modern Applied Statistical Methods
The Andersen LRT uses sample characteristics as split criteria to evaluate Rasch model fit, or theory-driven hypothesis testing for a test. The power and Type I error of a random split criterion were evaluated with a simulation study. Results consistently show a random split criterion lacks power.
Weighted Version Of Generalized Inverse Weibull Distribution, Sofi Mudiasir, S. P. Ahmad
Journal of Modern Applied Statistical Methods
Weighted distributions are used in many fields, such as medicine, ecology, and reliability. A weighted version of the generalized inverse Weibull distribution, known as weighted generalized inverse Weibull distribution (WGIWD), is proposed. Basic properties including mode, moments, moment generating function, skewness, kurtosis, and Shannon’s entropy are studied. The usefulness of the new model was demonstrated by applying it to a real-life data set. The WGIWD fits better than its submodels, such as length biased generalized inverse Weibull (LGIW), generalized inverse Weibull (GIW), inverse Weibull (IW) and inverse exponential (IE) distributions.
Calibration Of Measurements, Edward Kroc, Bruno D. Zumbo
Journal of Modern Applied Statistical Methods
Traditional notions of measurement error typically rely on a strong mean-zero assumption on the expectation of the errors conditional on an unobservable “true score” (classical measurement error) or on the data themselves (Berkson measurement error). Weakly calibrated measurements for an unobservable true quantity are defined based on a weaker mean-zero assumption, giving rise to a measurement model of differential error. Applications show it retains many attractive features of estimation and inference when performing a naive data analysis (i.e. when performing an analysis on the error-prone measurements themselves), and other interesting properties not present in the classical or Berkson cases. Applied …
Estimation Of Mean With Two-Parameter Ratio-Product-Ratio Estimator In Double Sampling Using Ancillary Information Under Non-Response, Surya K. Pal, Housila P. Singh
Journal of Modern Applied Statistical Methods
Ratio-product-ratio estimators with two parameters in double sampling under non-response are considered along with their properties. Practical conditions are obtained in which the suggested estimators are more proficient than other existing estimators. An example is given.
Efficient Class Of Estimators For Finite Population Mean Using Auxiliary Information In Two-Occasion Successive Sampling, G. N. Singh, Mohd Khalid
Journal of Modern Applied Statistical Methods
In the case of sampling on two occasions, a class of estimators is considered which uses information on the first occasion as well as the second occasion in order to estimate the population means on the current (second) occasion. The usefulness of auxiliary information in enhancing the efficiency of this estimation is examined through the class of proposed estimators. Some properties of the class of estimators and a strategy of optimum replacement are discussed. The proposed class of estimators was empirically compared with the sample mean estimator in the case of no matching. The established optimum estimator, which is a …
JMASM 51: Bayesian Reliability Analysis Of Binomial Model – Application To Success/Failure Data, M. Tanwir Akhtar, Athar Ali Khan
Journal of Modern Applied Statistical Methods
Reliability data are generated in the form of success/failure. An attempt was made to model this type of data using the binomial distribution in the Bayesian paradigm. For fitting the Bayesian model both analytic and simulation techniques are used. Laplace approximation was implemented for approximating posterior densities of the model parameters. Parallel simulation tools were implemented with an extensive use of R and JAGS. R and JAGS code are developed and provided. Real data sets are used for the purpose of illustration.
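As a rough illustration of what a Laplace approximation does for success/failure data, the sketch below approximates the posterior of a binomial success probability by a normal density centered at the posterior mode. It is not the authors' R and JAGS code: the function name, the conjugate Beta(a, b) prior, and the example counts are all assumptions made here for illustration.

```python
import math

def laplace_binomial_posterior(successes, failures, a=1.0, b=1.0):
    """Normal (Laplace) approximation to the posterior of theta.

    Assumes a Beta(a, b) prior, so the exact posterior is
    Beta(a + successes, b + failures). Returns (mode, standard deviation).
    """
    alpha, beta = a + successes, b + failures
    if alpha <= 1 or beta <= 1:
        raise ValueError("posterior mode lies on the boundary")
    # Mode of the Beta(alpha, beta) density.
    mode = (alpha - 1) / (alpha + beta - 2)
    # Negative second derivative of the log posterior, evaluated at the mode.
    curvature = (alpha - 1) / mode ** 2 + (beta - 1) / (1 - mode) ** 2
    return mode, math.sqrt(1 / curvature)

# Hypothetical data: 8 successes and 2 failures under a uniform prior.
mode, sd = laplace_binomial_posterior(8, 2)
print(mode, sd)
```

The approximation is typically most reasonable when the mode is well inside (0, 1) and the sample is not tiny; near the boundary the normal shape fits poorly.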
A Random Forests Approach To Assess Determinants Of Central Bank Independence, Maddalena Cavicchioli, Angeliki Papana, Ariadni Papana Dagiasis, Barbara Pistoresi
Journal of Modern Applied Statistical Methods
A non-parametric efficient statistical method, Random Forests, is implemented for the selection of the determinants of Central Bank Independence (CBI) among a large database of economic, political, and institutional variables for OECD countries. It permits ranking all the determinants based on their importance with respect to CBI and does not impose a priori assumptions on potential nonlinear relationships in the data. Collinearity issues are resolved, because correlated variables can be simultaneously considered.
Maximum Likelihood Estimation For The Generalized Pareto Distribution And Goodness-Of-Fit Test With Censored Data, Minh H. Pham, Chris Tsokos, Bong-Jin Choi
Journal of Modern Applied Statistical Methods
The generalized Pareto distribution (GPD) is a flexible parametric model commonly used in financial modeling. Maximum likelihood estimation (MLE) of the GPD was proposed by Grimshaw (1993). Maximum likelihood estimation of the GPD for censored data is developed, and a goodness-of-fit test is constructed to verify an MLE algorithm in R and to support the model-validation step. The algorithms were implemented in R. Grimshaw’s algorithm outperforms functions available in the R package ‘gPdtest’. A simulation study showed the MLE method for censored data and the goodness-of-fit test are both reliable.
Bayesian Approximation Techniques For Scale Parameter Of Laplace Distribution, Uzma Jan, S. P. Ahmad
Journal of Modern Applied Statistical Methods
The Bayesian estimation of the scale parameter of a Laplace distribution is obtained using two approximation techniques, the normal approximation and the Tierney and Kadane (T-K) approximation, under different informative priors.
Can One Test Fit All? Responses To The Article “Striving For Simple But Effective Advice For Comparing The Central Tendency Of Two Populations” (Ruxton & Neuhäuser, 2018), Diep Nguyen, Eun Sook Kim, Yi-Hsin Chen
Journal of Modern Applied Statistical Methods
Responses to suggestions made by Ruxton & Neuhäuser (2018) regarding Nguyen et al. (2016) are given.
Φ-Divergence Loss-Based Artificial Neural Network, R. L. Salamwade, D. M. Sakate, S. K. Mathur
Journal of Modern Applied Statistical Methods
Artificial Neural Networks (ANNs) can fit non-linear functions and recognize patterns better than several standard techniques. Performance of ANNs is measured by using loss functions. The phi-divergence estimator is a generalization of the maximum likelihood estimator and possesses all its properties. A neural network is proposed which is trained using phi-divergence loss.
On The Conditional And Unconditional Type I Error Rates And Power Of Tests In Linear Models With Heteroscedastic Errors, Patrick J. Rosopa, Alice M. Brawley, Theresa P. Atkinson, Stephen A. Robertson
Journal of Modern Applied Statistical Methods
Preliminary tests for homoscedasticity may be unnecessary in general linear models. Based on Monte Carlo simulations, results suggest that when testing for differences between independent slopes, the unconditional use of weighted least squares regression and HC4 regression performed the best across a wide range of conditions.
Striving For Simple But Effective Advice For Comparing The Central Tendency Of Two Populations, Graeme Ruxton, Markus Neuhäuser
Journal of Modern Applied Statistical Methods
Nguyen et al. (2016) offered advice to researchers in the commonly-encountered situation where they are interested in testing for a difference in central tendency between two populations. Their data and the available literature support very simple advice that strikes the best balance between ease of implementation, power and reliability. Specifically, apply Satterthwaite’s test, with preliminary ranking of the data if a strong deviation from normality is expected, or is suggested by visual inspection of the data. This simple guideline will serve well except when dealing with small samples of discrete data, when more sophisticated treatment may be required.
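The guideline in this abstract can be sketched in a few lines, assuming that "Satterthwaite’s test" refers to the unequal-variance t-test with Satterthwaite’s approximate degrees of freedom and that "preliminary ranking" means replacing the pooled observations by their midranks. The function names and the `rank_first` flag are introduced here for illustration. The sketch returns the test statistic and degrees of freedom only; turning them into a p-value requires a t-distribution routine, which the Python standard library does not provide.

```python
from statistics import mean, variance

def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def welch_satterthwaite(x, y, rank_first=False):
    """Return (t statistic, Satterthwaite degrees of freedom) for two samples."""
    if rank_first:
        # Rank the pooled data, then split the ranks back into the two groups.
        pooled = midranks(list(x) + list(y))
        x, y = pooled[:len(x)], pooled[len(x):]
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical samples, for illustration only.
print(welch_satterthwaite([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
print(welch_satterthwaite([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], rank_first=True))
```

Both samples need at least two observations and nonzero variance in at least one group, otherwise the statistic is undefined.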
Robust ANCOVA, Curvature, And The Curse Of Dimensionality, Rand Wilcox
Journal of Modern Applied Statistical Methods
There is a substantial collection of robust analysis of covariance (ANCOVA) methods that effectively deal with non-normality, unequal population slope parameters, outliers, and heteroscedasticity. Some are based on the usual linear model and others are based on smoothers (nonparametric regression estimators). However, extant results are limited to one or two covariates. A minor goal here is to extend a recently-proposed method, based on the usual linear model, to situations where there are up to six covariates. The usual linear model might provide a poor approximation of the true regression surface. The main goal is to suggest a method, based on …
Should We Give Up On Causality?, Tom Knapp
Journal of Modern Applied Statistical Methods
No abstract provided.
Logistic Regression: An Inferential Method For Identifying The Best Predictors, Rand Wilcox
Journal of Modern Applied Statistical Methods
When dealing with a logistic regression model, there is a simple method for estimating the strength of the association between the jth covariate and the dependent variable when all covariates are entered into the model. There is the issue of determining whether the jth independent variable has a stronger or weaker association than the kth independent variable. This note describes a method for dealing with this issue that was found to perform reasonably well in simulations.