Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons


10368 Full-Text Articles 13657 Authors 2450423 Downloads 191 Institutions

All Articles in Statistics and Probability



Iterative Matrix Factorization Method For Social Media Data Location Prediction, Natchanon Suaysom 2018 Harvey Mudd College


HMC Senior Theses

The locations from which users post tweets, as collected by social media companies, vary in accuracy, and some are missing. We want to use the tweets with the highest accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city and separated them into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method that was used in (Dukler, Han and Wang, 2016) requires appending one-hot ...


Prediction Intervals For Functional Data, Nicholas Rios 2018 Montclair State University


Theses, Dissertations and Culminating Projects

The prediction of functional data samples has been the focus of several functional data analysis endeavors. This work describes the use of dynamic function-on-function regression for dynamic prediction of the future trajectory as well as the construction of dynamic prediction intervals for functional data. The overall goals of this thesis are to assess the efficacy of Dynamic Penalized Function-on-Function Regression (DPFFR) and to compare DPFFR prediction intervals with those of other dynamic prediction methods. To make these comparisons, metrics are used that measure prediction error, prediction interval width, and prediction interval coverage. Simulations and applications to financial stock data from ...
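
The three kinds of metrics named in this abstract (prediction error, interval width, and interval coverage) can be illustrated with a minimal Python sketch. The function name `interval_metrics` and the choice of mean squared error as the error measure are assumptions made for illustration; the thesis may define its metrics differently.

```python
from statistics import mean


def interval_metrics(y_true, y_pred, lower, upper):
    """Return (mean squared prediction error, mean interval width, empirical coverage).

    Hypothetical helper: all four inputs are equal-length sequences of numbers,
    one entry per predicted point.
    """
    n = len(y_true)
    if n == 0 or not (n == len(y_pred) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Average squared distance between observed and predicted values.
    mspe = mean((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Average length of the prediction intervals.
    width = mean(u - l for l, u in zip(lower, upper))
    # Share of observed values that fall inside their interval.
    coverage = mean(1.0 if l <= y <= u else 0.0
                    for y, l, u in zip(y_true, lower, upper))
    return mspe, width, coverage
```

A narrower interval is only preferable when coverage stays close to the nominal level, which is why width and coverage are usually reported together.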


Using Data Analytics For Discovering Library Resource Insights – Case From Singapore Management University, Ning LU, Rui SONG, Dina HENG, Swapna GOTTIPATI, Chee Hsien Aaron (ZHENG Zhixian) TAY, Aaron TAY 2017 Singapore Management University


Research Collection School Of Information Systems

Library resources are critical in supporting teaching, research and learning processes. Several universities have employed online platforms and infrastructure to provide online services to students, faculty and staff. To provide efficient services by understanding and predicting user needs, libraries are looking into the area of data analytics. Library analytics at Singapore Management University is a project committed to providing an interface for data-intensive project collaboration, while supporting one of the library’s key pillars: its commitment to collaborate on initiatives with SMU Communities and external groups. In this paper, we study the transaction logs for user behavior analysis ...


Facilitating Student Engagement Research: A Historical Analogy For Understanding And Applying Naturalistic Inquiry, Lane G. Perry III, April Perry 2017 Western Carolina University


Journal of Research Initiatives

This paper offers a historical theoretical discussion and practical perspective on the qualitative paradigm of inquiry referred to as Naturalistic Inquiry (Lincoln & Guba, 1985). Moreover, it endeavors to demonstrate the paradigm’s versatility and usefulness when attempting to illuminate phenomena that specifically occur when students experience and interact with engaging, innovative, and experientially based pedagogies (e.g., service-learning, work-integrated learning, community-based learning). This paper presents and paradigmatically supports the researchers’ worldview through a logical primacy and discussion of ontological, epistemological, axiological, and methodological perspectives (Guba & Lincoln, 2001). Following this, Naturalistic Inquiry is identified as a paradigm of inquiry that aligns ...


Optimized Adaptive Enrichment Designs For Multi-Arm Trials: Learning Which Subpopulations Benefit From Different Treatments, Jon Arni Steingrimsson, Joshua Betz, Tiachen Qian, Michael Rosenblum 2017 Department of Biostatistics, Brown School of Public Health


Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose a class of adaptive randomized trial designs for comparing two treatments to a common control in two disjoint subpopulations. The type of adaptation, called adaptive enrichment, involves a preplanned rule for modifying enrollment and arm assignment based on accruing data in an ongoing trial. The motivation for this adaptive feature is that interim data may indicate that a subpopulation, such as those with lower disease severity at baseline, is unlikely to benefit from a particular treatment, while uncertainty remains for the other treatment and/or subpopulation. We developed a new multiple testing procedure tailored to this design problem ...


Addressing Endogeneity In Actor-Specific Network Measures, Frederick J. Boehmke, Olga Chyzh, Cameron G. Thies 2017 University of Iowa


Frederick J Boehmke

The study of international relations (IR), and political science more broadly, has derived great benefits from the recent growth in conceptualizing and modeling political phenomena within their broader network contexts. More than just a novel approach to evaluating the old puzzles, network analysis provides a whole new way of theoretical thinking. Challenging the traditional dyad-driven approach to the study of IR, networks highlight actor interdependence that goes beyond dyads and emphasize that many traditional IR variables, such as conflict, trade, alliances, or international organization memberships, must be treated and studied as networks. Properties of these networks (e.g., polarization, density ...


Perfect Ratings With Negative Comments: Learning From Contradictory Patient Survey Responses, Andrew S. Gallan, Marina Girju, Roxana Girju 2017 DePaul University


Patient Experience Journal

This research explores why patients give perfect domain scores yet provide negative comments on surveys. In order to explore this phenomenon, vendor-supplied in-patient survey data from eleven different hospitals of a major U.S. health care system were utilized. The dataset included survey scores and comments from 56,900 patients, collected from January 2015 through October 2016. Of the total number of responses, 30,485 (54%) contained at least one comment. For our analysis, we use a two-step approach: a quantitative analysis on the domain scores augmented by a qualitative text analysis of patients’ comments. To focus the research, we ...


Using The ROC Curve To Measure Association And Evaluate Prediction Accuracy For A Binary Outcome, Jingjing Yin, Robert L. Vogel 2017 Georgia Southern University


Jingjing Yin

This review article addresses the ROC curve and its advantage over the odds ratio to measure the association between a continuous variable and a binary outcome. A simple parametric model under the normality assumption and the method of Box-Cox transformation for non-normal data are discussed. Applications of the binormal model and the Box-Cox transformation under both univariate and multivariate inference are illustrated by a comprehensive data analysis tutorial. Finally, a summary and recommendations are given as to the usage of the binormal ROC curve.
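
As a rough illustration of the binormal model mentioned in this abstract, the sketch below uses the standard textbook forms: if the marker is Normal(mu0, sd0) in the non-diseased group and Normal(mu1, sd1) in the diseased group, the ROC curve is Phi(a + b * Phi^-1(t)) with a = (mu1 - mu0) / sd1 and b = sd0 / sd1, and the area under it is Phi((mu1 - mu0) / sqrt(sd0^2 + sd1^2)). The function and parameter names are hypothetical, and the article's own notation and estimation steps (including the Box-Cox transformation) are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def binormal_roc(fpr, mu0, sd0, mu1, sd1):
    """True positive rate at a given false positive rate (0 < fpr < 1).

    Assumes larger marker values indicate disease.
    """
    a = (mu1 - mu0) / sd1
    b = sd0 / sd1
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpr))


def binormal_auc(mu0, sd0, mu1, sd1):
    """Area under the binormal ROC curve."""
    return _STD_NORMAL.cdf((mu1 - mu0) / sqrt(sd0 ** 2 + sd1 ** 2))
```

When the two groups have the same distribution, the curve reduces to the diagonal and the area is 0.5, which gives a quick sanity check on any fitted parameters.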


Data Analysis Basics – Part II, Judith A. Savageau 2017 University of Massachusetts Medical School


Center for Health Policy and Research (CHPR) Publications

Blog post to AEA365, a blog sponsored by the American Evaluation Association (AEA) dedicated to highlighting Hot Tips, Cool Tricks, Rad Resources, and Lessons Learned for evaluators. The American Evaluation Association is an international professional association of evaluators devoted to the application and exploration of program evaluation, personnel evaluation, technology, and many other forms of evaluation. Evaluation involves assessing the strengths and weaknesses of programs, policies, personnel, products, and organizations to improve their effectiveness.


Data Analysis Basics – Part I, Judith A. Savageau 2017 University of Massachusetts Medical School


Center for Health Policy and Research (CHPR) Publications

Blog post to AEA365, a blog sponsored by the American Evaluation Association (AEA) dedicated to highlighting Hot Tips, Cool Tricks, Rad Resources, and Lessons Learned for evaluators. The American Evaluation Association is an international professional association of evaluators devoted to the application and exploration of program evaluation, personnel evaluation, technology, and many other forms of evaluation. Evaluation involves assessing the strengths and weaknesses of programs, policies, personnel, products, and organizations to improve their effectiveness.


Novel Statistical Models For Quantitative Shape-Gene Association Selection, Xiaotian Dai 2017 Utah State University


All Graduate Theses and Dissertations

Other research has reported that genetic mechanisms play a major role in the development of biological shapes. The primary goal of this dissertation is to develop novel statistical models to investigate the quantitative relationships between biological shapes and genetic variants. However, these problems can be extremely challenging for traditional statistical models for a number of reasons: 1) the biological phenotypes cannot be effectively represented by single-valued traits, while traditional regression only handles one dependent variable; 2) in real-life genetic data, the number of candidate genes to be investigated is extremely large, and the signal-to-noise ratio of candidate genes is expected ...


Lung Flute Improves Symptoms And Health Status In COPD With Chronic Bronchitis: A 26 Week Randomized Controlled Trial, Sanjay Sethi, Jingjing Yin, Pamela K. Anderson 2017 University of Buffalo


Jingjing Yin

Background: Chronic obstructive pulmonary disease (COPD) is characterized by mucus hypersecretion that contributes to disease-related morbidity and is associated with increased mortality. The Lung Flute® is a new respiratory device that produces a low frequency acoustic wave with moderately vigorous exhalation to increase mucus clearance. We hypothesized that the Lung Flute, used on a twice daily basis, will provide clinical benefit to patients with COPD with chronic bronchitis. Methods: We performed a 26-week randomized, non-intervention controlled, single center, open label trial in 69 patients with COPD and chronic bronchitis. The primary endpoint was change in respiratory symptoms measured ...


Evaluating The Efficiency Of Treatment Comparison In Crossover Design By Allocating Subjects Based On Ranked Auxiliary Variable, Yisong Huang, Hani Samawi, Robert Vogel, Jingjing Yin, Worlanyo E. Gato, Daniel Linder 2017 Georgia Southern University


Jingjing Yin

The validity of statistical inference depends on proper randomization methods. However, even with proper randomization, we can have imbalance with respect to important characteristics. In this paper, we introduce a method based on ranked auxiliary variables for treatment allocation in crossover designs using Latin squares models. We evaluate the improvement of the efficiency in treatment comparisons using the proposed method. Our simulation study reveals that our proposed method provides a more powerful test compared to simple randomization with the same sample size. The proposed method is illustrated by conducting an experiment to compare two different concentrations of titanium dioxide nanofiber ...


Estimation Of P(X > Y) When X And Y Are Dependent Random Variables Using Different Bivariate Sampling Schemes, Hani M. Samawi, Amal Helu, Haresh Rochani, Jingjing Yin, Daniel Linder 2017 Georgia Southern University


Jingjing Yin

Stress-strength models have been intensively investigated in the literature with regard to estimating the reliability θ = P(X > Y) using parametric and nonparametric approaches under different sampling schemes when X and Y are independent random variables. In this paper, we consider the problem of estimating θ when (X, Y) are dependent random variables with a bivariate underlying distribution. The empirical and kernel estimates of θ = P(X > Y), based on bivariate ranked set sampling (BVRSS), are considered when (X, Y) are paired dependent continuous random variables. The estimators obtained are compared to their counterpart, bivariate simple random sampling (BVSRS ...
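
For paired observations, the simplest empirical estimate of θ = P(X > Y) is the fraction of pairs in which the first component exceeds the second. The sketch below shows that estimate for a plain list of pairs; the function name `empirical_theta` is hypothetical, and the code does not implement the ranked set sampling or kernel estimators studied in the paper.

```python
def empirical_theta(pairs):
    """Fraction of (x, y) pairs with x > y.

    Hypothetical helper: `pairs` is any iterable of two-element tuples
    drawn as paired observations from the joint distribution of (X, Y).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one pair is required")
    # Ties count as "not greater", matching the strict inequality X > Y.
    return sum(1 for x, y in pairs if x > y) / len(pairs)
```

Because each pair is compared only with itself, this estimate stays valid when X and Y are dependent, which is the setting the abstract describes.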


Constructing A Confidence Interval For The Fraction Who Benefit From Treatment, Using Randomized Trial Data, Emily J. Huang, Ethan X. Fang, Daniel F. Hanley, Michael Rosenblum 2017 Johns Hopkins University School of Public Health, Department of Biostatistics


Johns Hopkins University, Dept. of Biostatistics Working Papers

The fraction who benefit from treatment is the proportion of patients whose potential outcome under treatment is better than that under control. Inference on this parameter is challenging since it is only partially identifiable, even in our context of a randomized trial. We propose a new method for constructing a confidence interval for the fraction, when the outcome is ordinal or binary. Our confidence interval procedure is pointwise consistent. It does not require any assumptions about the joint distribution of the potential outcomes, although it has the flexibility to incorporate various user-defined assumptions. Unlike existing confidence interval methods for partially ...
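
To see why this parameter is only partially identifiable, consider a binary outcome where 1 means success. A randomized trial identifies the two marginal success probabilities but not how the potential outcomes pair up within a patient, so the data pin down only a range for the fraction who benefit. The sketch below computes the classical Fréchet bounds for that range; the function and argument names are hypothetical, and this is not the authors' confidence interval procedure, which also accounts for sampling variability.

```python
def benefit_bounds(p_treat, p_control):
    """Bounds on P(success under treatment AND failure under control).

    p_treat   -- marginal success probability under treatment
    p_control -- marginal success probability under control
    Both are treated as known here; in practice they are estimated.
    """
    for p in (p_treat, p_control):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Fewest possible beneficiaries: the treatment successes overlap the
    # control successes as much as the margins allow.
    lower = max(0.0, p_treat - p_control)
    # Most possible beneficiaries: treatment successes are drawn from the
    # control failures as far as the margins allow.
    upper = min(p_treat, 1.0 - p_control)
    return lower, upper
```

For example, with success probabilities of 0.5 under treatment and 0.25 under control, the bounds are 0.25 and 0.5, so the average treatment effect alone leaves the fraction who benefit uncertain by a factor of two.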


Using Ranked Auxiliary Covariate As A More Efficient Sampling Design For ANCOVA Model: Analysis Of A Psychological Intervention To Buttress Resilience, Rajai Jabrah, Hani Samawi, Robert Vogel, Haresh Rochani, Daniel Linder 2017 Georgia Southern University


Robert L. Vogel

Drawing a sample can be costly or time consuming in some studies. However, it may be possible to rank the sampling units according to some baseline auxiliary covariates, which are easily obtainable, and/or cost efficient. Ranked set sampling (RSS) is a method to achieve this goal. In this paper, we propose a modified approach of the RSS method to allocate units into an experimental study that compares L groups. Computer simulation estimates the empirical nominal values and the empirical power values for the test procedure of comparing L different groups using modified RSS based on the regression approach in ...
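
The basic idea of ranked set sampling can be sketched as follows: in each cycle, draw k sets of k units, rank each set by the cheap auxiliary covariate, and keep the i-th ranked unit from the i-th set. The code below implements this standard balanced scheme only; the function name, its arguments, and the use of a fixed seed are assumptions for illustration, and the modified allocation procedure proposed in the paper is not reproduced.

```python
import random


def ranked_set_sample(units, aux, set_size, cycles, seed=0):
    """Select set_size * cycles units by balanced ranked set sampling.

    units    -- population of candidate units
    aux      -- function returning the auxiliary covariate used for ranking
    set_size -- number of units per set (k)
    cycles   -- number of times the k-set procedure is repeated
    """
    rng = random.Random(seed)
    pool = list(units)
    needed = set_size * set_size * cycles
    if needed > len(pool):
        raise ValueError("population too small for the requested design")
    # Draw all units without replacement, then cut them into sets of size k.
    drawn = rng.sample(pool, needed)
    chosen = []
    for c in range(cycles):
        for i in range(set_size):
            start = (c * set_size + i) * set_size
            ranked = sorted(drawn[start:start + set_size], key=aux)
            # Keep only the i-th smallest unit of the i-th set in this cycle.
            chosen.append(ranked[i])
    return chosen
```

Only the selected units need the expensive measurement, while the ranking relies on the auxiliary covariate alone, which is where the cost saving comes from.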


Evaluating The Efficiency Of Treatment Comparison In Crossover Design By Allocating Subjects Based On Ranked Auxiliary Variable, Yisong Huang, Hani Samawi, Robert Vogel, Jingjing Yin, Worlanyo E. Gato, Daniel Linder 2017 Georgia Southern University


Robert L. Vogel

The validity of statistical inference depends on proper randomization methods. However, even with proper randomization, we can have imbalance with respect to important characteristics. In this paper, we introduce a method based on ranked auxiliary variables for treatment allocation in crossover designs using Latin squares models. We evaluate the improvement of the efficiency in treatment comparisons using the proposed method. Our simulation study reveals that our proposed method provides a more powerful test compared to simple randomization with the same sample size. The proposed method is illustrated by conducting an experiment to compare two different concentrations of titanium dioxide nanofiber ...


Correction Of Verification Bias By Application Of Homogeneous Log-Linear Models For A Single Scale Binary Diagnostic Test, Haresh D. Rochani, Hani M. Samawi, Robert L. Vogel, Jingjing Yin 2017 Georgia Southern University


Robert L. Vogel

In patient management and control of many infectious diseases, it is crucial to have an accurate diagnostic test. The test or procedure that determines the true disease status without error is referred to as the gold standard test. Even when a gold standard exists, it is extremely difficult to verify each patient due to issues of cost-effectiveness and the invasive nature of the procedures. In practice, some of the patients with test results are not selected for verification of the disease status, which results in a verification bias for diagnostic tests. The ability of the diagnostic tests to correctly identify the ...
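
Verification bias can be made concrete with a small sketch. The code below uses the classical Begg and Greenes correction, which assumes that the decision to verify depends only on the test result; it is swapped in here purely for illustration and is not the homogeneous log-linear approach of the paper. The function and argument names are hypothetical.

```python
def begg_greenes(n_pos, n_neg, a, b, c, d):
    """Return (sensitivity, specificity) corrected for partial verification.

    n_pos, n_neg -- all patients who tested positive / negative
    a, b         -- verified test-positives with / without disease
    c, d         -- verified test-negatives with / without disease
    """
    if a + b <= 0 or c + d <= 0:
        raise ValueError("each test group needs at least one verified patient")
    # Disease rates among verified patients, assumed to hold for all
    # patients with the same test result.
    p_disease_pos = a / (a + b)
    p_disease_neg = c / (c + d)
    # Expected counts in the full 2x2 table after extrapolation.
    tp = n_pos * p_disease_pos
    fp = n_pos * (1.0 - p_disease_pos)
    fn = n_neg * p_disease_neg
    tn = n_neg * (1.0 - p_disease_neg)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("cannot estimate accuracy without both disease states")
    return tp / (tp + fn), tn / (tn + fp)
```

With 200 positives and 800 negatives, of which 100 each are verified (80 diseased among the verified positives, 10 among the verified negatives), the naive sensitivity from verified patients alone is 80/90, while the corrected value is 160/240, showing how strongly selective verification can inflate the apparent accuracy.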


Persistent Organic Pollutants And Mortality In The United States, NHANES 1999-2011, Kristiann Fry, Melinda C Power 2017 George Washington University


Epidemiology and Biostatistics Faculty Publications

Background

Persistent organic pollutants (POPs) are environmentally and biologically persistent chemicals that include polybrominated diphenyl ethers (PBDEs), per- and polyfluoroalkyl substances (PFASs), polychlorinated biphenyls (PCBs), and organochlorine (OC) pesticides. Currently, data on the associations between exposure to POPs and the risk of mortality in the U.S. population is limited.

Our objective was to determine if higher exposure to POPs is associated with greater risk of all-cause, cancer, heart/cerebrovascular disease, or other-cause mortality.

Methods

Analyses included participants aged 60 years and older from the 1999–2006 National Health and Nutrition Examination Surveys (NHANES). We included 483 participants for analyses ...


Using Multivariate Statistical Techniques To Aid In A Sports Index Construction, Tiffany Kelly 2017 ESPN Stats & Information Group


Mathematics Colloquium Series

Within a quantitative career, you are, or soon will be, challenged to create an overall value to explain a situational status: for example, socio-economic status, well-being, and, in this specific example, happiness among sports fans. This talk discusses how my previous work, which grew out of student research performed at NSU, applied to my first project for ESPN Sports Analytics, the College Football Fan Happiness Index (http://es.pn/2vmParA). I will dive into the multivariate statistical techniques of principal component analysis and hierarchical clustering to create this happiness index from a slew of variables.
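
One common way to turn several variables into a single index with principal component analysis is to standardize each variable and score every observation on the first principal component. The sketch below does this with power iteration in plain Python. It is an illustrative assumption about how such an index could be built, not a description of how the College Football Fan Happiness Index was actually constructed, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Center and scale each column; assumes no column is constant."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def first_component(z, iterations=200):
    """Leading eigenvector of the correlation matrix via power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(p)]
            for i in range(p)]
    # Uneven start vector to reduce the chance of starting orthogonal to
    # the leading eigenvector (still possible in contrived cases).
    w = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


def index_scores(rows):
    """Score each row on the first principal component of standardized data."""
    z = standardize(rows)
    w = first_component(z)
    return [sum(v * wi for v, wi in zip(row, w)) for row in z]
```

The sign of a principal component is arbitrary, so in practice the scores may need to be flipped so that larger values mean "more" of the quantity being indexed.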

