Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

10,151 Full-Text Articles, 13,633 Authors, 2,299,105 Downloads, 183 Institutions

All Articles in Statistics and Probability

Iterative Matrix Factorization Method For Social Media Data Location Prediction, Natchanon Suaysom 2018 Harvey Mudd College

HMC Senior Theses

The locations from which users posted their tweets, as collected by social media companies, vary in accuracy, and some are missing. We want to use the tweets with the highest accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city, which we separated into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method that was used in (Dukler, Han and Wang, 2016) requires appending one-hot ...


Index Number Of Iowa Farm Products Prices, Gertrude M. Cox 2017 Iowa State College

Bulletin (Iowa Agricultural Experiment Station)

The present Iowa farm price index has been in use since 1926. It is widely employed as a measure of the general level of Iowa farm prices and appears each month in the price barometer published in Agricultural Economic Facts. A few years ago Peck developed a farm lease, known as the sliding scale lease, in which the rental payments are based on and vary with the changes in the index number. More recently, contracts covering land sales have been devised in which the interest payments and in some cases also the principal payments are based on this ...


Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek 2017 Stephen F Austin State University

Electronic Theses and Dissertations

Stephen F. Austin State University, Masters in Statistics Program, Nacogdoches, Texas, U.S.A.

This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone ...
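
As a rough illustration of one of the alternatives named above, Theil regression estimates the slope as the median of all pairwise slopes, which makes it far less sensitive to outliers than least squares. The sketch below is a minimal pure-Python version. The function name `theil_sen`, the median-of-residuals intercept convention, and the toy data are assumptions made for illustration and are not taken from the thesis.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (slope, intercept) from the Theil-Sen estimator.

    Slope: median of the slopes over all pairs of points with distinct x.
    Intercept: median of y_i - slope * x_i (one common convention).
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical data: y = 2x + 1, with one gross outlier at the last point.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 5, 7, 9, 100]
slope, intercept = theil_sen(x, y)
```

On this toy data the single outlier does not move the fitted line away from y = 2x + 1, because most pairwise slopes are unaffected by it.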


The Soybean Rhg1 Locus For Resistance To The Soybean Cyst Nematode Heterodera Glycines Regulates The Expression Of A Large Number Of Stress- And Defense-Related Genes In Degenerating Feeding Cells, Pramod Kaitheri Kandoth, Nagabhushana Ithal, Justin Recknor, Tom Maier, Dan Nettleton, Thomas J. Baum, Melissa G. Mitchum 2017 University of Missouri

Thomas Baum

To gain new insights into the mechanism of soybean (Glycine max) resistance to the soybean cyst nematode (Heterodera glycines), we compared gene expression profiles of developing syncytia in soybean near-isogenic lines differing at Rhg1 (for resistance to Heterodera glycines), a major quantitative trait locus for resistance, by coupling laser capture microdissection with microarray analysis. Gene expression profiling revealed that 1,447 genes were differentially expressed between the two lines. Of these, 241 (16.8%) were stress- and defense-related genes. Several stress-related genes were up-regulated in the resistant line, including those encoding homologs of enzymes that lead to increased levels of ...


Improving The Accuracy For The Long-Term Hydrologic Impact Assessment (L-Thia) Model, Anqi Zhang, Lawrence Theller, Bernard A. Engel 2017 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Urbanization increases runoff by changing land use types from less impervious to impervious covers. Improving the accuracy of a runoff assessment model, the Long-Term Hydrologic Impact Assessment (L-THIA) Model, can help us to better evaluate the potential uses of Low Impact Development (LID) practices aimed at reducing runoff, as well as to identify appropriate runoff and water quality mitigation methods. Several versions of the model have been built over time, and inconsistencies have been introduced between the models. To improve the accuracy and consistency of the model, the equations and parameters (primarily curve numbers in the case of this model ...


Thermodynamics Of Coherent Structures Near Phase Transitions, Julia M. Meyer, Ivan Christov 2017 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Phase transitions within large-scale systems may be modeled by nonlinear stochastic partial differential equations in which system dynamics are captured by appropriate potentials. Coherent structures in these systems evolve randomly through time; thus, statistical behavior of these fields is of greater interest than particular system realizations. The ability to simulate and predict phase transition behavior has many applications, from material behaviors (e.g., crystallographic phase transformations and coherent movement of granular materials) to traffic congestion. Past research focused on deriving solutions to the system probability density function (PDF), which is the ground-state wave function squared. Until recently, the extent to ...


Hazard Assessment Of Meteoroid Impact For The Design Of Lunar Habitats, Herta Paola Montoya, Shirley Dyke, Julio A. Ramirez, Antonio Bobet, H. Jay Melosh, Daniel Gomez 2017 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

The design of self-sustaining lunar habitats is a challenge primarily due to the Moon’s lack of atmospheric protection and hazardous environment. To assure safe habitats that will lead to further lunar and space exploration, it is necessary to assess the different hazards faced on the Moon, such as meteoroid impacts, extreme temperatures, and radiation. In particular, meteoroids pose a risk to lunar structures due to their high frequency of occurrence and hypervelocity impact. Continuous meteoroid impacts can harm structural elements and vital equipment, compromising the well-being of lunar inhabitants. This study is focused on the hazard conceptualization and quantification ...


A Characterization Of A Value Added Model And A New Multi-Stage Model For Estimating Teacher Effects Within Small School Systems, Julie M. Garai 2017 University of Nebraska-Lincoln

Dissertations and Theses in Statistics

At both the national and state level there is increasing pressure to develop metrics to determine if school systems are meeting educational objectives. All states mandate some form of assessment by standardized tests. One method currently used to model student test scores is Value Added Modeling (VAM), which models student scores as a product of classroom and school environments. One VAM approach is the Tennessee Value Added Assessment System (TVAAS) which models student gains from year to year. Teacher effects are included in this layered model, which estimates the teacher’s added value to a student score through best linear ...


Prediction Of Stress Increase In Unbonded Tendons Using Sparse Principal Component Analysis, Eric Mckinney 2017 Utah State University

All Graduate Plan B and other Reports

While internal and external unbonded tendons are widely utilized in concrete structures, the analytic solution for the increase in unbonded tendon stress, $\Delta f_{ps}$, is challenging due to the lack of bond between strand and concrete. Moreover, most analysis methods do not provide high correlation due to the limited available test data. In this thesis, Principal Component Analysis (PCA) and Sparse Principal Component Analysis (SPCA) are employed on different sets of candidate variables, amongst the material and sectional properties from the database compiled by Maguire et al. [18]. Predictions of $\Delta f_{ps}$ are made via Principal Component Regression models, and the method ...


Uses Of The Hypergeometric Distribution For Determining Survival Or Complete Representation Of Subpopulations In Sequential Sampling, Brooke Busbee 2017 Stephen F Austin State University

Electronic Theses and Dissertations

This thesis will explore the hypergeometric probability distribution by looking at many different aspects of the distribution. These include, but are not limited to: history and origin, derivation and elementary applications, properties, relationships to other probability models, kindred hypergeometric distributions and elements of statistical inference associated with the hypergeometric distribution. Once the above are established, an investigation into and furthering of work done by Walton (1986) and Charalambides (2005) will be done. Here, we apply the hypergeometric distribution to sequential sampling in order to determine a surviving subcategory as well as study the problem of and complete representation of the ...
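
For reference, the hypergeometric distribution gives the probability of drawing exactly k marked items in n draws without replacement from a population of N items, K of which are marked: P(X = k) = C(K, k) C(N - K, n - k) / C(N, n). The sketch below evaluates this formula with exact rational arithmetic. The function name `hypergeom_pmf` and the card example are illustrative assumptions, not material from the thesis.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k marked items in n draws without replacement
    from a population of N items, K of which are marked."""
    # Outside the support the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return Fraction(0)
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))


# Hypothetical example: 5 cards drawn from a 52-card deck holding 4 aces.
p_no_ace = hypergeom_pmf(0, N=52, K=4, n=5)
# The probabilities over all possible k sum to one.
total = sum(hypergeom_pmf(k, N=52, K=4, n=5) for k in range(6))
```

Using `Fraction` keeps the probabilities exact, which is convenient when checking identities such as the probabilities summing to one.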


Gender, Age, Research Experience, Leading Role And Academic Productivity Of Vietnamese Researchers In The Social Sciences And Humanities: Exploring A 2008-2017 Scopus Dataset, Quan-Hoang Vuong, Tung M. Ho, Thu-Trang Vuong, Nancy K. Napier, Hiep-Hung Pham, Ha V. Nguyen 2017 Universite Libre de Bruxelles

Quan-Hoang Vuong

Background: Academic productivity has been studied by scholars all round the world for many years. However, in Vietnam, this topic has scarcely been addressed. This research therefore aims at better understanding the correlations between gender, age, research experience, the leading role of corresponding authors, and the total number of their publications in the specific realm of social sciences and humanities.
Methods: The study employed a Scopus dataset with publication profiles of 410 Vietnamese researchers between 2008 and 2017.
Results: Men did not differ from women in academic publications (P=0.827). The proficiency of corresponding authors positively correlated with the ...


Methods For Scalar-On-Function Regression, Philip T. Reiss, Jeff Goldsmith, Han Lin Shang, R. Todd Ogden 2017 Columbia University

Philip T. Reiss

Recent years have seen an explosion of activity in the field of functional data analysis (FDA), in which curves, spectra, images, etc. are considered as basic functional data units. A central problem in FDA is how to fit regression models with scalar responses and functional data points as predictors. We review some of the main approaches to this problem, categorizing the basic model types as linear, nonlinear and nonparametric. We discuss publicly available software packages, and illustrate some of the procedures by application to a functional magnetic resonance imaging dataset.


Application Of Support Vector Machine Modeling And Graph Theory Metrics For Disease Classification, Jessica M. Rudd 2017 Kennesaw State University

Grey Literature from PhD Candidates

Disease classification is a crucial element of biomedical research. Recent studies have demonstrated that machine learning techniques, such as Support Vector Machine (SVM) modeling, produce similar or improved predictive capabilities in comparison to the traditional method of Logistic Regression. In addition, it has been found that social network metrics can provide useful predictive information for disease modeling. In this study, we combine simulated social network metrics with SVM to predict diabetes in a sample of data from the Behavioral Risk Factor Surveillance System. In this dataset, Logistic Regression outperformed SVM with ROC indices of 81.8 and 81.7 for ...


CAHOST Facilitating The Johnson-Neyman Technique For Two-Way Interactions In Multiple Regression, Stephen W. Carden, Nicholas S. Holtzman, Michael J. Strube 2017 Georgia Southern University

Stephen W. Carden

When using multiple regression, researchers frequently wish to explore how the relationship between two variables is moderated by another variable; this is termed an interaction. Historically, two approaches have been used to probe interactions: the pick-a-point approach and the Johnson-Neyman (JN) technique. The pick-a-point approach has limitations that can be avoided using the JN technique. Currently, the software available for implementing the JN technique and creating corresponding figures lacks several desirable features, most notably ease of use and figure quality. To fill this gap in the literature, we offer a free Microsoft Excel 2013 workbook, CAHOST (a concatenation of the ...


Combining Biomarkers By Maximizing The True Positive Rate For A Fixed False Positive Rate, Allison Meisner, Marco Carone, Margaret Pepe, Kathleen F. Kerr 2017 University of Washington, Seattle

UW Biostatistics Working Paper Series

Biomarkers abound in many areas of clinical research, and often investigators are interested in combining them for diagnosis, prognosis and screening. In many applications, the true positive rate for a biomarker combination at a prespecified, clinically acceptable false positive rate is the most relevant measure of predictive capacity. We propose a distribution-free method for constructing biomarker combinations by maximizing the true positive rate while constraining the false positive rate. Theoretical results demonstrate good operating characteristics for the resulting combination. In simulations, the biomarker combination provided by our method demonstrated improved operating characteristics in a variety of scenarios when compared with ...
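
To make the target quantity concrete, the sketch below computes the empirical true positive rate of a linear biomarker combination when the empirical false positive rate is capped at a chosen level. It only evaluates the objective for given weights. It is not the authors' estimation procedure, which the abstract does not spell out. The function names, the strict-inequality classification rule, and the toy data are assumptions for illustration.

```python
def combine(markers, weights):
    """Linear combination score for each subject (one row of markers each)."""
    return [sum(w * m for w, m in zip(weights, row)) for row in markers]


def tpr_at_fpr(case_scores, control_scores, max_fpr):
    """Empirical TPR at the lowest threshold whose empirical FPR
    does not exceed max_fpr.

    A subject is called positive when its score is strictly above
    the threshold.
    """
    controls = sorted(control_scores, reverse=True)
    # At most this many controls may be called positive; the small
    # epsilon guards against floating-point round-down in the product.
    allowed = int(max_fpr * len(controls) + 1e-9)
    # Threshold at the (allowed + 1)-th largest control score, so that
    # at most `allowed` controls lie strictly above it.
    threshold = controls[allowed] if allowed < len(controls) else float("-inf")
    return sum(s > threshold for s in case_scores) / len(case_scores)


# Hypothetical data: two biomarkers per subject.
cases = [[2, 1], [3, 0], [0, 0], [2, 2]]
controls = [[1, 0], [0, 0], [1, 1], [0, 1], [2, 0]]
weights = (1.0, 2.0)
tpr = tpr_at_fpr(combine(cases, weights), combine(controls, weights), max_fpr=0.2)
```

A search over `weights` that maximizes `tpr_at_fpr` would be one naive way to approach the problem the abstract describes, although the empirical objective is a step function and is therefore awkward to optimize directly.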


Developing Biomarker Combinations In Multicenter Studies Via Direct Maximization And Penalization, Allison Meisner, Chirag R. Parikh, Kathleen F. Kerr 2017 University of Washington, Seattle

UW Biostatistics Working Paper Series

When biomarker studies involve patients at multiple centers and the goal is to develop biomarker combinations for diagnosis, prognosis, or screening, we consider evaluating the predictive capacity of a given combination with the center-adjusted AUC (aAUC), a summary of conditional performance. Rather than using a general method to construct the biomarker combination, such as logistic regression, we propose estimating the combination by directly maximizing the aAUC. Furthermore, it may be desirable to have a biomarker combination with similar predictive capacity across centers. To that end, we allow for penalization of the variability in center-specific performance. We demonstrate good asymptotic properties ...


Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, for example, to predict the profitability of a product or the age group of a user. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered.

1. Sparse data and convergence behavior. How different properties of a dataset, such as ...


Socioeconomic Status, Air Quality And Geographic Variation In Emergency Room Visits For Acute Bronchitis On The California Central Coast, Sean Lang-Brown, Heather W. Starnes, Gary B. Hughes 2017 California Polytechnic State University

Symposium

IMPORTANCE: Analysis of geospatial variation in acute bronchitis due to socioeconomic and environmental factors can allow the efficient delivery of resources to populations most at risk.

OBJECTIVE: We sought to determine if small scale variation in socioeconomic factors and emergency room (ER) visits for acute bronchitis are associated in small cities or rural communities. We also modeled the effects of air quality on daily rates of ER visits for acute bronchitis in the context of socioeconomic factors to investigate modifying relationships.

DESIGN, SETTING, AND PARTICIPANTS: We examined ER visits for acute bronchitis in San Luis Obispo and Santa Barbara counties ...


Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei 2017 STATinMED Research/SIMR, Inc.

Publications and Research

Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.


Application Of Silicon Ameliorated Salinity Stress And Improved Wheat Yield, M. A. Ibrahim, A. M. Merwad, E. A. Elnaka, C. L. Burras, L. Follett 2017 Iowa State University

C. Lee Burras

Management of soil salinity is an important research field around the globe, especially when associated with limited water resources. This work aimed to improve the growth and yield of wheat (Triticum aestivum L. cv. Sakha-93) grown under salinity stress. A completely randomized design pot experiment with three replications was conducted in a loamy soil with various levels of salinity under local weather conditions. The treatments included five levels of salinity (2.74, 5.96, 8.85, 10.74, and 13.38 dS m⁻¹) prepared by adding NaCl to the selected soil and five treatments of Si (0, 2.1, 4 ...

