Open Access. Powered by Scholars. Published by Universities.®
Other Statistics and Probability Commons™
Articles 1 - 29 of 29
Full-Text Articles in Other Statistics and Probability
Tutorial For Using The Center For High Performance Computing At The University Of Utah And An Example Using Random Forest, Stephen Barton
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
Random Forests are memory-intensive machine learning algorithms, and most computers would fail at building models from datasets with millions of observations. Using the Center for High Performance Computing (CHPC) at the University of Utah and an airline on-time arrival dataset with 7 million observations from the U.S. Department of Transportation Bureau of Transportation Statistics, we built 316 models by adjusting the depth of the trees and randomness of each forest and compared the accuracy and time each took. Using this dataset we discovered that substantial restrictions to the size of trees, observations allowed for each tree, and variables …
A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz
Doctor of Business Administration Dissertations
At heart every trader loves volatility; this is where return on investment comes from, this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices with reliability, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with …
A Comparison Of Techniques For Handling Missing Data In Longitudinal Studies, Alexander R. Bogdan
Masters Theses
Missing data are a common problem in virtually all epidemiological research, especially when conducting longitudinal studies. In these settings, clinicians may collect biological samples to analyze changes in biomarkers, which often do not conform to parametric distributions and may be censored due to limits of detection. Using complete data from the BioCycle Study (2005-2007), which followed 259 premenopausal women over two menstrual cycles, we compared four techniques for handling missing biomarker data with non-Normal distributions. We imposed increasing degrees of missing data on two non-Normally distributed biomarkers under conditions of missing completely at random, missing at random, and missing not …
Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan
Laura B. Balzer
Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …
Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …
Optimal Design Of Low-Density SNP Arrays For Genomic Prediction: Algorithm And Applications, Xiao-Lin Wu, Jiaqi Xu, Guofei Feng, George R. Wiggans, Jeremy F. Taylor, Jun He, Changsong Qian, Jiansheng Qiu, Barry Simpson, Jeremy Walker, Stewart Bauck
Department of Statistics: Faculty Publications
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. …
Probabilistic Methods In Information Theory, Erik W. Pachas
Electronic Theses, Projects, and Dissertations
Given a probability space, we analyze the uncertainty, that is, the amount of information of a finite system, by studying the entropy of the system. We also extend the concept of entropy to a dynamical system by introducing a measure preserving transformation on a probability space. After showing some theorems and applications of entropy theory, we study the concept of ergodicity, which helps us to further analyze the information of the system.
Passive Visual Analytics Of Social Media Data For Detection Of Unusual Events, Kush Rustagi, Junghoon Chae
The Summer Undergraduate Research Fellowship (SURF) Symposium
Now that social media sites have gained substantial traction, huge amounts of un-analyzed valuable data are being generated. Posts containing images and text have spatiotemporal data attached as well, having immense value for increasing situational awareness of local events, providing insights for investigations and understanding the extent of incidents, their severity, and consequences, as well as their time-evolving nature. However, the large volume of unstructured social media data hinders exploration and examination. To analyze such social media data, the S.M.A.R.T system provides the analyst with an interactive visual spatiotemporal analysis and spatial decision support environment that assists in evacuation planning …
Well I'll Be Damned - Insights Into Predictive Value Of Pedigree Information In Horse Racing, Timothy Baker Mr, Ming-Chien Sung, Johnnie Johnson Professor, Tiejun Ma
International Conference on Gambling & Risk Taking
Fundamental form characteristics, such as how fast a horse ran at its last start, are widely used to help predict the outcome of horse racing events. The exception is races in which horses have not previously competed, such as maiden races, where there is little or no publicly available past performance information. In these types of events bettors need only consider a simplified suite of factors; however, this is offset by a higher level of uncertainty. This paper examines the inherent information content embedded within a horse’s ancestry and the extent to which this information is discounted in the United Kingdom bookmaker …
Spot Volatility Estimation Of Ito Semimartingales Using Delta Sequences, Weixuan Gao
Arts & Sciences Electronic Theses and Dissertations
This thesis studies a unifying class of nonparametric spot volatility estimators proposed by Mancini et al. (2013). This method is based on delta sequences and is conceived to include many of the existing estimators in the field as special cases. The thesis first surveys the asymptotic theory of the proposed estimators under an infill asymptotic scheme and fixed time horizon, when the state variable follows a Brownian semimartingale. Then, some extensions to include jumps and financial microstructure noise in the observed price process are also presented. The main goal of the thesis is to assess the suitability of the proposed methods …
The Relationship Between Time Of Day, Mood, And Electroencephalography (EEG) Asymmetry, Morgan Tantillo
Honors Projects
Previous researchers have had success in finding a correlation between exercise and an increase in positive mood. Researchers have also found a correlation between time of day and mood. The current study will explore the relationship between time of day, mood, and electroencephalography (EEG) asymmetry. The study utilized a convenience sample of ten undergraduate students at Bowling Green State University. Participants had baseline EEG recordings taken, and then participated in moderate exercise, followed by another EEG recording. Participants’ mood was assessed through a self-reported mood questionnaire before the condition as well as immediately after. Due to multiple statistical tests, the …
Takens Theorem With Singular Spectrum Analysis Applied To Noisy Time Series, Thomas K. Torku
Electronic Theses and Dissertations
The evolution of big data has led to financial time series becoming increasingly complex, noisy, non-stationary and nonlinear. Takens theorem can be used to analyze and forecast nonlinear time series, but even small amounts of noise can hopelessly corrupt a Takens approach. In contrast, Singular Spectrum Analysis (SSA) is an excellent tool for both forecasting and noise reduction. Fortunately, it is possible to combine the Takens approach with SSA, and in fact, estimation of key parameters in Takens theorem is performed with SSA. In this thesis, we combine the denoising abilities of SSA with the Takens …
Flesch-Kincaid Reading Grade Level Re-Examined: Creating A Uniform Method For Calculating Readability On A Certification Exam, Emily Neuhoff, Kristiana M. Feeser, Kayla Sutherland, Thomas Hovatter
Online Journal for Workforce Education and Development
Objective: This study attempted to establish a consistent measurement technique of the readability of a state-wide Certified Nursing Assistant (CNA) certification exam. Background: Monitoring the readability level of an exam helps ensure all test versions do not exceed the maximum reading level of the exam, and that knowledge of the subject matter, rather than reading ability, is being assessed. Method: A two-part approach was used to specify and evaluate readability. First, two methods (Microsoft Word® (MSW) software and published readability formulae) were used to calculate Flesch Reading Ease (FRE) and Flesch-Kincaid Reading Grade Level (FKRGL) for multiple …
Spatiotemporal Meta-Analysis: Reviewing Health Psychology Phenomena Over Space And Time., Blair T. Johnson
CHIP Documents
This supplemental material is meant to support this article:
Johnson, B. T., Crowley, E., & Marrouch, N. Spatiotemporal meta-analysis: Reviewing health psychology phenomena over space and time. Health Psychology Review.
Specifically, it is a database of GDPs per capita for nations in the world between 1800 and 2015. It is archived here to support an online supplement to this article.
GDP per capita
A New Right Tailed Test Of The Ratio Of Variances, Elizabeth Rochelle Lesser
UNF Graduate Theses and Dissertations
It is important to be able to compare variances efficiently and accurately regardless of the parent populations. This study proposes a new right-tailed test for the ratio of two variances using Edgeworth’s expansion. To study the Type I error rate and power, simulations were performed on the new test with various combinations of symmetric and skewed distributions. It is found to have better-controlled Type I error rates than the existing tests. Additionally, it has sufficient power. Therefore, the newly derived test provides a robust alternative to the existing methods.
How Often Are Antibiotic-Resistant Bacteria Said To “Evolve” In The News?, Nina Singh, Matthew T. Sit, Deanna M. Chung, Ana A. Lopez, Ranil Weerackoon, Pamela J. Yeh
Department of Statistics: Faculty Publications
Media plays an important role in informing the general public about scientific ideas. We examine whether the word “evolve,” sometimes considered controversial by the general public, is frequently used in the popular press. Specifically, we ask how often articles discussing antibiotic resistance use the word “evolve” (or its lexemes) as opposed to alternative terms such as “emerge” or “develop.” We chose the topic of antibiotic resistance because it is a medically important issue; bacterial evolution is a central player in human morbidity and mortality. We focused on the most widely-distributed newspapers written in English in the United States, United Kingdom, Canada, …
Systematic Evaluation Of The Impact Of ChIP-Seq Read Designs On Genome Coverage, Peak Identification, And Allele-Specific Binding Detection, Qi Zhang, Xin Zeng, Sam Younkin, Trupti Kawli, Michael P. Snyder, Sündüz Keleş
Department of Statistics: Faculty Publications
Background: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments revolutionized genome-wide profiling of transcription factors and histone modifications. Although maturing sequencing technologies allow these experiments to be carried out with short (36–50 bps), long (75–100 bps), single-end, or paired-end reads, the impact of these read parameters on the downstream data analysis is not well understood. In this paper, we evaluate the effects of different read parameters on genome sequence alignment, coverage of different classes of genomic features, peak identification, and allele-specific binding detection.
Results: We generated 101 bps paired-end ChIP-seq data for many transcription factors from human GM12878 and MCF7 cell …
EnsCat: Clustering Of Categorical Data Via Ensembling, Bertrand S. Clarke, Saeid Amiri, Jennifer L. Clarke
Department of Statistics: Faculty Publications
Background: Clustering is a widely used collection of unsupervised learning techniques for identifying natural classes within a data set. It is often used in bioinformatics to infer population substructure. Genomic data are often categorical and high dimensional, e.g., long sequences of nucleotides. This makes inference challenging: The distance metric is often not well-defined on categorical data; running time for computations using high dimensional data can be considerable; and the Curse of Dimensionality often impedes the interpretation of the results. Up to the present, however, the literature and software addressing clustering for categorical data has not yet led to a standard …
A Genomic Bayesian Multi-Trait And Multi-Environment Model, Osval A. Montesinos-López, Abelardo Montesinos-López, José Crossa, Fernando Toledo, Oscar Pérez-Hernández, Kent M. Eskridge, Jessica Rutkoski
Department of Statistics: Faculty Publications
When information on multiple genotypes evaluated in multiple environments is recorded, a multi-environment single trait model for assessing genotype × environment interaction (G×E) is usually employed. Comprehensive models that simultaneously take into account the correlated traits and trait × genotype × environment interaction (T×G×E) are lacking. In this research, we propose a Bayesian whole-genome prediction (WGP) model for analyzing multiple traits and multiple environments. For this model, we used Half-𝑡 priors on each standard deviation term and uniform priors on each correlation of the covariance matrix. These priors were not informative and led to posterior inferences that were …
Genomic Bayesian Prediction Model For Count Data With Genotype X Environment Interaction, Abelardo Montesinos-López, Osval A. Montesinos-López, José Crossa, Juan Burgueño, Kent M. Eskridge, Esteban Falconi-Castillo, Xinyao He, Pawan Singh, Karen Cichy
Department of Statistics: Faculty Publications
Genomic tools allow the study of the whole genome, and facilitate the study of genotype-environment combinations and their relationship with phenotype. However, most genomic prediction models developed so far are appropriate for Gaussian phenotypes. For this reason, appropriate genomic prediction models are needed for count data, since the conventional regression models used on count data with a large sample size (nT) and a small number of parameters (p) cannot be used for genomic-enabled prediction where the number of parameters (p) is larger than the sample size (nT). Here, we propose a Bayesian mixed-negative binomial (BMNB) genomic …
The Impact Of Hair Coat Color On Longevity Of Holstein Cows In The Tropics, C. N. Lee, K. S. Baek, A. Parkhurst
Department of Statistics: Faculty Publications
Background: Over two decades of observations in the field in South East Asia and Hawai‘i suggest that the majority of the commercial dairy herds are of black hair coat. Hence a simple study to determine the accuracy of the observation was conducted with two large dairy herds in Hawai‘i in the mid-1990s.
Methods: A retrospective study on longevity of Holstein cattle in the tropics was conducted using DairyComp-305 lactation information coupled with phenotypic evaluation of hair coat color in two large dairy farms. Cows were classified into 3 groups: a) black (B, >90%); b) black/white (BW, 50:50) and c) white (W, …
Sex-Specific Hippocampal 5-Hydroxymethylcytosine Is Disrupted In Response To Acute Stress, Ligia A. Papale, Sisi Li, Andy Madrid, Qi Zhang, Li Chen, Pankaj Chopra, Peng Jin, Sunduz Keles, Reid S. Alisch
Department of Statistics: Faculty Publications
Environmental stress is among the most important contributors to increased susceptibility to develop psychiatric disorders. While it is well known that acute environmental stress alters gene expression, the molecular mechanisms underlying these changes remain largely unknown. 5-hydroxymethylcytosine (5hmC) is a novel environmentally sensitive epigenetic modification that is highly enriched in neurons and is associated with active neuronal transcription. Recently, we reported a genome-wide disruption of hippocampal 5hmC in male mice following acute stress that was correlated to altered transcript levels of genes in known stress-related pathways. Since sex-specific endocrine mechanisms respond to environmental stimulus by altering the neuronal epigenome, we examined …
A Compendium Of Chromatin Contact Maps Reveals Spatially Active Regions In The Human Genome, Anthony D. Schmitt, Ming Hu, Inkyung Jung, Zheng Xu, Yunjiang Qiu, Catherine L. Tan, Yun Li, Shin Lin, Yiing Lin, Cathy L. Barr, Bing Ren
Department of Statistics: Faculty Publications
The three-dimensional configuration of DNA is integral to all nuclear processes in eukaryotes, yet our knowledge of the chromosome architecture is still limited. Genome-wide chromosome conformation capture studies have uncovered features of chromatin organization in cultured cells, but genome architecture in human tissues has yet to be explored. Here, we report the most comprehensive survey to date of chromatin organization in human tissues. Through integrative analysis of chromatin contact maps in 21 primary human tissues and cell types, we find topologically associating domains highly conserved in different tissues. We also discover genomic regions that exhibit unusually high levels of local …
HiView: An Integrative Genome Browser To Leverage Hi‑C Results For The Interpretation Of GWAS Variants, Zheng Xu, Guosheng Zhang, Qing Duan, Shengjie Chai, Baqun Zhang, Cong Wu, Fulai Jin, Feng Yue, Yun Li, Ming Hu
Department of Statistics: Faculty Publications
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. However, most of them are located in the non-protein coding regions, and therefore it is challenging to hypothesize the functions of these non-coding GWAS variants. Recent large efforts such as the ENCODE and Roadmap Epigenomics projects have predicted a large number of regulatory elements. However, the target genes of these regulatory elements remain largely unknown. Chromatin conformation capture based technologies such as Hi-C can directly measure the chromatin interactions and have generated an increasingly comprehensive catalog of the interactome between the distal regulatory elements …
A Bayesian GWAS Method Utilizing Haplotype Clusters For A Composite Breed Population, Danielle F. Wilson-Wells, Stephen D. Kachman
Department of Statistics: Faculty Publications
Commercial beef cattle are often composites of multiple breeds. Current methods used to produce genomic predictors are based on the underlying assumption of animals being sampled from a homogeneous population. As a result, the predictors can perform poorly when used to predict the relative genetic merit of animals whose breed compositions are different. In part, this is due to the changes in linkage disequilibrium between the markers and the quantitative trait loci as we move from one breed to the next. An alternative model based on breed-specific haplotype clusters was developed to allow for differences in linkage disequilibrium across …
Design Of Probabilistic Random Forests With Applications To Anticancer Drug Sensitivity Prediction- 2016, Raziur Rahman, Saad Haider, Souparno Ghosh, Ranadip Pal
Department of Statistics: Faculty Publications
Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. …
Monte Carlo Approx. Methods For Stochastic Optimization, John Fowler
Pomona Senior Theses
This thesis provides an overview of stochastic optimization (SP) problems and looks at how the Sample Average Approximation (SAA) method is used to solve them. We review several applications of this problem-solving technique that have been published in papers over the last few years. The number and variety of the examples should give an indication of the usefulness of this technique. The examples also provide opportunities to discuss important aspects of SPs and the SAA method including model assumptions, optimality gaps, the use of deterministic methods for finite sample sizes, and the accelerated Benders decomposition algorithm. We also give a …
Species Discovery And Diversity In Lobocriconema (Criconematidae: Nematoda) And Related Plant-Parasitic Nematodes From North American Ecoregions, Tom Powers, Ernest C. Bernard, T. Harris, Robert Higgins, M. Olson, S. Olson, M. Lodema, Julianne N. Matczyszyn, P. Mullin, L. Sutton, K.S Powers
Department of Statistics: Faculty Publications
There are many nematode species that, following formal description, are seldom mentioned again in the scientific literature. Lobocriconema thornei and L. incrassatum are two such species, described from North American forests, respectively 37 and 49 years ago. In the course of a 3-year nematode biodiversity survey of North American ecoregions, specimens resembling Lobocriconema species appeared in soil samples from both grassland and forested sites. Using a combination of molecular and morphological analyses, together with a set of species delimitation approaches, we have expanded the known range of these species, added to the species descriptions, and discovered a related group of …
PCR5 And Neutrosophic Probability In Target Identification, Florentin Smarandache, Nassim Abbas, Youcef Chibani, Bilal Hadjadji, Zayen Azzouz Omar
Branch Mathematics and Statistics Faculty and Staff Publications
In this paper we use PCR5 to fuse the information of two sources providing subjective probabilities of an event A to occur in the following form: chance that A occurs, indeterminate chance of occurrence of A, chance that A does not occur.