Articles 1 - 15 of 15
Full-Text Articles in Statistics and Probability
Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh
Harvard University Biostatistics Working Paper Series
No abstract provided.
Multiple Testing With An Empirical Alternative Hypothesis, James E. Signorovitch
Harvard University Biostatistics Working Paper Series
An optimal multiple testing procedure is identified for linear hypotheses under the general linear model, maximizing the expected number of false null hypotheses rejected at any significance level. The optimal procedure depends on the unknown data-generating distribution, but can be consistently estimated. Drawing information together across many hypotheses, the estimated optimal procedure provides an empirical alternative hypothesis by adapting to underlying patterns of departure from the null. Proposed multiple testing procedures based on the empirical alternative are evaluated through simulations and an application to gene expression microarray data. Compared to a standard multiple testing procedure, it is not unusual for …
Exploration Of Distributional Models For A Novel Intensity-Dependent Normalization, Nicola Lama, Patrizia Boracchi, Elia Mario Biganzoli
COBRA Preprint Series
Currently used gene intensity-dependent normalization methods, based on regression smoothing techniques, usually approach the two problems of location bias detrending and data re-scaling without taking into account the censoring characteristic of certain gene expressions produced by experiment measurement constraints or by previous normalization steps. Moreover, the bias vs variance balance control of normalization procedures is not often discussed but left to the user's experience. Here an approximate maximum likelihood procedure to fit a model smoothing the dependences of log-fold gene expression differences on average gene intensities is presented. Central tendency and scaling factor were modeled by means of B-splines smoothing …
Structural Inference In Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Xihong Lin, Donglin Zeng
Harvard University Biostatistics Working Paper Series
No abstract provided.
Estimation In Semiparametric Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Donglin Zeng, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
A Comparison Of Methods For Estimating The Causal Effect Of A Treatment In Randomized Clinical Trials Subject To Noncompliance, Rod Little, Qi Long, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
Bounded Search For De Novo Identification Of Degenerate Cis-Regulatory Elements, Jonathan M. Carlson, Arijit Chakravarty, Radhika S. Khetani, Robert H. Gross
Dartmouth Scholarship
The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach.
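For readers unfamiliar with the exhaustive word-counting strategy the abstract mentions, the following is a minimal sketch, not the authors' method. It counts every exact word of length k in a set of upstream sequences and ranks words by how much more frequent they are than in a background set. The function names, the pseudocount, and the ratio score are all assumptions made for illustration. Because only exact matches are counted, occurrences of a degenerate element are split across many variant words, which is the limitation the abstract points to.

```python
from collections import Counter


def count_words(sequences, k):
    """Count every exact word of length k across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented(target, background, k, pseudocount=1.0):
    """Rank words by relative frequency in target versus background.

    The pseudocount (an illustrative choice) keeps words that are absent
    from the background from producing a division by zero.
    """
    t = count_words(target, k)
    b = count_words(background, k)
    t_total = sum(t.values())
    b_total = sum(b.values())
    n_words = 4 ** k  # number of possible DNA words of length k
    scores = {}
    for word, c in t.items():
        f_target = (c + pseudocount) / (t_total + pseudocount * n_words)
        f_background = (b[word] + pseudocount) / (b_total + pseudocount * n_words)
        scores[word] = f_target / f_background
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy sequences, invented for this example only.
target = ["TTGACGTCAAT", "CCTGACGTCAG"]
background = ["ACGTACGTAC", "GGGCCCAAAT"]
ranking = overrepresented(target, background, k=8)
print(ranking[0][0])
```

On this toy input the top-ranked word is the one 8-mer shared by both target sequences.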
Genome Scanning Methods For Comparing Sequences Between Groups, With Application To HIV Vaccine Trials, Peter B. Gilbert, Chunyuan Wu, David V. Jobes
UW Biostatistics Working Paper Series
Consider a placebo-controlled preventive HIV vaccine efficacy trial. An HIV amino acid sequence is measured from each volunteer who acquires HIV, and these sequences are aligned together with the reference HIV sequence represented in the vaccine. We develop genome scanning methods to identify HIV positions at which the amino acids in sequences from infected vaccine recipients tend to be more divergent from the corresponding reference amino acid than the amino acids in sequences from infected placebo recipients. We consider five two-sample test statistics, based on Euclidean, Mahalanobis, and Kullback-Leibler divergence measures. Weights are incorporated to reflect biological information contained in …
2^K Factorials In Blocks Of Size 2, With Application To Two-Color Microarray Experiments, Kathleen F. Kerr
UW Biostatistics Working Paper Series
When a two-level design must be run in blocks of size two, there is a unique blocking scheme that enables estimation of all the main effects. Unfortunately this design does not enable estimation of any two-factor interactions. When the experimental goal is to estimate all main effects and two-factor interactions, it is necessary to combine replicates of the experiment that use different blocking schemes. In this paper we identify such designs for up to eight factors that enable estimation of all main effects and two-factor interactions with the fewest number of replications. In addition, we give a construction for general …
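The truncated abstract does not describe the blocking scheme itself. One scheme with the stated properties pairs each run with its foldover, the run with every factor level flipped. The sketch below assumes the usual ±1 coding and uses illustrative function names. It checks that every main-effect contrast changes sign within each block, so it is estimable free of block effects, while every two-factor interaction contrast is constant within each block, so it is confounded with blocks.

```python
from itertools import combinations, product


def foldover_blocks(k):
    """Split the 2^k runs into blocks of size 2: a run and its foldover."""
    blocks = []
    seen = set()
    for run in product((-1, 1), repeat=k):
        if run in seen:
            continue
        mirror = tuple(-x for x in run)  # flip the level of every factor
        seen.update((run, mirror))
        blocks.append((run, mirror))
    return blocks


def contrast(run, factors):
    """Contrast sign (+1 or -1) of the effect on the given factor indices."""
    sign = 1
    for i in factors:
        sign *= run[i]
    return sign


def estimable_within_blocks(blocks, factors):
    """True if the contrast differs between the two runs of every block."""
    return all(contrast(a, factors) != contrast(b, factors) for a, b in blocks)


def confounded_with_blocks(blocks, factors):
    """True if the contrast is identical for the two runs of every block."""
    return all(contrast(a, factors) == contrast(b, factors) for a, b in blocks)


k = 3
blocks = foldover_blocks(k)
main_effects_ok = all(estimable_within_blocks(blocks, (i,)) for i in range(k))
two_factor_confounded = all(
    confounded_with_blocks(blocks, pair) for pair in combinations(range(k), 2)
)
print(len(blocks), main_effects_ok, two_factor_confounded)
```

Flipping every factor multiplies a contrast on an odd number of factors by -1 and leaves a contrast on an even number of factors unchanged, which is why the two checks come out as they do for any k.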
Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …
GPNN: Power Studies And Applications Of A Neural Network Method For Detecting Gene-Gene Interactions In Studies Of Human Disease, Alison A. Motsinger, Stephen L. Lee, George Mellick, Marylyn D. Ritchie
Dartmouth Scholarship
The identification and characterization of genes that influence the risk of common, complex multifactorial disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions. We were also interested in applying GPNN to a real data analysis in Parkinson's disease.
Impacts Of A Manure Composting Program On Stream Water Quality, A. Bekele, A. M. S. McFarland, A. J. Whisenant
Faculty Publications
In February 2001, the Texas Commission on Environmental Quality (TCEQ) adopted a Total Maximum Daily Load (TMDL) for soluble reactive phosphorus (SRP) along the North Bosque River. Within this TMDL, dairy waste application fields were identified as the major nonpoint-source contribution of nutrients. In September 2000, a manure composting program was initiated that resulted in about 500,000 metric tons of dairy manure being hauled to composting facilities and exported from the watershed through December 2004. To evaluate the impact of the manure composting program on stream water quality, storm event mean concentrations of nutrients and total suspended solids were compared …
Analyzing DNA Microarrays With Undergraduate Statisticians, Johanna S. Hardin, Laura Hoopes, Ryan Murphy '06
Pomona Faculty Publications and Research
With advances in technology, biologists have been saddled with high dimensional data that need modern statistical methodology for analysis. DNA microarrays are able to simultaneously measure thousands of genes (and the activity of those genes) in a single sample. Biologists use microarrays to trace connections between pathways or to identify all genes that respond to a signal. The statistical tools we usually teach our undergraduates are inadequate for analyzing thousands of measurements on tens of samples. The project materials include readings on microarrays as well as computer lab activities. The topics covered include image analysis, filtering and normalization techniques, and …