Open Access. Powered by Scholars. Published by Universities.®
- Institution
- Keyword
-
- Genetics (8)
- Genomics (7)
- Statistical Models (7)
- Functional Data Analysis (5)
- Proteomics (4)
-
- Statistical Theory and Methods (4)
- Gene expression (3)
- Biomarkers (2)
- Cancer (2)
- Classification (2)
- Clustering (2)
- Computational Biology/Bioinformatics (2)
- False discovery rate (2)
- Functional data analysis (2)
- Image Analysis (2)
- Mass spectrometry (2)
- Microarray (2)
- Microarrays (2)
- Spot detection (2)
- validation. (1)
- 2-D gel electrophoresis (1)
- 2D Gel Electrophoresis (1)
- 2D gel electrophoresis (1)
- Air pollution; Functional data analysis; Markov chain Monte Carlo; Mixture prior; Panel study; Particulate matter; Wavelets. (1)
- Amino acid (1)
- Analysis of Designed Experiments (1)
- BLUPs; Kernel function; Model/variable selection; Nonparametric regression; Penalized likelihood; REML; Score test; Smoothing parameter; Support vector machines (1)
- Bayesian Modeling (1)
- Bayesian inference (1)
- Bayesian methods (1)
- Publication Year
- Publication
-
- Jeffrey S. Morris (11)
- Harvard University Biostatistics Working Paper Series (10)
- COBRA Preprint Series (6)
- UW Biostatistics Working Paper Series (5)
- Mark R Segal (3)
-
- U.C. Berkeley Division of Biostatistics Working Paper Series (3)
- Electronic Theses and Dissertations (2)
- Dan Nettleton (1)
- Dissertations & Theses (Open Access) (1)
- Dissertations, Master's Theses and Master's Reports (1)
- Doctoral Dissertations (1)
- Electronic Thesis and Dissertation Repository (1)
- Johns Hopkins University, Dept. of Biostatistics Working Papers (1)
- Mathematics and Statistics Department Faculty Publication Series (1)
- School of Computing: Faculty Publications (1)
- Theses and Dissertations--Statistics (1)
- Yale Day of Data (1)
- Publication Type
- File Type
Articles 1 - 30 of 50
Full-Text Articles in Computational Biology
Statistical Methods In Genetic Studies, Cheng Gao
Statistical Methods In Genetic Studies, Cheng Gao
Dissertations, Master's Theses and Master's Reports
This dissertation includes three Chapters. A brief description of each chapter is organized as follows.
In Chapter 1, we proposed a new method, called MF-TOWmuT, for genome-wide association studies with multiple genetic variants and multiple phenotypes using family samples. MF-TOWmuT uses kinship matrix to account for sample relatedness. It is worth mentioning that in simulations, we considered hidden polygenic effects and varied the proportion of variance contributed by it to generate phenotypes. Simulation studies show that MF-TOWmuT can preserve the type I error rates and is more powerful than several existing methods in different simulation scenarios, MFTOWmuT is also quite …
Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das
Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das
Electronic Theses and Dissertations
Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …
Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell Dna Sequencing, Alexander Davis
Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell Dna Sequencing, Alexander Davis
Dissertations & Theses (Open Access)
Tumor cells have heterogeneous genotypes, which drives progression and treatment resistance. Such genetic intratumor heterogeneity plays a role in the process of clonal evolution that underlies tumor progression and treatment resistance. Single-cell DNA sequencing is a promising experimental method for studying intratumor heterogeneity, but brings unique statistical challenges in interpreting the resulting data. Researchers lack methods to determine whether sufficiently many cells have been sampled from a tumor. In addition, there are no proven computational methods for determining the ploidy of a cell, a necessary step in the determination of copy number. In this work, software for calculating probabilities from …
Estimation And Testing Of Gene Expression Heterosis, Tieming Ji, Peng Liu, Dan Nettleton
Estimation And Testing Of Gene Expression Heterosis, Tieming Ji, Peng Liu, Dan Nettleton
Dan Nettleton
Heterosis, also known as the hybrid vigor, occurs when the mean phenotype of hybrid offspring is superior to that of its two inbred parents. The heterosis phenomenon is extensively utilized in agriculture though the molecular basis is still unknown. In an effort to understand phenotypic heterosis at the molecular level, researchers have begun to compare expression levels of thousands of genes between parental inbred lines and their hybrid offspring to search for evidence of gene expression heterosis. Standard statistical approaches for separately analyzing expression data for each gene can produce biased and highly variable estimates and unreliable tests of heterosis. …
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
COBRA Preprint Series
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes which provide insight into the disease's process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
A Novel Pathway-Based Distance Score Enhances Assessment Of Disease Heterogeneity In Gene Expression, Yunqing Liu, Xiting Yan
A Novel Pathway-Based Distance Score Enhances Assessment Of Disease Heterogeneity In Gene Expression, Yunqing Liu, Xiting Yan
Yale Day of Data
Distance-based unsupervised clustering of gene expression data is commonly used to identify heterogeneity in biologic samples. However, high noise levels in gene expression data and the relatively high correlation between genes are often encountered, so traditional distances such as Euclidean distance may not be effective at discriminating the biological differences between samples. In this study, we developed a novel computational method to assess the biological differences based on pathways by assuming that ontologically defined biological pathways in biologically similar samples have similar behavior. Application of this distance score results in more accurate, robust, and biologically meaningful clustering results in both …
Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor
Electronic Theses and Dissertations
Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve a basis for making higher order inferences in metabolomics, …
Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu
Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu
Electronic Thesis and Dissertation Repository
ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms were developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad that can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data only analyzed a small number of top peaks with the highest signal strengths, biasing their resultant position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, disabling the accurate modelling of binding behavior of dimeric TFs.
This thesis presents a novel …
Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, Jyotsna Ramanan, Peter Revesz
Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, Jyotsna Ramanan, Peter Revesz
School of Computing: Faculty Publications
Evolutionary studies usually assume that the genetic mutations are independent of each other. However, that does not imply that the observed mutations are independent of each other because it is possible that when a nucleotide is mutated, then it may be biologically beneficial if an adjacent nucleotide mutates too. With a number of decoded genes currently available in various genome libraries and online databases, it is now possible to have a large-scale computer-based study to test whether the independence assumption holds for pairs of adjacent amino acids. Hence the independence question also arises for pairs of adjacent amino acids within …
Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu
Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu
Doctoral Dissertations
In this thesis, we propose statistical models for addressing commonly encountered data types and study designs in large epidemiologic investigations aimed at understanding the molecular basis of complex disorders. The motivating applications come from diverse disease areas in Women's Health, including the study of type II diabetes in the Women's Health Initiative (WHI), invasive breast cancer in the Nurses' Health Study and the study of the metabolomic underpinnings of cardiovascular disease in the WHI. We have also put significant effort into making the implementation of the proposed methods accessible through freely available, user-friendly software packages in R. The first chapter …
Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret
Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret
UW Biostatistics Working Paper Series
We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …
Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris
Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris
Jeffrey S. Morris
We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …
Statistics In The Billera-Holmes-Vogtmann Treespace, Grady S. Weyenberg
Statistics In The Billera-Holmes-Vogtmann Treespace, Grady S. Weyenberg
Theses and Dissertations--Statistics
This dissertation is an effort to adapt two classical non-parametric statistical techniques, kernel density estimation (KDE) and principal components analysis (PCA), to the Billera-Holmes-Vogtmann (BHV) metric space for phylogenetic trees. This adaption gives a more general framework for developing and testing various hypotheses about apparent differences or similarities between sets of phylogenetic trees than currently exists.
For example, while the majority of gene histories found in a clade of organisms are expected to be generated by a common evolutionary process, numerous other coexisting processes (e.g. horizontal gene transfers, gene duplication and subsequent neofunctionalization) will cause some genes to exhibit a …
Ordinal Probit Wavelet-Based Functional Models For Eqtl Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull
Ordinal Probit Wavelet-Based Functional Models For Eqtl Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull
Jeffrey S. Morris
Current methods for conducting expression Quantitative Trait Loci (eQTL) analysis are limited in scope to a pairwise association testing between a single nucleotide polymorphism (SNPs) and expression probe set in a region around a gene of interest, thus ignoring the inherent between-SNP correlation. To determine association, p-values are then typically adjusted using Plug-in False Discovery Rate. As many SNPs are interrogated in the region and multiple probe-sets taken, the current approach requires the fitting of a large number of models. We propose to remedy this by introducing a flexible function-on-scalar regression that models the genome as a functional outcome. The …
Robust Optimization Of Biological Protocols, Patrick Flaherty, Ronald W. Davis
Robust Optimization Of Biological Protocols, Patrick Flaherty, Ronald W. Davis
Mathematics and Statistics Department Faculty Publication Series
When conducting high-throughput biological experiments, it is often necessary to develop a protocol that is both inexpensive and robust. Standard approaches are either not cost-effective or arrive at an optimized protocol that is sensitive to experimental variations. Here, we describe a novel approach that directly minimizes the cost of the protocol while ensuring the protocol is robust to experimental variation. Our approach uses a risk-averse conditional value-at-risk criterion in a robust parameter design framework. We demonstrate this approach on a polymerase chain reaction protocol and show that our improved protocol is less expensive than the standard protocol and more robust …
Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi
Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi
Jeffrey S. Morris
Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.
Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.
Design: A single-arm, phase II trial.
Patients: Twenty-seven patients with FAP.
Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.
Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …
Bayesian Methods For Expression-Based Integration, Elizabeth M. Jennings, Jeffrey S. Morris, Raymond J. Carroll, Ganiraju C. Manyam, Veera Baladandayuthapani
Bayesian Methods For Expression-Based Integration, Elizabeth M. Jennings, Jeffrey S. Morris, Raymond J. Carroll, Ganiraju C. Manyam, Veera Baladandayuthapani
Jeffrey S. Morris
We propose methods to integrate data across several genomic platforms using a hierarchical Bayesian analysis framework that incorporates the biological relationships among the platforms to identify genes whose expression is related to clinical outcomes in cancer. This integrated approach combines information across all platforms, leading to increased statistical power in finding these predictive genes, and further provides mechanistic information about the manner in which the gene affects the outcome. We demonstrate the advantages of the shrinkage estimation used by this approach through a simulation, and finally, we apply our method to a Glioblastoma Multiforme dataset and identify several genes potentially …
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Jeffrey S. Morris
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Jeffrey S. Morris
Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches that treat the data are limited in that they do not consider the fundamental biological relationships that exist among the data from platforms.
Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …
Multiple Testing Of Local Maxima For Detection Of Peaks In Chip-Seq Data, Armin Schwartzman, Andrew Jaffe, Yulia Gavrilov, Clifford A. Meyer
Multiple Testing Of Local Maxima For Detection Of Peaks In Chip-Seq Data, Armin Schwartzman, Andrew Jaffe, Yulia Gavrilov, Clifford A. Meyer
Harvard University Biostatistics Working Paper Series
No abstract provided.
A Unified Approach To Non-Negative Matrix Factorization And Probabilistic Latent Semantic Indexing, Karthik Devarajan, Guoli Wang, Nader Ebrahimi
A Unified Approach To Non-Negative Matrix Factorization And Probabilistic Latent Semantic Indexing, Karthik Devarajan, Guoli Wang, Nader Ebrahimi
COBRA Preprint Series
Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two matrices, W and H, each with nonnegative entries, V ~ WH. NMF has been shown to have a unique parts-based, sparse representation of the data. The nonnegativity constraints in NMF allow only additive combinations of the data which enables it to learn parts that have distinct physical representations in reality. In the last few years, NMF has been successfully applied in a variety of areas such as natural language processing, information retrieval, image processing, speech recognition …
Component Extraction Of Complex Biomedical Signal And Performance Analysis Based On Different Algorithm, Hemant Pasusangai Kasturiwale
Component Extraction Of Complex Biomedical Signal And Performance Analysis Based On Different Algorithm, Hemant Pasusangai Kasturiwale
Johns Hopkins University, Dept. of Biostatistics Working Papers
Biomedical signals can arise from one or many sources including heart ,brains and endocrine systems. Multiple sources poses challenge to researchers which may have contaminated with artifacts and noise. The Biomedical time series signal are like electroencephalogram(EEG),electrocardiogram(ECG),etc The morphology of the cardiac signal is very important in most of diagnostics based on the ECG. The diagnosis of patient is based on visual observation of recorded ECG,EEG,etc, may not be accurate. To achieve better understanding , PCA (Principal Component Analysis) and ICA algorithms helps in analyzing ECG signals . The immense scope in the field of biomedical-signal processing Independent Component Analysis( …
Clustering With Exclusion Zones: Genomic Applications, Mark Segal, Yuanyuan Xiao, Fred Huffer
Clustering With Exclusion Zones: Genomic Applications, Mark Segal, Yuanyuan Xiao, Fred Huffer
Mark R Segal
Methods for formally evaluating the clustering of events in space or time, notably the scan statistic, have been richly developed and widely applied. In order to utilize the scan statistic and related approaches, it is necessary to know the extent of the spatial or temporal domains wherein the events arise. Implicit in their usage is that these domains have no “holes”—hereafter “exclusion zones”—regions in which events a priori cannot occur. However, in many contexts, this requirement is not met. When the exclusion zones are known, it is straightforward to correct the scan statistic for their occurrence by simply adjusting the …
Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel
Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel
COBRA Preprint Series
In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai
A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai
Harvard University Biostatistics Working Paper Series
No abstract provided.
The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel
The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel
COBRA Preprint Series
A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors one composite hypothesis over another is the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function over the parameter of interest. Since the weight of evidence is generally only known up to a nuisance parameter, it is approximated by replacing the likelihood function with …
Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull
Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull
Jeffrey S. Morris
Frequently, exposure data are measured over time on a grid of discrete values that collectively define a functional observation. In many applications, researchers are interested in using these measurements as covariates to predict a scalar response in a regression setting, with interest focusing on the most biologically relevant time window of exposure. One example is in panel studies of the health effects of particulate matter (PM), where particle levels are measured over time. In such studies, there are many more values of the functional data than observations in the data set so that regularization of the corresponding functional regression coefficient …
Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris
Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris
Jeffrey S. Morris
A recent article published in The Annals of Applied Statistics (AOAS) by two MD Anderson researchers—Keith Baggerly and Kevin Coombes—dissects results from a highly-influential series of medical papers involving genomics-driven personalized cancer therapy, and outlines a series of simple yet fatal flaws that raises serious questions about the veracity of the original results. Having immediate and strong impact, this paper, along with related work, is providing the impetus for new standards of reproducibility in scientific research.
Statistical Contributions To Proteomic Research, Jeffrey S. Morris, Keith A. Baggerly, Howard B. Gutstein, Kevin R. Coombes
Statistical Contributions To Proteomic Research, Jeffrey S. Morris, Keith A. Baggerly, Howard B. Gutstein, Kevin R. Coombes
Jeffrey S. Morris
Proteomic profiling has the potential to impact the diagnosis, prognosis, and treatment of various diseases. A number of different proteomic technologies are available that allow us to look at many proteins at once, and all of them yield complex data that raise significant quantitative challenges. Inadequate attention to these quantitative issues can prevent these studies from achieving their desired goals, and can even lead to invalid results. In this chapter, we describe various ways the involvement of statisticians or other quantitative scientists in the study team can contribute to the success of proteomic research, and we outline some of the …
Informatics And Statistics For Analyzing 2-D Gel Electrophoresis Images, Andrew W. Dowsey, Jeffrey S. Morris, Howard G. Gutstein, Guang Z. Yang
Informatics And Statistics For Analyzing 2-D Gel Electrophoresis Images, Andrew W. Dowsey, Jeffrey S. Morris, Howard G. Gutstein, Guang Z. Yang
Jeffrey S. Morris
Whilst recent progress in ‘shotgun’ peptide separation by integrated liquid chromatography and mass spectrometry (LC/MS) has enabled its use as a sensitive analytical technique, proteome coverage and reproducibility is still limited and obtaining enough replicate runs for biomarker discovery is a challenge. For these reasons, recent research demonstrates the continuing need for protein separation by two-dimensional gel electrophoresis (2-DE). However, with traditional 2-DE informatics, the digitized images are reduced to symbolic data though spot detection and quantification before proteins are compared for differential expression by spot matching. Recently, a more robust and automated paradigm has emerged where gels are directly …