Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Gene expression

Articles 1 - 27 of 27

Full-Text Articles in Computational Biology

Modeling Nonsegmented Negative-Strand RNA Virus (NNSV) Transcription With Ejective Polymerase Collisions And Biased Diffusion, Felipe-Andres Piedra Sep 2023

Research Symposium

Background: The textbook model of NNSV transcription predicts a gene expression gradient. However, multiple studies show non-gradient gene expression patterns or data inconsistent with a simple gradient. Regarding the latter, several studies show a dramatic decrease in gene expression over the last two genes of the respiratory syncytial virus (RSV) genome (a highly studied NNSV). The textbook model cannot explain these phenomena.

Methods: Computational models of RSV and vesicular stomatitis virus (VSV – another highly studied NNSV) transcription were written in the Python programming language using the Scientific Python Development Environment. The model code is freely available on GitHub: …


Adjusting For Gene-Specific Covariates To Improve RNA-Seq Analysis, Hyeongseon Jeon, Kyu-Sang Lim, Yet Nguyen, Dan Nettleton Jan 2023

Mathematics & Statistics Faculty Publications

Summary

This paper suggests a novel positive false discovery rate (pFDR) controlling method for testing gene-specific hypotheses using a gene-specific covariate variable, such as gene length. We suppose the null probability depends on the covariate variable. In this context, we propose a rejection rule that accounts for heterogeneity among tests by employing two distinct types of null probabilities. We establish a pFDR estimator for a given rejection rule by following Storey's q-value framework. A condition on a type 1 error posterior probability is provided that equivalently characterizes our rejection rule. We also present a suitable procedure for selecting a tuning …


Improved Radiation Expression Profiling In Blood By Sequential Application Of Sensitive And Specific Gene Signatures, Eliseos J. Mucaki, Ben C. Shirley, Peter K. Rogan Oct 2021

Biochemistry Publications

Purpose. Combinations of expressed genes can discriminate radiation-exposed from normal control blood samples by machine learning based signatures (with 8 to 20% misclassification rates). These signatures can quantify therapeutically-relevant as well as accidental radiation exposures. The prodromal symptoms of Acute Radiation Syndrome (ARS) overlap those present in Influenza and Dengue Fever infections. Surprisingly, these human radiation signatures misclassified gene expression profiles of virally infected samples as false positive exposures. The present study investigates these and other confounders, and then mitigates their impact on signature accuracy.

Methods. This study investigated recall by previous and novel radiation signatures independently derived …


Characterization Of The Growth Factor Receptor Network Oncogenes In Lung Cancer, Ashley Duche Aug 2021

Pharmaceutical Sciences (MS) Theses

Lung cancer remains the leading cause of cancer-related deaths worldwide, reportedly contributing to 1.8 million of the 10.0 million mortalities documented in the year 2020. Although advancements have been made in therapeutics and diagnostic methods, formulation of effective treatments and development of drug resistance continue to be challenges. These challenges arise from our lack of understanding of intricate signaling pathways, such as the Growth Factor Receptor Network (GFRN), which contributes to complex lung tumor heterogeneity allowing for drug resistance development. In this study, gene expression signatures of six GFRN oncogenes overexpressed in human mammary epithelial cells (HMECs) were …


Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das Dec 2020

Electronic Theses and Dissertations

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …


Estimation And Testing Of Gene Expression Heterosis, Tieming Ji, Peng Liu, Dan Nettleton Jun 2019

Dan Nettleton

Heterosis, also known as hybrid vigor, occurs when the mean phenotype of hybrid offspring is superior to that of its two inbred parents. The heterosis phenomenon is extensively utilized in agriculture, though its molecular basis is still unknown. In an effort to understand phenotypic heterosis at the molecular level, researchers have begun to compare expression levels of thousands of genes between parental inbred lines and their hybrid offspring to search for evidence of gene expression heterosis. Standard statistical approaches for separately analyzing expression data for each gene can produce biased and highly variable estimates and unreliable tests of heterosis. …


Feature Selection For Longitudinal Data By Using Sign Averages To Summarize Gene Expression Values Over Time, Suyan Tian, Chi Wang Mar 2019

Biostatistics Faculty Publications

With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene’s expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection process to a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) …


Region Based Gene Expression Via Reanalysis Of Publicly Available Microarray Data Sets., Ernur Saka May 2018

Electronic Theses and Dissertations

A DNA microarray is a high-throughput technology used to identify relative gene expression. One of the most widely used platforms is the Affymetrix® GeneChip® technology which detects gene expression levels based on probe sets composed of a set of twenty-five nucleotide probes designed to hybridize with specific gene targets. Given a particular Affymetrix® GeneChip® platform, the design of the probes is fixed. However, the method of analysis is dynamic in nature due to the ability to annotate and group probes into uniquely defined groupings. This is particularly important since publicly available repositories of microarray datasets, such as ArrayExpress and NCBI’s …


Chromatin Accessibility Dynamics In The Arabidopsis Root Epidermis And Endodermis During Cold Acclimation, Shawn Hoogstra Nov 2017

Electronic Thesis and Dissertation Repository

Understanding cell-type specific transcriptional responses to environmental conditions is limited by a lack of knowledge of transcriptional control due to epigenetic dynamics. Additionally, cell-type analyses are limited by difficulties in applying current technologies to single cell-types. A novel DNase-seq protocol and analysis procedure, termed DNase-DTS, was developed to identify DHSs in the Arabidopsis epidermis and endodermis under control and cold acclimation conditions. Results identified thousands of DHSs within each cell-type and experimental condition. DHSs showed strong association with gene expression, DNA methylation, and histone modifications. A priori mapping of existing DNA binding motifs within accessible genes and the cold C-repeat/dehydration …


Statistical Methods For Two Problems In Cancer Research: Analysis Of RNA-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li May 2017

Dissertations & Theses (Open Access)

My dissertation is focused on quantitative methodology development and application for two important topics in translational and clinical cancer research.

The first topic was motivated by the challenge of applying transcriptome sequencing (RNA-seq) to formalin-fixed, paraffin-embedded (FFPE) tumor samples for reliable diagnostic development. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. To comprehensively evaluate the FFPE RNA-seq data quality for expression profiling, we developed multiple computational methods for assessment, such as the uniformity and continuity …


A Gene-Based Association Method For Mapping Traits Using Reference Transcriptome Data, Eric R. Gamazon, Heather Wheeler, Kaanan P. Shah, Sahar V. Mozaffari, Keston Aquino-Michaels, Robert J. Carroll, Anne E. Eyler, Joshua C. Denny, Dan L. Nicolae, Nancy J. Cox, Hae Kyung Im Aug 2016

Heather Wheeler

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates ‘imputed’ gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys …


Leveraging Global Gene Expression Patterns To Predict Expression Of Unmeasured Genes, James Rudd, René A. Zelaya, Eugene Demidenko, Ellen L. Goode, Casey S. Greene, Jennifer A. Doherty Dec 2015

Dartmouth Scholarship

Background: Large collections of paraffin-embedded tissue represent a rich resource to test hypotheses based on gene expression patterns; however, measurement of genome-wide expression is cost-prohibitive on a large scale. Using the known expression correlation structure within a given disease type (in this case, high grade serous ovarian cancer; HGSC), we sought to identify reduced sets of directly measured (DM) genes which could accurately predict the expression of a maximized number of unmeasured genes.


Genome-Wide Detection And Analysis Of Multifunctional Genes, Yuri Pritykin, Dario Ghersi, Mona Singh Oct 2015

Interdisciplinary Informatics Faculty Publications

Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, …


A Gene-Based Association Method For Mapping Traits Using Reference Transcriptome Data, Eric R. Gamazon, Heather Wheeler, Kaanan P. Shah, Sahar V. Mozaffari, Keston Aquino-Michaels, Robert J. Carroll, Anne E. Eyler, Joshua C. Denny, GTEx Consortium, Dan L. Nicolae, Nancy J. Cox, Hae Kyung Im Sep 2015

Bioinformatics Faculty Publications

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates ‘imputed’ gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys …


Computational Analysis Of Gene Expression And Connectivity Patterns In The Convoluted Structures Of Mouse Cerebellum, Tao Zeng Jun 2014

Computer Science Theses & Dissertations

One significant difference between the brains of evolved mammals and those of other species is that mammalian brains exhibit increasingly convoluted structures in the cerebral cortex. Ridge- and groove-shaped structures, named gyri and sulci respectively, expand the surface area of the cerebral cortex, making more functions possible. Prior studies using neuroimaging techniques such as dMRI and DTI have revealed that neural fibers are more heavily connected to gyri than to sulci. Such macro-scale experiments indicate that gyri are involved in large-scale information processing while sulci process information locally. However, molecular- and cellular-level evidence, namely, gene expression pattern and its resulting neuronal connectivity …


Structural Features Of The Pseudomonas Fluorescens Biofilm Adhesin LapA Required For LapG-Dependent Cleavage, Biofilm Formation, And Cell Surface Localization, Chelsea D. Boyd, T. Jarrod Smith, Sofiane El-Kirat-Chatel, Peter D. Newell, Yves F. Dufrêne, George A. O'Toole May 2014

Dartmouth Scholarship

The localization of the LapA protein to the cell surface is a key step required by Pseudomonas fluorescens Pf0-1 to irreversibly attach to a surface and form a biofilm. LapA is a member of a diverse family of predicted bacterial adhesins, and although lacking a high degree of sequence similarity, family members do share common predicted domains. Here, using mutational analysis, we determine the significance of each domain feature of LapA in relation to its export and localization to the cell surface and function in biofilm formation. Our previous work showed that the N terminus of LapA is required for …


p53 Maintains Hepatic Cell Identity During Liver Regeneration, Zeynep Hande Coban Akdemir May 2014

Dissertations & Theses (Open Access)

Zeynep Hande Coban Akdemir, B.S.,M.A.

Advisory Professor: Michelle Craig Barton, Ph.D.

p53 is a tumor suppressor that has been well studied in tumor-derived, cultured cells. However, its functions in normal proliferating cells and tissues are generally overlooked. We propose that p53 functions during the G1-S transition can be studied in normal, differentiated cells during surgery-induced liver regeneration. Two-thirds partial hepatectomy (PH) of mouse liver offers a unique model to compare p53 functions in regenerating versus sham (control) cells. My hypothesis is that intersection of global expression analyses (microarray and RNA sequencing) and …


Transcription Factor Binding Profiles Reveal Cyclic Expression Of Human Protein-Coding Genes And Non-Coding RNAs, Chao Cheng, Matthew Ung, Gavin D. Grant, Michael L. Whitfield Jul 2013

Dartmouth Scholarship

The cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite their wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle from non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and …


Desert Hedgehog Is A Mammal-Specific Gene Expressed During Testicular And Ovarian Development In A Marsupial, William A. O'Hara Jan 2012

Master's Theses

Desert hedgehog (DHH) belongs to the hedgehog gene family, whose members act as secreted intercellular signal transducers. DHH is an essential morphogen for normal testicular development and function in both mice and humans but is not present in the avian lineage. Like other hedgehog proteins, DHH signals through the patched (PTCH) receptors 1 and 2. Here we examine the expression and protein distribution of DHH, PTCH1 and PTCH2 in the developing testes of a marsupial mammal (the tammar wallaby) to determine whether DHH signalling is a conserved factor in gonadal development in all therian mammals.


Linear Methods For Analysis And Quality Control Of Relative Expression Ratios From Quantitative Real-Time Polymerase Chain Reaction Experiments, Robert B. Page, Arnold J. Stromberg Jan 2011

Biology Faculty Publications

Relative expression quantitative real-time polymerase chain reaction (RT-qPCR) experiments are a common means of estimating transcript abundances across biological groups and experimental treatments. One of the most frequently used expression measures that results from such experiments is the relative expression ratio (RE), which describes expression in experimental samples (i.e., RNA isolated from organisms, tissues, and/or cells that were exposed to one or more experimental or nonbaseline condition) in terms of fold change relative to calibrator samples (i.e., RNA isolated from organisms, tissues, and/or cells that were exposed to a control or baseline condition). Over the past decade, several …


Appearance Based Stage Recognition Of Drosophila Embryos, Gopi Chand Nutakki Dec 2010

Masters Theses & Specialist Projects

Stages in Drosophila development denote the time after fertilization at which certain specific events occur in the developmental cycle. Stage information of a host embryo, as well as spatial information of a gene expression region, is indispensable input for the discovery of the pattern of gene-gene interaction. Manual labeling of stages is becoming a bottleneck given the volume of high-throughput embryo images. Automatic recognition based on the appearance of embryos is therefore becoming a more desirable scheme. This problem, however, is very challenging due to severe variations in illumination and gene expression. In this research thesis, we propose an appearance …


An Integrative -Omics Approach To Identify Functional Sub-Networks In Human Colorectal Cancer, Rod K. Nibbe, Mehmet Koyutürk, Mark R. Chance Jan 2010

Faculty Scholarship

Emerging evidence indicates that gene products implicated in human cancers often cluster together in "hot spots" in protein-protein interaction (PPI) networks. Additionally, small sub-networks within PPI networks that demonstrate synergistic differential expression with respect to tumorigenic phenotypes were recently shown to be more accurate classifiers of disease progression when compared to single targets identified by traditional approaches. However, many of these studies rely exclusively on mRNA expression data, a useful but limited measure of cellular activity. Proteomic profiling experiments provide information at the post-translational level, yet they generally screen only a limited fraction of the proteome. Here, we demonstrate that …


Selecting 'Significant' Differentially Expressed Genes From The Combined Perspective Of The Null And The Alternative, Beatrijs Moerkerke, Els Goetghebeur Apr 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Cluster Analysis Of Genomic Data With Applications In R, Katherine S. Pollard, Mark J. van der Laan Jan 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in R. We discuss statistical issues and methods in choosing the number of clusters, the choice of clustering algorithm, and the choice of dissimilarity matrix. In particular, we illustrate how the bootstrap can be employed as a statistical method in cluster analysis to establish the reproducibility of the clusters and the overall variability of the followed procedure. We also show how to visualize a clustering result by plotting ordered dissimilarity matrices in R. We present a new R package, hopach, which implements the hybrid clustering method, …


Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman May 2004

Bioconductor Project Working Papers

Advances in computational biology have made simultaneous monitoring of thousands of features possible. High-throughput technologies not only bring about a much richer information context in which to study various aspects of gene function, but they also present the challenge of analyzing data with a large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in …


Mixture Models For Assessing Differential Expression In Complex Tissues Using Microarray Data, Debashis Ghosh Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

The use of DNA microarrays has become quite popular in many scientific and medical disciplines, such as in cancer research. One common goal of these studies is to determine which genes are differentially expressed between cancer and healthy tissue, or more generally, between two experimental conditions. A major complication in the molecular profiling of tumors using gene expression data is that the data represent a combination of tumor and normal cells. Much of the methodology developed for assessing differential expression with microarray data has assumed that tissue samples are homogeneous. In this article, we outline a general framework for determining …


Statistical Inference For Simultaneous Clustering Of Gene Expression Data, Katherine S. Pollard, Mark J. van der Laan Jul 2001

U.C. Berkeley Division of Biostatistics Working Paper Series

Current methods for analysis of gene expression data are mostly based on clustering and classification of either genes or samples. We offer support for the idea that more complex patterns can be identified in the data if genes and samples are considered simultaneously. We formalize the approach and propose a statistical framework for two-way clustering. A simultaneous clustering parameter is defined as a function of the true data generating distribution, and an estimate is obtained by applying this function to the empirical distribution. We illustrate that a wide range of clustering procedures, including generalized hierarchical methods, can be defined as …