Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Articles 1 - 20 of 20

Full-Text Articles in Genetics and Genomics

Genomics Of Postprandial Lipidomics In The Genetics Of Lipid-Lowering Drugs And Diet Network Study, Marguerite R. Irvin, May E. Montasser, Tobias Kind, Sili Fan, Dinesh K. Barupal, Amit Patki, Rikki M. Tanner, Nicole D. Armstrong, Kathleen A. Ryan, Steven A. Claas, Jeffrey R. O’Connell, Hemant K. Tiwari, Donna K. Arnett Nov 2021

Epidemiology and Environmental Health Faculty Publications

Postprandial lipemia (PPL) is an important risk factor for cardiovascular disease. Inter-individual variation in the dietary response to a meal is known to be influenced by genetic factors, yet genes that dictate variation in postprandial lipids are not completely characterized. Genetic studies of the plasma lipidome can help to better understand postprandial metabolism by isolating lipid molecular species which are more closely related to the genome. We measured the plasma lipidome at fasting and 6 h after a standardized high-fat meal in 668 participants from the Genetics of Lipid-Lowering Drugs and Diet Network study (GOLDN) using ultra-performance liquid chromatography coupled …


Epigenome-Wide Association Study Of Kidney Function Identifies Trans-Ethnic And Ethnic-Specific Loci, Charles E. Breeze, Anna Batorsky, Mi Kyeong Lee, Mindy D. Szeto, Xiaoguang Xu, Daniel L. McCartney, Rong Jiang, Amit Patki, Holly J. Kramer, James M. Eales, Laura Raffield, Leslie Lange, Ethan Lange, Peter Durda, Yongmei Liu, Russ P. Tracy, David Van Den Berg, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed MESA Multi-Omics Working Group, Kathryn L. Evans, William E. Kraus, Donna K. Arnett Apr 2021

Epidemiology and Environmental Health Faculty Publications

BACKGROUND: DNA methylation (DNAm) is associated with gene regulation and estimated glomerular filtration rate (eGFR), a measure of kidney function. Decreased eGFR is more common among US Hispanics and African Americans. The causes for this are poorly understood. We aimed to identify trans-ethnic and ethnic-specific differentially methylated positions (DMPs) associated with eGFR using an agnostic, genome-wide approach.

METHODS: The study included up to 5428 participants from multi-ethnic studies for discovery and 8109 participants for replication. We tested the associations between whole blood DNAm and eGFR using beta values from Illumina 450K or EPIC arrays. Ethnicity-stratified analyses were performed using linear …
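The beta values mentioned in the methods are conventionally computed per CpG site as the methylated signal divided by the total signal plus a small stabilizing offset. The following is a minimal sketch of that convention only; the function name, argument names, and the default offset of 100 are assumptions for illustration, and the study's own preprocessing is not described in this excerpt.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Return the methylation beta value for one CpG site.

    methylated / unmethylated: non-negative probe intensities (assumed inputs).
    offset: stabilizing constant; 100 is a commonly used default (assumption).
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Illustrative intensities only, not data from the study.
fully_methylated = beta_value(900.0, 0.0)
unmethylated_site = beta_value(0.0, 900.0)
```

The result lies between 0 (unmethylated) and just under 1 (fully methylated); the offset keeps the ratio stable when both intensities are low.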


An Ensemble Of The iCluster Method To Analyze Longitudinal lncRNA Expression Data For Psoriasis Patients, Suyan Tian, Chi Wang Apr 2021

Internal Medicine Faculty Publications

BACKGROUND: Psoriasis is an immune-mediated, inflammatory disorder of the skin with chronic inflammation and hyper-proliferation of the epidermis. Since psoriasis has genetic components and the diseased tissue of psoriasis is very easily accessible, it is natural to use high-throughput technologies to characterize psoriasis and thus seek targeted therapies. Transcriptional profiles change correspondingly after an intervention. Unlike cross-sectional gene expression data, longitudinal gene expression data can capture the dynamic changes and thus facilitate causal inference.

METHODS: Using the iCluster method as a building block, an ensemble method was proposed and applied to a longitudinal gene expression dataset for psoriasis, with the …


Chromosome Xq23 Is Associated With Lower Atherogenic Lipid Concentrations And Favorable Cardiometabolic Indices, Pradeep Natarajan, Akhil Pampana, Sarah E. Graham, Sanni E. Ruotsalainen, James A. Perry, Paul S. De Vries, Jai G. Broome, James P. Pirruccello, Michael C. Honigberg, Krishna Aragam, Brooke Wolford, Jennifer A. Brody, Lucinda Antonacci-Fulton, Moscati Arden, Stella Aslibekyan, Themistocles L. Assimes, Christie M. Ballantyne, Lawrence F. Bielak, Joshua C. Bis, Brian E. Cade, Donna K. Arnett Apr 2021

Epidemiology and Environmental Health Faculty Publications

Autosomal genetic analyses of blood lipids have yielded key insights for coronary heart disease (CHD). However, X chromosome genetic variation is understudied for blood lipids in large sample sizes. We now analyze genetic and blood lipid data in a high-coverage whole X chromosome sequencing study of 65,322 multi-ancestry participants and perform replication among 456,893 European participants. Common alleles on chromosome Xq23 are strongly associated with reduced total cholesterol, LDL cholesterol, and triglycerides (min P = 8.5 × 10^−72), with similar effects for males and females. Chromosome Xq23 lipid-lowering alleles are associated with reduced odds for CHD among 42,545 …


Whole-Exome Sequencing And hiPSC Cardiomyocyte Models Identify MYRIP, TRAPPC11, And SLC27A6 Of Potential Importance To Left Ventricular Hypertrophy In An African Ancestry Population, Marguerite R. Irvin, Praful Aggarwal, Steven A. Claas, Lisa De Las Fuentes, Anh N. Do, C. Charles Gu, Andrea Matter, Benjamin S. Olson, Amit Patki, Karen Schwander, Joshua D. Smith, Vinodh Srinivasasainagendra, Hemant K. Tiwari, Amy J. Turner, Deborah A. Nickerson, Dabeeru C. Rao, Ulrich Broeckel, Donna K. Arnett Feb 2021

Epidemiology and Environmental Health Faculty Publications

Background: Indices of left ventricular (LV) structure and geometry represent useful intermediate phenotypes related to LV hypertrophy (LVH), a predictor of cardiovascular (CV) disease (CVD) outcomes.

Methods and Results: We conducted an exome-wide association study of LV mass (LVM) adjusted to height^2.7, LV internal diastolic dimension (LVIDD), and relative wall thickness (RWT) among 1,364 participants of African ancestry (AAs) in the Hypertension Genetic Epidemiology Network (HyperGEN). Both single-variant and gene-based sequence kernel association tests were performed to examine whether common and rare coding variants contribute to variation in echocardiographic traits in AAs. We then used a data-driven …
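Adjusting LV mass to height raised to the power 2.7 is simple arithmetic: the mass is divided by height to that power. A minimal sketch follows, assuming LV mass in grams and height in meters; the function name, units, and example values are illustrative assumptions, not details taken from the study.

```python
def index_lvm_to_height(lvm_grams, height_m, exponent=2.7):
    """Return LV mass divided by height**exponent.

    lvm_grams: left ventricular mass in grams (assumed unit).
    height_m: body height in meters (assumed unit).
    exponent: allometric power; 2.7 as stated in the abstract.
    """
    if height_m <= 0:
        raise ValueError("height must be positive")
    return lvm_grams / height_m ** exponent


# Illustrative values only, not data from the study.
indexed = index_lvm_to_height(180.0, 1.70)
```

Indexing this way lets LV mass be compared across participants of different body size.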


Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang Apr 2019

Biostatistics Faculty Publications

To analyze gene expression data with sophisticated grouping structures and to extract hidden patterns from such data, feature selection is of critical importance. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. If the biological knowledge contained within these pathways is taken into account, the resulting method is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart in which no biological knowledge is considered. In this article, a pathway-based feature selection is firstly divided into three major categories, namely, pathway-level selection, …


Feature Selection For Longitudinal Data By Using Sign Averages To Summarize Gene Expression Values Over Time, Suyan Tian, Chi Wang Mar 2019

Biostatistics Faculty Publications

With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene’s expression values across time into a single value using the sign average method, thereby degrading a longitudinal feature selection process into a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) …


Large-Scale Genome-Wide Meta-Analysis Of Polycystic Ovary Syndrome Suggests Shared Genetic Architecture For Different Diagnosis Criteria, Felix Day, Tugce Karaderi, Michelle R. Jones, Cindy Meun, Chunyan He, Alex Drong, Peter Kraft, Nan Lin, Hongyan Huang, Linda Broer, Reedik Magi, Richa Saxena, Triin Laisk, Margrit Urbanek, M. Geoffrey Hayes, Gudmar Thorleifsson, Juan Fernandez-Tajes, Anubha Mahajan, Benjamin H. Mullin, Bronwyn G. A. Stuckey, Timothy D. Spector, Scott G. Wilson, Mark O. Goodarzi, Lea Davis, Barbara Obermayer-Pietsch, André G. Uitterlinden, Verneri Anttila, Benjamin M. Neale, Marjo-Riitta Jarvelin, Bart Fauser Dec 2018

Internal Medicine Faculty Publications

Polycystic ovary syndrome (PCOS) is a disorder characterized by hyperandrogenism, ovulatory dysfunction and polycystic ovarian morphology. Affected women frequently have metabolic disturbances including insulin resistance and dysregulation of glucose homeostasis. PCOS is diagnosed with two different sets of diagnostic criteria, resulting in a phenotypic spectrum of PCOS cases. The genetic similarities between cases diagnosed based on the two criteria have been largely unknown. Previous studies in Chinese and European subjects have identified 16 loci associated with risk of PCOS. We report a fixed-effect, inverse-weighted-variance meta-analysis from 10,074 PCOS cases and 103,164 controls of European ancestry and characterisation of PCOS related …
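The fixed-effect, inverse-variance-weighted meta-analysis named in the abstract pools per-study effect estimates by weighting each with the reciprocal of its squared standard error. The sketch below shows the standard textbook form of that calculation for a single variant; the function name and the input numbers are hypothetical and do not come from the study.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Pool per-study effects with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need equal-length, non-empty inputs")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    # Weight of each study is 1 / variance of its estimate.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical log-odds estimates for one variant in two cohorts.
beta, se, z = fixed_effect_meta([0.10, 0.20], [0.10, 0.10])
```

With equal standard errors the pooled estimate is the plain average of the study effects, and the pooled standard error shrinks as studies are added.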


A Longitudinal Feature Selection Method Identifies Relevant Genes To Distinguish Complicated Injury And Uncomplicated Injury Over Time, Suyan Tian, Chi Wang, Howard H. Chang Dec 2018

Biostatistics Faculty Publications

Background: Feature selection and gene set analysis are of increasing interest in the field of bioinformatics. While these two approaches have been developed for different purposes, we describe how some gene set analysis methods can be utilized to conduct feature selection.

Methods: We adopted a gene set analysis method, the significance analysis of microarray gene set reduction (SAMGSR) algorithm, to carry out feature selection for longitudinal gene expression data.

Results: Using a real-world application and simulated data, it is demonstrated that the proposed SAMGSR extension outperforms other relevant methods. In this study, we illustrate that a gene’s expression profiles over …


Association Analyses Of Repeated Measures On Triglyceride And High-Density Lipoprotein Levels: Insights From GAW20, Saurabh Ghosh, David W. Fardo Sep 2018

Biostatistics Faculty Publications

Background: The GAW20 group formed on the theme of methods for association analyses of repeated measures comprised 4 sets of investigators. The provided “real” data set included genotypes obtained from a human whole-genome association study based on longitudinal measurements of triglycerides (TGs) and high-density lipoprotein in addition to methylation levels before and after administration of fenofibrate. The simulated data set contained 200 replications of methylation levels and posttreatment TGs, mimicking the real data set.

Results: The different investigators in the group focused on the statistical challenges unique to family-based association analyses of phenotypes measured longitudinally and applied a wide spectrum of …


GAW20: Methods And Strategies For The New Frontiers Of Epigenetics And Pharmacogenomics, Nathan L. Tintle, David W. Fardo, Marzia De Andrade, Stella Aslibekyan, Julia N. Bailey, Justo Lorenzo Bermejo, Rita M. Cantor, Saurabh Ghosh, Philip Melton, Xuexua Wang, Jean W. MacCluer, Laura Almasy Sep 2018

Biostatistics Faculty Publications

GAW20 provided a platform for developing and evaluating statistical methods to analyze human lipid-related phenotypes, DNA methylation, and single-nucleotide markers in a study involving a pharmaceutical intervention. In this article, we present an overview of the data sets and the contributions analyzing these data. The data, donated by the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) investigators, included data from 188 families (N = 1105) which included genome-wide DNA methylation data before and after a 3-week treatment with fenofibrate, single-nucleotide polymorphisms, metabolic syndrome components before and after treatment, and a variety of covariates. The contributions from individual …


Bayesian Prediction Intervals For Assessing P-Value Variability In Prospective Replication Studies, Olga A. Vsevolozhskaya, Gabriel Ruiz, Dmitri Zaykin Dec 2017

Biostatistics Faculty Publications

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics impose constraints on the reliability of conclusions drawn from data, so misuse of statistical methods is a growing concern. Hypothesis and significance testing, and the accompanying P-values are being scrutinized as representing the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of credibility for scientific hypotheses. It has also been suggested that while P-values …


Systems Biology Approach To Late-Onset Alzheimer's Disease Genome-Wide Association Study Identifies Novel Candidate Genes Validated Using Brain Expression Data And Caenorhabditis Elegans Experiments, Shubhabrata Mukherjee, Joshua C. Russell, Daniel T. Carr, Jeremy D. Burgess, Mariet Allen, Daniel J. Serie, Kevin L. Boehme, John S. K. Kauwe, Adam C. Naj, David W. Fardo, Dennis W. Dickson, Thomas J. Montine, Nilufer Ertekin-Taner, Matt R. Kaeberlein, Paul K. Crane Oct 2017

Biostatistics Faculty Publications

Introduction—We sought to determine whether a systems biology approach may identify novel late-onset Alzheimer's disease (LOAD) loci.

Methods—We performed gene-wide association analyses and integrated results with human protein-protein interaction data using network analyses. We performed functional validation on novel genes using a transgenic Caenorhabditis elegans Aβ proteotoxicity model and evaluated novel genes using brain expression data from people with LOAD and other neurodegenerative conditions.

Results—We identified 13 novel candidate LOAD genes outside chromosome 19. Of those, RNA interference knockdowns of the C. elegans orthologs of UBC, NDUFS3, EGR1, and ATP5H were associated with Aβ …


Increased Birth Weight Is Associated With Altered Gene Expression In Neonatal Foreskin, Leryn J. Reynolds, Rebecca I. Pollack, Richard J. Charnigo, Cetewayo S. Rashid, Arnold J. Stromberg, Shu Shen, John O'Brien, Kevin J. Pearson Oct 2017

Pharmacology and Nutritional Sciences Faculty Publications

Elevated birth weight is linked to glucose intolerance and obesity health-related complications later in life. No studies have examined if infant birth weight is associated with gene expression markers of obesity and inflammation in a tissue that comes directly from the infant following birth. We evaluated the association between birth weight and gene expression on fetal programming of obesity. Foreskin samples were collected following circumcision, and gene expression analyzed comparing the 15% greatest birth weight infants (n = 7) v. the remainder of the cohort (n = 40). Multivariate linear regression models were fit to relate expression levels on differentially …


Impact Of Home Visit Capacity On Genetic Association Studies Of Late-Onset Alzheimer's Disease, David W. Fardo, Laura E. Gibbons, Shubhabrata Mukherjee, M. Maria Glymour, Wayne McCormick, Susan M. McCurry, James D. Bowen, Eric B. Larson, Paul K. Crane Aug 2017

Biostatistics Faculty Publications

INTRODUCTION—Findings for genetic correlates of late-onset Alzheimer's disease (LOAD) in studies that rely solely on clinic visits may differ from those with capacity to follow participants unable to attend clinic visits.

METHODS—We evaluated previously identified LOAD-risk single nucleotide variants in the prospective Adult Changes in Thought study, comparing hazard ratios (HRs) estimated using the full data set of both in-home and clinic visits (n = 1697) to HRs estimated using only data that were obtained from clinic visits (n = 1308). Models were adjusted for age, sex, principal components to account for ancestry, and additional health indicators.

RESULTS …


Identification Of Prognostic Genes And Gene Sets For Early-Stage Non-Small Cell Lung Cancer Using Bi-Level Selection Methods, Suyan Tian, Chi Wang, Howard H. Chang, Jianguo Sun Apr 2017

Biostatistics Faculty Publications

In contrast to feature selection and gene set analysis, bi-level selection is a process of selecting not only important gene sets but also important genes within those gene sets. Depending on the order of selections, a bi-level selection method can be classified into three categories: forward selection, which first selects relevant gene sets followed by the selection of relevant individual genes; backward selection, which takes the reversed order; and simultaneous selection, which performs the two tasks simultaneously, usually with the aid of a penalized regression model. To test the existence of subtype-specific prognostic genes for non-small cell lung cancer …


Statistical Analyses To Detect And Refine Genetic Associations With Neurodegenerative Diseases, Yuriko Katsumata Jan 2017

Theses and Dissertations--Epidemiology and Biostatistics

Dementia is a clinical state caused by neurodegeneration and characterized by a loss of function in cognitive domains and behavior. Alzheimer’s disease (AD) is the most common form of dementia. Although the amyloid β (Aβ) protein and hyperphosphorylated tau aggregates in the brain are considered to be the key pathological hallmarks of AD, the exact cause of AD is yet to be identified. In addition, clinical diagnoses of AD can be error prone. Many previous studies have compared the clinical diagnosis of AD against the gold standard of autopsy confirmation and shown substantial AD misdiagnosis. Hippocampal sclerosis of aging (HS-Aging) …


Comparing Performance Of Non-Tree-Based And Tree-Based Association Mapping Methods, Katherine L. Thompson, David W. Fardo Oct 2016

Statistics Faculty Publications

A central goal in the biomedical and biological sciences is to link variation in quantitative traits to locations along the genome (single nucleotide polymorphisms). Sequencing technology has rapidly advanced in recent decades, along with the statistical methodology to analyze genetic data. Two classes of association mapping methods exist: those that account for the evolutionary relatedness among individuals, and those that ignore the evolutionary relationships among individuals. While the former methods more fully use implicit information in the data, the latter methods are more flexible in the types of data they can handle. This study presents a comparison of the 2 …


Causal Effect Estimation In Sequencing Studies: A Bayesian Method To Account For Confounder Adjustment Uncertainty, Chi Wang, Jinpeng Liu, David W. Fardo Oct 2016

Biostatistics Faculty Publications

Estimating the causal effect of a single nucleotide variant (SNV) on clinical phenotypes is of interest in many genetic studies. The effect estimation may be confounded by other SNVs as a result of linkage disequilibrium as well as demographic and clinical characteristics. Because a large number of these other variables, which we call potential confounders, are collected, it is challenging to select and adjust for the variables that truly confound the causal effect. The Bayesian adjustment for confounding (BAC) method has been proposed as a general method to estimate the average causal effect in the presence of a large number …


Weighted-SAMGSR: Combining Significance Analysis Of Microarray-Gene Set Reduction Algorithm With Pathway Topology-Based Weights To Select Relevant Genes, Suyan Tian, Howard H. Chang, Chi Wang Sep 2016

Biostatistics Faculty Publications

Background: It has been demonstrated that a pathway-based feature selection method that incorporates biological information within pathways during the process of feature selection usually outperforms a gene-based feature selection algorithm in terms of predictive accuracy and stability. Significance analysis of microarray-gene set reduction algorithm (SAMGSR), an extension to a gene set analysis method with further reduction of the selected pathways to their respective core subsets, can be regarded as a pathway-based feature selection method.

Methods: In SAMGSR, whether a gene is selected is mainly determined by its expression difference between the phenotypes, and partially by the number of pathways to …