Cross-Platform Normalization of Microarray and RNA-seq Data for Machine Learning Applications, Jeffrey A. Thompson, Jie Tan, Casey S. Greene
Dartmouth Scholarship
Large, publicly available gene expression datasets are often analyzed with the aid of machine learning algorithms. Although RNA-seq is increasingly the technology of choice, a wealth of expression data already exist in the form of microarray data. If machine learning models built from legacy data can be applied to RNA-seq data, larger, more diverse training datasets can be created and validation can be performed on newly generated data. We developed Training Distribution Matching (TDM), which transforms RNA-seq data for use with models constructed from legacy platforms. We evaluated TDM, as well as quantile normalization, nonparanormal transformation, and a simple log …
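To illustrate the general idea of reshaping RNA-seq values to resemble the distribution of the training data, the following minimal sketch shows quantile normalization against a reference distribution, one of the comparison methods named in the abstract. It is not the authors' TDM implementation. The function name `quantile_normalize_to_reference` and the toy values are hypothetical, and the sketch assumes expression values are supplied as plain numeric arrays.

```python
import numpy as np


def quantile_normalize_to_reference(target, reference):
    """Map each value in `target` onto the empirical distribution of `reference`.

    Hypothetical helper for illustration only: `target` stands in for RNA-seq
    values and `reference` for the legacy (e.g. microarray) training values.
    """
    target = np.asarray(target, dtype=float)
    reference = np.sort(np.asarray(reference, dtype=float).ravel())
    flat = target.ravel()

    # Rank of each target value (0 .. n-1); ties are broken by order of appearance.
    ranks = np.argsort(np.argsort(flat, kind="stable"), kind="stable")

    # Convert ranks to quantile positions in [0, 1].
    if flat.size > 1:
        probs = ranks / (flat.size - 1)
    else:
        probs = np.zeros(flat.size)

    # Look up the reference value at each quantile (linear interpolation).
    mapped = np.quantile(reference, probs)
    return mapped.reshape(target.shape)


# Toy example: counts on a wide scale mapped onto a narrow reference scale.
rnaseq_values = [10, 1000, 100]
reference_values = [1.0, 2.0, 3.0]
print(quantile_normalize_to_reference(rnaseq_values, reference_values))
```

The rank order of the input values is preserved, while their scale and shape are taken from the reference, which is what lets a model trained on the reference platform see inputs in a familiar range.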