Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Loyola University Chicago

Theses/Dissertations

Genomics

Full-Text Articles in Life Sciences

Incorporating Sex Chromosomes In Transcriptome Prediction Models And Improving Cross-Population Prediction Performance, Daniel S. Araujo Jan 2023

Master's Theses

Transcriptome prediction models built with data from European-descent individuals are less accurate when applied to different populations because of differences in linkage disequilibrium patterns and allele frequencies. We hypothesized that multivariate adaptive shrinkage may improve cross-population transcriptome prediction, as it leverages effect size estimates across different conditions (in this case, different populations). To test this hypothesis, we made transcriptome prediction models for use in transcriptome-wide association studies (TWAS) using different methods (Elastic Net, Matrix eQTL, and Multivariate Adaptive Shrinkage in R (MASHR)) and tested their out-of-sample transcriptome prediction accuracy in population-matched and cross-population scenarios. Additionally, to evaluate model applicability in …


Optimizing Gene Expression Prediction And Omics Integration In Populations Of African Ancestry, Paul Chukwuebuka Okoro Jan 2020

Master's Theses

Popular transcriptome imputation methods such as PrediXcan and FUSION use parametric linear assumptions and thus are unable to flexibly model the complex genetic architecture of the transcriptome. Although non-linear modeling has been shown to improve imputation performance, replicability and potential cross-population differences have not been adequately studied. Therefore, to optimize imputation performance across global populations, we used the non-linear machine learning (ML) models random forest (RF), support vector regression (SVR), and K-nearest neighbor (KNN) to build transcriptome imputation models, and evaluated their performance in comparison to elastic net (EN). We trained gene expression prediction models using genotype and blood …