Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Germline Mutation Detection In Next Generation Sequencing Data And Tp53 Mutation Carrier Probability Estimation For Li-Fraumeni Syndrome, Gang Peng Aug 2015

Dissertations & Theses (Open Access)

Next-generation sequencing technology is widely used in genomic analysis, but its application has been compromised by missed true variants, especially when those variants are rare. We proposed a family-based variant calling method, FamSeq, that integrates Mendelian transmission information and de novo mutation with sequencing data to improve variant calling accuracy. We investigated the factors that affect the improvement from family-based variant calling in simulated data and validated the method on real sequencing data. In both simulated and real data, FamSeq performs better than the single-individual-based method.

In FamSeq, we implemented four different methods for the Mendelian genetic …
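To illustrate the general idea of family-based calling described in this abstract, here is a minimal sketch for a father-mother-child trio. It is not FamSeq's implementation: the function names, the Hardy-Weinberg prior on the parents, the default allele frequency, and the simple per-allele mutation model are all assumptions made for illustration. Genotypes are coded as the number of alternate alleles (0, 1, 2), and each `lik_*` argument is a list of per-genotype likelihoods from the sequencing data.

```python
from itertools import product


def hardy_weinberg_prior(alt_freq):
    """Assumed population prior over genotypes 0, 1, 2 (alt allele counts)."""
    ref_freq = 1.0 - alt_freq
    return [ref_freq * ref_freq, 2.0 * ref_freq * alt_freq, alt_freq * alt_freq]


def transmit_alt_prob(parent_genotype, mutation_rate):
    """Probability a parent passes on an alt allele, allowing a de novo flip."""
    p_alt = parent_genotype / 2.0
    return p_alt * (1.0 - mutation_rate) + (1.0 - p_alt) * mutation_rate


def child_given_parents(father, mother, mutation_rate):
    """Mendelian transmission: P(child genotype | parental genotypes)."""
    pf = transmit_alt_prob(father, mutation_rate)
    pm = transmit_alt_prob(mother, mutation_rate)
    return [(1 - pf) * (1 - pm), pf * (1 - pm) + (1 - pf) * pm, pf * pm]


def single_posterior(lik, alt_freq=0.01):
    """Posterior for one individual, ignoring relatives."""
    prior = hardy_weinberg_prior(alt_freq)
    joint = [p * l for p, l in zip(prior, lik)]
    total = sum(joint)
    return [x / total for x in joint]


def trio_child_posterior(lik_father, lik_mother, lik_child,
                         alt_freq=0.01, mutation_rate=1e-7):
    """Posterior for the child, summing over all parental genotype pairs."""
    prior = hardy_weinberg_prior(alt_freq)
    joint = [0.0, 0.0, 0.0]
    for f, m in product(range(3), repeat=2):
        parents = prior[f] * lik_father[f] * prior[m] * lik_mother[m]
        transmission = child_given_parents(f, m, mutation_rate)
        for c in range(3):
            joint[c] += parents * transmission[c] * lik_child[c]
    total = sum(joint)
    return [x / total for x in joint]


# Made-up likelihoods: the child's reads are ambiguous, the father is clearly
# heterozygous, and the mother is clearly homozygous reference.
lik_child = [0.6, 0.4, 0.0]
lik_father = [0.001, 0.998, 0.001]
lik_mother = [0.998, 0.001, 0.001]

alone = single_posterior(lik_child)
with_family = trio_child_posterior(lik_father, lik_mother, lik_child)
```

With these made-up inputs, the rare-variant prior alone pulls the child's call toward homozygous reference, while the trio model gives the heterozygous genotype more weight because the father very likely carries the variant.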


Application Of Machine Learning To Mapping And Simulating Gene Regulatory Networks, Hien-Haw Liow May 2015

Arts & Sciences Electronic Theses and Dissertations

This dissertation explores, proposes, and examines methods of applying modern machine learning and Bayesian statistics in the quantitative and qualitative modeling of gene regulatory networks using high-throughput gene expression data. A semi-parametric Bayesian model based on random forest is developed to infer quantitative aspects of gene regulation relations; a parametric model is developed to predict gene expression levels solely from genotype information. Simulation of network behavior is shown to complement regression analysis greatly in capturing the dynamics of gene regulatory networks. Finally, as an application and extension of novel approaches in gene expression analysis, new methods of discovering topological structure of gene …


Bayesian Adaptive Penalized Splines In Nonparametric Regression And In Spectral Time Series Analysis, Luis Angel Mora Jan 2015

Open Access Theses & Dissertations

A Bayesian approach to nonparametric regression using penalized splines (P-splines) is presented. The approach uses the linear mixed model formulation of P-splines. The usual model assumes a single value for the smoothing parameter controlling the amount of smoothing of the fitted function. The main focus of the thesis is on spatially adaptive smoothing, where the smoothing parameter is a function of the covariate so that different amounts of smoothing are applied in different regions of the covariate. An application to spectral time series analysis is demonstrated. Markov chain Monte Carlo methods are used to make inference based on the …
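For readers unfamiliar with the linear mixed model formulation mentioned in this abstract, one common way to write it is sketched below. The truncated power basis, the knot notation, and the symbols are assumptions chosen for illustration; the thesis may use a different basis or parameterization.

```latex
% Assumed notation: degree-p truncated power basis with knots
% \kappa_1 < \dots < \kappa_K; (z)_+ = \max(z, 0).
\begin{align}
y_i  &= f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2), \\
f(x) &= \sum_{j=0}^{p} \beta_j x^j + \sum_{k=1}^{K} u_k \,(x - \kappa_k)_+^p, \\
u_k  &\sim N(0, \sigma_u^2)
       \quad \text{(single smoothing parameter } \lambda = \sigma_\varepsilon^2 / \sigma_u^2 \text{)}, \\
u_k  &\sim N\bigl(0, \sigma_u^2(\kappa_k)\bigr)
       \quad \text{(spatially adaptive: the variance varies with knot location)}.
\end{align}
```

In this sketch the polynomial coefficients are fixed effects and the knot coefficients are random effects. Letting the random-effect variance depend on the knot location is one way to apply different amounts of smoothing in different regions of the covariate.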