Open Access. Powered by Scholars. Published by Universities.®

Microarrays Commons

Articles 1 - 8 of 8

Full-Text Articles in Microarrays

Bayesian Methods For Graphical Models With Neighborhood Selection., Sagnik Bhadury Dec 2022

Electronic Theses and Dissertations

Graphical models determine associations between variables through the notion of conditional independence. Gaussian graphical models are a widely used class of such models, where the relationships are formalized by the non-zero entries of the precision matrix. However, in high-dimensional cases, covariance estimates are typically unstable. Moreover, in many realistic applications it is natural to expect only a few significant associations to be present. This calls for sparsity-inducing techniques in the estimation method. Classical frequentist methods, like GLASSO, use penalization techniques for this purpose. Fully Bayesian methods, by contrast, are slow because they require iteratively sampling over a quadratic …
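As a minimal illustration of how non-zero precision-matrix entries encode a graph, the sketch below reads edges and partial correlations off a small precision matrix. It is not the thesis's method: the matrix `omega` and the helper names `graph_edges` and `partial_correlations` are hypothetical, chosen only to show the idea for a three-variable chain.

```python
import math


def partial_correlations(precision):
    """Partial correlation rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)."""
    p = len(precision)
    rho = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                rho[i][j] = -precision[i][j] / scale
    return rho


def graph_edges(precision, tol=1e-12):
    """Undirected edges (i, j) with i < j where the precision entry is non-zero."""
    p = len(precision)
    return [
        (i, j)
        for i in range(p)
        for j in range(i + 1, p)
        if abs(precision[i][j]) > tol
    ]


# Hypothetical precision matrix for a chain X0 - X1 - X2 (assumed, for illustration).
# The zero in position (0, 2) means X0 and X2 are conditionally independent given X1.
omega = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]

edges = graph_edges(omega)
rho = partial_correlations(omega)
print(edges)
print(rho[0][1], rho[0][2])
```

In practice the precision matrix is unknown and has to be estimated, which is where penalized or Bayesian sparsity methods come in; the sketch only covers the final step of reading the graph from the estimate.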


Gene Set Testing By Distance Correlation, Sho-Hsien Su Dec 2020

Graduate Theses and Dissertations

Pathways are the functional building blocks of complex diseases such as cancers. Pathway-level studies may provide insights into some important biological processes. The gene set test is an important tool for studying the differential expression of a gene set between two groups, e.g., cancer vs. normal. The differential expression of a gene set could be due to a difference in mean, variability, or both. However, most existing gene set tests target only the mean difference and overlook other types of differential expression. In this thesis, we propose to use the recently developed distance correlation for gene set testing. To assess the …
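To make the idea of distance correlation concrete, here is a minimal sketch of the standard sample distance correlation for two univariate samples, using double-centered pairwise distance matrices. It is an illustration under simplifying assumptions, not the testing procedure of the thesis: the function names and the toy data are hypothetical, and a gene set test would work with multivariate expression profiles and a permutation or similar scheme to obtain p-values.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute distances."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


# Toy data (assumed): y depends on x only through its magnitude.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
dcor_xy = distance_correlation(x, y)
dcor_xx = distance_correlation(x, x)
print(dcor_xy, dcor_xx)
```

For this symmetric toy sample the Pearson correlation between `x` and `y` is exactly zero, while the distance correlation is clearly positive, which is the kind of non-mean dependence the abstract says mean-based tests overlook.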


Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das Dec 2020

Electronic Theses and Dissertations

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as microarrays, bulk RNA-sequencing, and single-cell RNA-sequencing. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. However, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …


Microarray Data Analysis And Classification Of Cancers, Grant Gates Jan 2019

Williams Honors College, Honors Research Projects

When it comes to cancer, there is no standardized approach for identifying new cancer classes, nor is there a standardized approach for assigning tumors to existing classes. These two tasks are known as class discovery and class prediction. For a cancer patient to receive proper treatment, it is important that the type of cancer be accurately identified. For my Senior Honors Project, I would like to use this opportunity to research a topic in bioinformatics. Bioinformatics combines several subjects, including biology, computer science, and statistics. An intricate method for class discovery and class prediction is …


Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu Apr 2018

Electronic Thesis and Dissertation Repository

ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms have been developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad, which can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data analyzed only a small number of top peaks with the highest signal strengths, biasing the resulting position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, which prevented accurate modelling of the binding behavior of dimeric TFs.

This thesis presents a novel …


Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao Jan 2015

Theses and Dissertations

Human curiosity drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we have pursued this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise, and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …


Methods For Integrative Analysis Of Genomic Data, Paul Manser Jan 2014

Theses and Dissertations

In recent years, the development of new genomic technologies has allowed for the investigation of many regulatory epigenetic marks besides expression levels on a genome-wide scale. As the price of these technologies continues to decrease, not only will study sizes increase, but several different assays are also beginning to be used on the same samples. It is therefore desirable to develop statistical methods that integrate multiple data types and can handle the increased computational burden of incorporating large data sets. Furthermore, it is important to develop sound quality control and normalization methods, as technical errors can compound when integrating multiple genomic …


Survival Prediction For Brain Tumor Patients Using Gene Expression Data, Vinicius Bonato May 2010

Dissertations & Theses (Open Access)

Brain tumors are among the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. …