Open Access. Powered by Scholars. Published by Universities.®

Microarrays Commons

Open Access. Powered by Scholars. Published by Universities.®

123 Full-Text Articles 254 Authors 34,173 Downloads 15 Institutions

All Articles in Microarrays

Faceted Search

123 full-text articles. Page 1 of 2.

Predictions Generated From A Simulation Engine For Gene Expression Micro-Arrays For Use In Research Laboratories, Gopinath R. Mavankal, John Blevins, Dominique Edwards, Monnie McGee, Andrew Hardin 2018 Southern Methodist University

Predictions Generated From A Simulation Engine For Gene Expression Micro-Arrays For Use In Research Laboratories, Gopinath R. Mavankal, John Blevins, Dominique Edwards, Monnie Mcgee, Andrew Hardin

SMU Data Science Review

In this paper we introduce the technical components, the biology and data science involved in the use of microarray technology in biological and clinical research. We discuss how laborious experimental protocols involved in obtaining this data used in laboratories could benefit from using simulations of the data. We discuss the approach used in the simulation engine from [7]. We use this simulation engine to generate a prediction tool in Power BI, a Microsoft, business intelligence tool for analytics and data visualization [22]. This tool could be used in any laboratory using micro-arrays to improve experimental design by comparing how predicted ...


Analysis Challenges For High Dimensional Data, Bangxin Zhao 2018 The University of Western Ontario

Analysis Challenges For High Dimensional Data, Bangxin Zhao

Electronic Thesis and Dissertation Repository

In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measure and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on the estimator we develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS) for high dimensional variables screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method takes the popular used sure independent screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as its special cases.

Two methods ...


A Comparison Of Unsupervised Methods For Dna Microarray Leukemia Data, Denise Harness 2018 East Tennessee State University

A Comparison Of Unsupervised Methods For Dna Microarray Leukemia Data, Denise Harness

Appalachian Student Research Forum

Advancements in DNA microarray data sequencing have created the need for sophisticated machine learning algorithms and feature selection methods. Probabilistic graphical models, in particular, have been used to identify whether microarrays or genes cluster together in groups of individuals having a similar diagnosis. These clusters of genes are informative, but can be misleading when every gene is used in the calculation. First feature reduction techniques are explored, however the size and nature of the data prevents traditional techniques from working efficiently. Our method is to use the partial correlations between the features to create a precision matrix and predict which ...


Contributions To Statistical Testing, Prediction, And Modeling, John C. Pesko 2017 University of New Mexico

Contributions To Statistical Testing, Prediction, And Modeling, John C. Pesko

Mathematics & Statistics ETDs

1. "Parametric Bootstrap (PB) and Objective Bayesian (OB) Testing with Applications to Heteroscedastic ANOVA": For one-way heteroscedastic ANOVA, we show a close relationship between the PB and OB approaches to significance testing, demonstrating the conditions for which the two approaches are equivalent. Using a simulation study, PB and OB performance is compared to a test based on the predictive distribution as well as the unweighted test of Akritas & Papadatos (2004). We extend this work to the RCBD with subsampling model, and prove a repeated sampling property and large sample property for general OB significance testing.

2. "Early Identification of Binswanger ...


The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman 2017 Virginia Commonwealth University

The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman

Theses and Dissertations

With the influx of high-dimensional data there is an immediate need for statistical methods that are able to handle situations when the number of predictors greatly exceeds the number of samples. One such area of growth is in examining how environmental exposures to toxins impact the body long term. The cytokinesis-block micronucleus assay can measure the genotoxic effect of exposure as a count outcome. To investigate potential biomarkers, high-throughput assays that assess gene expression and methylation have been developed. It is of interest to identify biomarkers or molecular features that are associated with elevated micronuclei (MN) or nuclear bud (Nbud ...


Integration Of Multi-Platform High-Dimensional Omic Data, Xuebei An 2016 The University of Texas Graduate School of Biomedical Sciences at Houston

Integration Of Multi-Platform High-Dimensional Omic Data, Xuebei An

UT GSBS Dissertations and Theses (Open Access)

The development of high-throughput biotechnologies have made data accessible from different platforms, including RNA sequencing, copy number variation, DNA methylation, protein lysate arrays, etc. The high-dimensional omic data derived from different technological platforms have been extensively used to facilitate comprehensive understanding of disease mechanisms and to determine personalized health treatments. Although vital to the progress of clinical research, the high dimensional multi-platform data impose new challenges for data analysis. Numerous studies have been proposed to integrate multi-platform omic data; however, few have efficiently and simultaneously addressed the problems that arise from high dimensionality and complex correlations.

In my dissertation, I ...


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang 2016 Fox Chase Cancer Center

Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for ...


Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret 2016 University of Washington - Seattle Campus

Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the ...


A Weighted Gene Co-Expression Network Analysis For Streptococcus Sanguinis Microarray Experiments, Erik C. Dvergsten 2016 Virginia Commonwealth University

A Weighted Gene Co-Expression Network Analysis For Streptococcus Sanguinis Microarray Experiments, Erik C. Dvergsten

Theses and Dissertations

Streptococcus sanguinis is a gram-positive, non-motile bacterium native to human mouths. It is the primary cause of endocarditis and is also responsible for tooth decay. Two-component systems (TCSs) are commonly found in bacteria. In response to environmental signals, TCSs may regulate the expression of virulence factor genes.

Gene co-expression networks are exploratory tools used to analyze system-level gene functionality. A gene co-expression network consists of gene expression profiles represented as nodes and gene connections, which occur if two genes are significantly co-expressed. An adjacency function transforms the similarity matrix containing co-expression similarities into the adjacency matrix containing connection strengths. Gene ...


Development In Normal Mixture And Mixture Of Experts Modeling, Meng Qi 2016 University of Kentucky

Development In Normal Mixture And Mixture Of Experts Modeling, Meng Qi

Theses and Dissertations--Statistics

In this dissertation, first we consider the problem of testing homogeneity and order in a contaminated normal model, when the data is correlated under some known covariance structure. To address this problem, we developed a moment based homogeneity and order test, and design weights for test statistics to increase power for homogeneity test. We applied our test to microarray about Down’s syndrome. This dissertation also studies a singular Bayesian information criterion (sBIC) for a bivariate hierarchical mixture model with varying weights, and develops a new data dependent information criterion (sFLIC).We apply our model and criteria to birth- weight ...


Transcriptomic Analyses Of Onecut1 And Onecut2 Deficient Retinas, Jillian J. Goetz, Jeffrey M. Trimarchi 2015 Iowa State University

Transcriptomic Analyses Of Onecut1 And Onecut2 Deficient Retinas, Jillian J. Goetz, Jeffrey M. Trimarchi

Genetics, Development and Cell Biology Publications

In this article, we further explore the data generated for the research article “Onecut1 and Onecut2 play critical roles in the development of the mouse retina”. To better understand the functionality of the Onecut family of transcription factors in retinogenesis, we investigated the retinal transcriptomes of developing and mature mice to identify genes with differential expression. This data article reports the full transcriptomes resulting from these experiments and provides tables detailing the differentially expressed genes between wildtype and Onecut1 or 2 deficient retinas. The raw array data of our transcriptomes as generated using Affymetrix microarrays are available on the NCBI ...


Bayesian Joint Selection Of Genes And Pathways: Applications In Multiple Myeloma Genomics, Lin Zhang, Jeffrey S. Morris, Jiexin Zhang, Robert Orlowski, Veerabhadran Baladandayuthapani 2014 The University of Texas MD Anderson Cancer Center

Bayesian Joint Selection Of Genes And Pathways: Applications In Multiple Myeloma Genomics, Lin Zhang, Jeffrey S. Morris, Jiexin Zhang, Robert Orlowski, Veerabhadran Baladandayuthapani

Jeffrey S. Morris

It is well-established that the development of a disease, especially cancer, is a complex process that results from the joint effects of multiple genes involved in various molecular signaling pathways. In this article, we propose methods to discover genes and molecular pathways significantly associ- ated with clinical outcomes in cancer samples. We exploit the natural hierarchal structure of genes related to a given pathway as a group of interacting genes to conduct selection of both pathways and genes. We posit the problem in a hierarchical structured variable selection (HSVS) framework to analyze the corresponding gene expression data. HSVS methods conduct ...


Methods For Integrative Analysis Of Genomic Data, Paul Manser 2014 Virginia Commonwealth University

Methods For Integrative Analysis Of Genomic Data, Paul Manser

Theses and Dissertations

In recent years, the development of new genomic technologies has allowed for the investigation of many regulatory epigenetic marks besides expression levels, on a genome-wide scale. As the price for these technologies continues to decrease, study sizes will not only increase, but several different assays are beginning to be used for the same samples. It is therefore desirable to develop statistical methods to integrate multiple data types that can handle the increased computational burden of incorporating large data sets. Furthermore, it is important to develop sound quality control and normalization methods as technical errors can compound when integrating multiple genomic ...


Normal Mixture And Contaminated Model With Nuisance Parameter And Applications, Qian Fan 2014 University of Kentucky

Normal Mixture And Contaminated Model With Nuisance Parameter And Applications, Qian Fan

Theses and Dissertations--Statistics

This paper intend to find the proper hypothesis and test statistic for testing existence of bilaterally contamination when there exists nuisance parameter. The test statistic is based on method of moments estimators. Union-Intersection test is used for testing if the distribution of population can be implemented by a bilaterally contaminated normal model with unknown variance. This paper also developed a hierarchical normal mixture model (HNM) and applied it to birth weight data. EM algorithm is employed for parameter estimation and a singular Bayesian information criterion (sBIC) is applied to choose the number components. We also proposed a singular flexible information ...


Contaminated Chi-Square Modeling And Its Application In Microarray Data Analysis, Feng Zhou 2014 University of Kentucky

Contaminated Chi-Square Modeling And Its Application In Microarray Data Analysis, Feng Zhou

Theses and Dissertations--Statistics

Mixture modeling has numerous applications. One particular interest is microarray data analysis. My dissertation research is focused on the Contaminated Chi-Square (CCS) Modeling and its application in microarray. A moment-based method and two likelihood-based methods including Modified Likelihood Ratio Test (MLRT) and Expectation-Maximization (EM) Test are developed for testing the omnibus null hypothesis of no contamination of a central chi-square distribution by a non-central Chi-Square distribution. When the omnibus null hypothesis is rejected, we further developed the moment-based test and the EM test for testing an extra component to the Contaminated Chi-Square (CCS+EC) Model. The moment-based approach is easy ...


Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong 2013 The University of Texas Graduate School of Biomedical Sciences at Houston

Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong

UT GSBS Dissertations and Theses (Open Access)

It is well accepted that tumorigenesis is a multi-step procedure involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as (TCGA) and (ICGC) which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays ...


Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi 2013 The University of Texas M.D. Anderson Cancer Center

Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi

Jeffrey S. Morris

Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.

Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.

Design: A single-arm, phase II trial.

Patients: Twenty-seven patients with FAP.

Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.

Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were ...


Integrative Analysis Of Prognosis Data On Multiple Cancer Subtypes, Shuangge Ma 2012 Yale University

Integrative Analysis Of Prognosis Data On Multiple Cancer Subtypes, Shuangge Ma

Shuangge Ma

In cancer research, profiling studies have been extensively conducted, searching for genes/SNPs associated with prognosis. Cancer is diverse. Examining similarity and difference in the genetic basis of multiple subtypes of the same cancer can lead to a better understanding of their connections and distinctions. Classic meta-analysis methods analyze each subtype separately and then compare analysis results across subtypes. Integrative analysis methods, in contrast, analyze the raw data on multiple subtypes simultaneously and can outperform meta-analysis methods. In this study, prognosis data on multiple subtypes of the same cancer are analyzed. An AFT (accelerated failure time) model is adopted to ...


Bayesian Methods For Expression-Based Integration, Elizabeth M. Jennings, Jeffrey S. Morris, Raymond J. Carroll, Ganiraju C. Manyam, Veera Baladandayuthapani 2012 Texas A&M University

Bayesian Methods For Expression-Based Integration, Elizabeth M. Jennings, Jeffrey S. Morris, Raymond J. Carroll, Ganiraju C. Manyam, Veera Baladandayuthapani

Jeffrey S. Morris

We propose methods to integrate data across several genomic platforms using a hierarchical Bayesian analysis framework that incorporates the biological relationships among the platforms to identify genes whose expression is related to clinical outcomes in cancer. This integrated approach combines information across all platforms, leading to increased statistical power in finding these predictive genes, and further provides mechanistic information about the manner in which the gene affects the outcome. We demonstrate the advantages of the shrinkage estimation used by this approach through a simulation, and finally, we apply our method to a Glioblastoma Multiforme dataset and identify several genes potentially ...


A Bayesian Model For Pooling Gene Expression Studies That Incorporates Co-Regulation Information, Erin Conlon, Bradley L. Postier, Barbara Methé, Kelly Nevin, Derek Lovley 2012 University of Massachusetts - Amherst

A Bayesian Model For Pooling Gene Expression Studies That Incorporates Co-Regulation Information, Erin Conlon, Bradley L. Postier, Barbara Methé, Kelly Nevin, Derek Lovley

Microbiology Department Faculty Publication Series

Current Bayesian microarray models that pool multiple studies assume gene expression is independent of other genes. However, in prokaryotic organisms, genes are arranged in units that are co-regulated (called operons). Here, we introduce a new Bayesian model for pooling gene expression studies that incorporates operon information into the model. Our Bayesian model borrows information from other genes within the same operon to improve estimation of gene expression. The model produces the gene-specific posterior probability of differential expression, which is the basis for inference. We found in simulations and in biological studies that incorporating co-regulation information improves upon the independence model ...


Digital Commons powered by bepress