Open Access. Powered by Scholars. Published by Universities.®

Microarrays Commons

Open Access. Powered by Scholars. Published by Universities.®

143 Full-Text Articles 299 Authors 54,500 Downloads 18 Institutions

All Articles in Microarrays

Faceted Search

143 full-text articles. Page 2 of 4.

Power In Pairs: Assessing The Statistical Value Of Paired Samples In Tests For Differential Expression, John R. Stevens, Jennifer S. Herrick, Roger K. Wolff, Martha L. Slattery 2018 Utah State University

Power In Pairs: Assessing The Statistical Value Of Paired Samples In Tests For Differential Expression, John R. Stevens, Jennifer S. Herrick, Roger K. Wolff, Martha L. Slattery

Mathematics and Statistics Faculty Publications

Background: When genomics researchers design a high-throughput study to test for differential expression, some biological systems and research questions provide opportunities to use paired samples from subjects, and researchers can plan for a certain proportion of subjects to have paired samples. We consider the effect of this paired samples proportion on the statistical power of the study, using characteristics of both count (RNA-Seq) and continuous (microarray) expression data from a colorectal cancer study.

Results: We demonstrate that a higher proportion of subjects with paired samples yields higher statistical power, for various total numbers of samples, and for various strengths of …


Innate Immunity, The Hepatic Extracellular Matrix, And Liver Injury: Mathematical Modeling Of Metastatic Potential And Tumor Development In Alcoholic Liver Disease., Shanice V. Hudson 2018 University of Louisville

Innate Immunity, The Hepatic Extracellular Matrix, And Liver Injury: Mathematical Modeling Of Metastatic Potential And Tumor Development In Alcoholic Liver Disease., Shanice V. Hudson

Electronic Theses and Dissertations

The overarching goals of the current work are to fill key gaps in the current understanding of alcohol consumption and the risk of metastasis to the liver. Considering the evidence this research group has compiled confirming that the hepatic matrisome responds dynamically to injury, an altered extracellular matrix (ECM) profile appears to be a key feature of pre-fibrotic inflammatory injury in the liver. This group has demonstrated that the hepatic ECM responds dynamically to alcohol exposure, in particular, sensitizing the liver to LPS-induced inflammatory damage. Although the study of alcohol in its role as a contributing factor to oncogenesis and …


Predictions Generated From A Simulation Engine For Gene Expression Micro-Arrays For Use In Research Laboratories, Gopinath R. Mavankal, John Blevins, Dominique Edwards, Monnie McGee, Andrew Hardin 2018 Southern Methodist University

Predictions Generated From A Simulation Engine For Gene Expression Micro-Arrays For Use In Research Laboratories, Gopinath R. Mavankal, John Blevins, Dominique Edwards, Monnie Mcgee, Andrew Hardin

SMU Data Science Review

In this paper we introduce the technical components, the biology and data science involved in the use of microarray technology in biological and clinical research. We discuss how laborious experimental protocols involved in obtaining this data used in laboratories could benefit from using simulations of the data. We discuss the approach used in the simulation engine from [7]. We use this simulation engine to generate a prediction tool in Power BI, a Microsoft, business intelligence tool for analytics and data visualization [22]. This tool could be used in any laboratory using micro-arrays to improve experimental design by comparing how predicted …


Analysis Challenges For High Dimensional Data, Bangxin Zhao 2018 The University of Western Ontario

Analysis Challenges For High Dimensional Data, Bangxin Zhao

Electronic Thesis and Dissertation Repository

In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measure and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on the estimator we develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS) for high dimensional variables screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method takes the popular used sure independent screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as its special cases.

Two methods …


Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu 2018 The University of Western Ontario

Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu

Electronic Thesis and Dissertation Repository

ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms were developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad that can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data only analyzed a small number of top peaks with the highest signal strengths, biasing their resultant position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, disabling the accurate modelling of binding behavior of dimeric TFs.

This thesis presents a novel …


A Comparison Of Unsupervised Methods For Dna Microarray Leukemia Data, Denise Harness 2018 East Tennessee State University

A Comparison Of Unsupervised Methods For Dna Microarray Leukemia Data, Denise Harness

Appalachian Student Research Forum

Advancements in DNA microarray data sequencing have created the need for sophisticated machine learning algorithms and feature selection methods. Probabilistic graphical models, in particular, have been used to identify whether microarrays or genes cluster together in groups of individuals having a similar diagnosis. These clusters of genes are informative, but can be misleading when every gene is used in the calculation. First feature reduction techniques are explored, however the size and nature of the data prevents traditional techniques from working efficiently. Our method is to use the partial correlations between the features to create a precision matrix and predict which …


Contributions To Statistical Testing, Prediction, And Modeling, John C. Pesko 2017 University of New Mexico

Contributions To Statistical Testing, Prediction, And Modeling, John C. Pesko

Mathematics & Statistics ETDs

1. "Parametric Bootstrap (PB) and Objective Bayesian (OB) Testing with Applications to Heteroscedastic ANOVA": For one-way heteroscedastic ANOVA, we show a close relationship between the PB and OB approaches to significance testing, demonstrating the conditions for which the two approaches are equivalent. Using a simulation study, PB and OB performance is compared to a test based on the predictive distribution as well as the unweighted test of Akritas & Papadatos (2004). We extend this work to the RCBD with subsampling model, and prove a repeated sampling property and large sample property for general OB significance testing.

2. "Early Identification of …


The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman 2017 Virginia Commonwealth University

The Generalized Monotone Incremental Forward Stagewise Method For Modeling Longitudinal, Clustered, And Overdispersed Count Data: Application Predicting Nuclear Bud And Micronuclei Frequencies, Rebecca Lehman

Theses and Dissertations

With the influx of high-dimensional data there is an immediate need for statistical methods that are able to handle situations when the number of predictors greatly exceeds the number of samples. One such area of growth is in examining how environmental exposures to toxins impact the body long term. The cytokinesis-block micronucleus assay can measure the genotoxic effect of exposure as a count outcome. To investigate potential biomarkers, high-throughput assays that assess gene expression and methylation have been developed. It is of interest to identify biomarkers or molecular features that are associated with elevated micronuclei (MN) or nuclear bud (Nbud) …


Integration Of Multi-Platform High-Dimensional Omic Data, Xuebei An 2016 The University of Texas Graduate School of Biomedical Sciences at Houston

Integration Of Multi-Platform High-Dimensional Omic Data, Xuebei An

Dissertations & Theses (Open Access)

The development of high-throughput biotechnologies have made data accessible from different platforms, including RNA sequencing, copy number variation, DNA methylation, protein lysate arrays, etc. The high-dimensional omic data derived from different technological platforms have been extensively used to facilitate comprehensive understanding of disease mechanisms and to determine personalized health treatments. Although vital to the progress of clinical research, the high dimensional multi-platform data impose new challenges for data analysis. Numerous studies have been proposed to integrate multi-platform omic data; however, few have efficiently and simultaneously addressed the problems that arise from high dimensionality and complex correlations.

In my dissertation, I …


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang 2016 Fox Chase Cancer Center

Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …


Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret 2016 University of Washington - Seattle Campus

Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


A Weighted Gene Co-Expression Network Analysis For Streptococcus Sanguinis Microarray Experiments, Erik C. Dvergsten 2016 Virginia Commonwealth University

A Weighted Gene Co-Expression Network Analysis For Streptococcus Sanguinis Microarray Experiments, Erik C. Dvergsten

Theses and Dissertations

Streptococcus sanguinis is a gram-positive, non-motile bacterium native to human mouths. It is the primary cause of endocarditis and is also responsible for tooth decay. Two-component systems (TCSs) are commonly found in bacteria. In response to environmental signals, TCSs may regulate the expression of virulence factor genes.

Gene co-expression networks are exploratory tools used to analyze system-level gene functionality. A gene co-expression network consists of gene expression profiles represented as nodes and gene connections, which occur if two genes are significantly co-expressed. An adjacency function transforms the similarity matrix containing co-expression similarities into the adjacency matrix containing connection strengths. Gene …


Development In Normal Mixture And Mixture Of Experts Modeling, Meng Qi 2016 University of Kentucky

Development In Normal Mixture And Mixture Of Experts Modeling, Meng Qi

Theses and Dissertations--Statistics

In this dissertation, first we consider the problem of testing homogeneity and order in a contaminated normal model, when the data is correlated under some known covariance structure. To address this problem, we developed a moment based homogeneity and order test, and design weights for test statistics to increase power for homogeneity test. We applied our test to microarray about Down’s syndrome. This dissertation also studies a singular Bayesian information criterion (sBIC) for a bivariate hierarchical mixture model with varying weights, and develops a new data dependent information criterion (sFLIC).We apply our model and criteria to birth- weight and gestational …


Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao 2015 Virginia Commonwealth University

Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao

Theses and Dissertations

Curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, Mendel's law, Punnett square to modern genetic research, we carry on this old but eternal question. Thanks to technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in the data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …


Normal Mixture And Contaminated Model With Nuisance Parameter And Applications, Qian Fan 2014 University of Kentucky

Normal Mixture And Contaminated Model With Nuisance Parameter And Applications, Qian Fan

Theses and Dissertations--Statistics

This paper intend to find the proper hypothesis and test statistic for testing existence of bilaterally contamination when there exists nuisance parameter. The test statistic is based on method of moments estimators. Union-Intersection test is used for testing if the distribution of population can be implemented by a bilaterally contaminated normal model with unknown variance. This paper also developed a hierarchical normal mixture model (HNM) and applied it to birth weight data. EM algorithm is employed for parameter estimation and a singular Bayesian information criterion (sBIC) is applied to choose the number components. We also proposed a singular flexible information …


Bayesian Joint Selection Of Genes And Pathways: Applications In Multiple Myeloma Genomics, Lin Zhang, Jeffrey S. Morris, Jiexin Zhang, Robert Orlowski, Veerabhadran Baladandayuthapani 2014 The University of Texas MD Anderson Cancer Center

Bayesian Joint Selection Of Genes And Pathways: Applications In Multiple Myeloma Genomics, Lin Zhang, Jeffrey S. Morris, Jiexin Zhang, Robert Orlowski, Veerabhadran Baladandayuthapani

Jeffrey S. Morris

It is well-established that the development of a disease, especially cancer, is a complex process that results from the joint effects of multiple genes involved in various molecular signaling pathways. In this article, we propose methods to discover genes and molecular pathways significantly associ- ated with clinical outcomes in cancer samples. We exploit the natural hierarchal structure of genes related to a given pathway as a group of interacting genes to conduct selection of both pathways and genes. We posit the problem in a hierarchical structured variable selection (HSVS) framework to analyze the corresponding gene expression data. HSVS methods conduct …


Contaminated Chi-Square Modeling And Its Application In Microarray Data Analysis, Feng Zhou 2014 University of Kentucky

Contaminated Chi-Square Modeling And Its Application In Microarray Data Analysis, Feng Zhou

Theses and Dissertations--Statistics

Mixture modeling has numerous applications. One particular interest is microarray data analysis. My dissertation research is focused on the Contaminated Chi-Square (CCS) Modeling and its application in microarray. A moment-based method and two likelihood-based methods including Modified Likelihood Ratio Test (MLRT) and Expectation-Maximization (EM) Test are developed for testing the omnibus null hypothesis of no contamination of a central chi-square distribution by a non-central Chi-Square distribution. When the omnibus null hypothesis is rejected, we further developed the moment-based test and the EM test for testing an extra component to the Contaminated Chi-Square (CCS+EC) Model. The moment-based approach is easy and …


Methods For Integrative Analysis Of Genomic Data, Paul Manser 2014 Virginia Commonwealth University

Methods For Integrative Analysis Of Genomic Data, Paul Manser

Theses and Dissertations

In recent years, the development of new genomic technologies has allowed for the investigation of many regulatory epigenetic marks besides expression levels, on a genome-wide scale. As the price for these technologies continues to decrease, study sizes will not only increase, but several different assays are beginning to be used for the same samples. It is therefore desirable to develop statistical methods to integrate multiple data types that can handle the increased computational burden of incorporating large data sets. Furthermore, it is important to develop sound quality control and normalization methods as technical errors can compound when integrating multiple genomic …


Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong 2013 The University of Texas Graduate School of Biomedical Sciences at Houston

Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong

Dissertations & Theses (Open Access)

It is well accepted that tumorigenesis is a multi-step procedure involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as (TCGA) and (ICGC) which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays …


Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi 2013 The University of Texas M.D. Anderson Cancer Center

Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi

Jeffrey S. Morris

Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.

Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.

Design: A single-arm, phase II trial.

Patients: Twenty-seven patients with FAP.

Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.

Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …


Digital Commons powered by bepress