Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Medicine and Health Sciences

Articles 1 - 30 of 52

Full-Text Articles in Computational Biology

The Role Of Non-Coding RNAs In Myelodysplastic Neoplasms, Vasileios Georgoulis, Epameinondas Koumpis, Eleftheria Hatzimichael Sep 2023

Computational Medicine Center Faculty Papers

Myelodysplastic syndromes or neoplasms (MDS) are a heterogeneous group of myeloid clonal disorders characterized by peripheral blood cytopenias, blood and marrow cell dysplasia, and increased risk of evolution to acute myeloid leukemia (AML). Non-coding RNAs, especially microRNAs and long non-coding RNAs, serve as regulators of normal and malignant hematopoiesis and have been implicated in carcinogenesis. This review presents a comprehensive summary of the biology and role of non-coding RNAs, including the less studied circRNA, siRNA, piRNA, and snoRNA as potential prognostic and/or predictive biomarkers or therapeutic targets in MDS.


CellBRF: A Feature Selection Method For Single-Cell Clustering Using Cell Balance And Random Forest, Yunpei Xu, Hong-Dong Li, Cui-Xiang Lin, Ruiqing Zheng, Yaohang Li, Jinhui Xu, Jianxin Wang Jan 2023

Computer Science Faculty Publications

Motivation

Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect the complexity of biological tissues through cell sub-population identification in combination with clustering approaches. Feature selection is a critical step for improving the accuracy and interpretability of single-cell clustering. Existing feature selection methods underutilize the discriminatory potential of genes across distinct cell types. We hypothesize that incorporating such information could further boost the performance of single cell clustering.

Results

We develop CellBRF, a feature selection method that considers genes’ relevance to cell types for single-cell clustering. The key idea is to identify genes that are most important for discriminating …


Prediction Of Kinase-Substrate Associations Using The Functional Landscape Of Kinases And Phosphorylation Sites, Serhan Yilmaz, Filipa Blasco Tavares Pereira Lopes, Mark R. Chance, Mehmet Koyutürk Jan 2023

Faculty Scholarship

Protein phosphorylation is a key post-translational modification that plays a central role in many cellular processes. With recent advances in biotechnology, thousands of phosphorylated sites can be identified and quantified in a given sample, enabling proteome-wide screening of cellular signaling. However, for most (> 90%) of the phosphorylation sites that are identified in these experiments, the kinase(s) that target these sites are unknown. To broadly utilize available structural, functional, evolutionary, and contextual information in predicting kinase-substrate associations (KSAs), we develop a network-based machine learning framework. Our framework integrates a multitude of data sources to characterize the landscape of functional relationships …


Intergenic Transcription In In Vivo Developed Bovine Oocytes And Pre-Implantation Embryos, Saurav Ranjitkar, Mohammad Shiri, Jiangwen Sun, Xiuchun Tian Jan 2023

Computer Science Faculty Publications

Background

Intergenic transcription, either failure to terminate at the transcription end site (TES), or transcription initiation at other intergenic regions, is present in cultured cells and enhanced in the presence of stressors such as viral infection. Transcription termination failure has not been characterized in natural biological samples such as pre-implantation embryos which express more than 10,000 genes and undergo drastic changes in DNA methylation.

Results

Using Automatic Readthrough Transcription Detection (ARTDeco) and data of in vivo developed bovine oocytes and embryos, we found abundant intergenic transcripts that we termed as read-outs (transcribed from 5 to 15 kb after TES) and …


An Approach To Developing Benchmark Datasets For Protein Secondary Structure Segmentation From Cryo-EM Density Maps, Thu Nguyen, Yongcheng Mu, Jiangwen Sun, Jing He Jan 2023

Computer Science Faculty Publications

More and more deep learning approaches have been proposed to segment secondary structures from cryo-electron density maps at medium resolution range (5–10 Å). Although the deep learning approaches show great potential, only a few small experimental data sets have been used to test the approaches. There is limited understanding about potential factors, in data, that affect the performance of segmentation. We propose an approach to generate data sets with desired specifications in three potential factors: the protein sequence identity, structural contents, and data quality. The approach was implemented and has generated a test set and various training sets to study …


Radiation Exposure Determination In A Secure, Cloud-Based Online Environment, Ben C. Shirley, Eliseos J. Mucaki, Peter Rogan Oct 2022

Biochemistry Publications

Rapid sample processing and interpretation of estimated exposures will be critical for triaging exposed individuals after a major radiation incident. The dicentric chromosome (DC) assay assesses absorbed radiation using metaphase cells from blood. The Automated Dicentric Chromosome Identifier and Dose Estimator System (ADCI) identifies DCs and determines radiation doses. This study aimed to broaden accessibility and speed of this system, while protecting data and software integrity. ADCI Online is a secure web-streaming platform accessible worldwide from local servers. Cloud-based systems containing data and software are separated until they are linked for radiation exposure estimation. Dose estimates are identical to ADCI …


Patient-Specific Genome-Scale Metabolic Models For Individualized Predictions Of Liver Disease, Alexandra Manchel, Jan B. Hoek, Ramon Bataller, Radhakrishnan Mahadevan, Rajanikanth Vadigepalli Sep 2022

Department of Pathology, Anatomy, and Cell Biology Faculty Papers

The prevalence of liver disease is steadily increasing, coupled with the limited availability of therapeutic treatments. Recent literature points to metabolic reprogramming as a key feature of liver failure. Hence, we sought to uncover the metabolic pathways and mechanisms associated with liver disease and acute liver failure. We generated patient-specific genome scale metabolic models by integrating RNA-seq data from patient liver samples with a generalized human metabolic model. Flux balance analysis simulations showed a distinct separation of non-alcohol associated and alcohol-associated disease states. Our analysis suggests that the alcohol associated liver has an increased flux through nucleotide and glycerophospholipid metabolic …


Using Machine Learning To Recognize Chronic Rhinosinusitis, Irene Liu '23 Apr 2022

Student Publications & Research

Chronic rhinosinusitis (CRS) is a nasal disease characterized by inflammation of the mucosa and paranasal sinuses lasting at least 12 consecutive weeks. To diagnose CRS, a patient therefore needs to keep a record of symptoms for ~12 weeks before being recommended for a tomography scan, which allows physicians to classify the patient as having CRS or not. This is a time-consuming and costly process; thus, machine learning should be used to speed the process up. Since patients with CRS have more obstructed noses, the sound produced should be different than an individual without …


A Machine Learning Framework For Identifying Molecular Biomarkers From Transcriptomic Cancer Data, Md Abdullah Al Mamun Mar 2022

FIU Electronic Theses and Dissertations

Cancer is a complex molecular process due to abnormal changes in the genome, such as mutation and copy number variation, and epigenetic aberrations such as dysregulations of long non-coding RNA (lncRNA). These abnormal changes are reflected in transcriptome by turning oncogenes on and tumor suppressor genes off, which are considered cancer biomarkers.

However, transcriptomic data is high dimensional, and finding the best subset of genes (features) related to causing cancer is computationally challenging and expensive. Thus, developing a feature selection framework to discover molecular biomarkers for cancer is critical.

Traditional approaches for biomarker discovery calculate the fold change for each …


The Low Abundance Of CpG In The SARS-CoV-2 Genome Is Not An Evolutionarily Signature Of ZAP, Ali Afrasiabi, Hamid Alinejad-Rokny, Azad Khosh, Mostafa Rahnama, Nigel Lovell, Zhenming Xu, Diako Ebrahimi Feb 2022

Plant Pathology Faculty Publications

The zinc finger antiviral protein (ZAP) is known to restrict viral replication by binding to the CpG rich regions of viral RNA, and subsequently inducing viral RNA degradation. This enzyme has recently been shown to be capable of restricting SARS-CoV-2. These data have led to the hypothesis that the low abundance of CpG in the SARS-CoV-2 genome is due to an evolutionary pressure exerted by the host ZAP. To investigate this hypothesis, we performed a detailed analysis of many coronavirus sequences and ZAP RNA binding preference data. Our analyses showed neither evidence for an evolutionary pressure acting specifically on CpG …


Completing Single-Cell DNA Methylome Profiles Via Transfer Learning Together With KL-Divergence, Sanjeeva Dodlapati, Zongliang Jiang, Jiangwen Sun Jan 2022

Computer Science Faculty Publications

The high level of sparsity in methylome profiles obtained using whole-genome bisulfite sequencing in the case of low biological material amount limits its value in the study of systems in which large samples are difficult to assemble, such as mammalian preimplantation embryonic development. The recently developed computational methods for addressing the sparsity by imputing missing values have their limits when the required minimum data coverage or profiles of the same tissue in other modalities are not available. In this study, we explored the use of transfer learning together with Kullback-Leibler (KL) divergence to train predictive models for completing methylome profiles with …
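
As a reminder of the quantity named in this abstract, the Kullback-Leibler divergence between two discrete distributions p and q is the sum over bins of p_i * log(p_i / q_i). The sketch below is only an illustration of that formula, not the authors' training objective or code; the function name, the `eps` guard, and the three-bin example distributions are assumptions made for this example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # bins with p_i = 0 contribute nothing to the sum
            # eps guards against division by zero when q_i is 0 (an assumption of this sketch)
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical distributions over three methylation states (illustrative values only).
observed = [0.5, 0.2, 0.3]
predicted = [0.4, 0.3, 0.3]

print(kl_divergence(observed, predicted))  # small positive value
print(kl_divergence(observed, observed))   # identical distributions give 0.0
```

The divergence is asymmetric, so KL(p || q) and KL(q || p) generally differ; which direction a given model uses is a design choice that this listing does not state.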


Improved Radiation Expression Profiling In Blood By Sequential Application Of Sensitive And Specific Gene Signatures, Eliseos J. Mucaki, Ben C. Shirley, Peter K. Rogan Oct 2021

Biochemistry Publications

Purpose. Combinations of expressed genes can discriminate radiation-exposed from normal control blood samples by machine learning based signatures (with 8 to 20% misclassification rates). These signatures can quantify therapeutically-relevant as well as accidental radiation exposures. The prodromal symptoms of Acute Radiation Syndrome (ARS) overlap those present in Influenza and Dengue Fever infections. Surprisingly, these human radiation signatures misclassified gene expression profiles of virally infected samples as false positive exposures. The present study investigates these and other confounders, and then mitigates their impact on signature accuracy.

Methods. This study investigated recall by previous and novel radiation signatures independently derived …


RNase κ Promotes Robust piRNA Production By Generating 2',3'-Cyclic Phosphate-Containing Precursors, Megumi Shigematsu, Takuya Kawamura, Keisuke Morichika, Natsuko Izumi, Takashi Kiuchi, Shozo Honda, Venetia Pliatsika, Ryuma Matsubara, Isidore Rigoutsos, Susumu Katsuma, Yukihide Tomari, Yohei Kirino Jul 2021

Computational Medicine Center Faculty Papers

In animal germlines, PIWI proteins and the associated PIWI-interacting RNAs (piRNAs) protect genome integrity by silencing transposons. Here we report the extensive sequence and quantitative correlations between 2′,3′-cyclic phosphate-containing RNAs (cP-RNAs), identified using cP-RNA-seq, and piRNAs in the Bombyx germ cell line and mouse testes. The cP-RNAs containing 5′-phosphate (P-cP-RNAs) identified by P-cP-RNA-seq harbor highly consistent 5′-end positions as the piRNAs and are loaded onto PIWI protein, suggesting their direct utilization as piRNA precursors. We identified Bombyx RNase Kappa (BmRNase κ) as a mitochondria-associated endoribonuclease which produces cP-RNAs during piRNA biogenesis. BmRNase κ-depletion elevated transposon levels and disrupted a piRNA-mediated …


Identifying The Cell Composition And Clonal Diversity Of Supratentorial Ependymoma Using Single Cell RNA-Sequencing, James He May 2021

University Scholar Projects

Ependymoma is a primary solid tumor of the central nervous system. Supratentorial ependymoma (ST-EPN), a subtype of ependymoma, is driven by an oncogenic fusion between the ZFTA and RELA genes in 70% of cases. We introduced this fusion into neural progenitor cells of mouse embryos via in utero electroporation of a non-viral binary piggyBac transposon system containing ZFTA-RELA. From preliminary data in the LoTurco lab, inducing the expression of ZFTA-RELA in different neural progenitor cells produces tumors of varying lethality and cellular composition. To define the cellular composition and subclonal diversity of ST-EPN tumors, we used single cell RNA-sequencing to …


fMRI Feature Extraction Model For ADHD Classification Using Convolutional Neural Network, Senuri De Silva, Sanuwani Udara Dayarathna, Gangani Ariyarathne, Dulani Meedeniya, Sampath Jayarathna Jan 2021

Computer Science Faculty Publications

Biomedical intelligence provides a predictive mechanism for the automatic diagnosis of diseases and disorders. With the advancements of computational biology, neuroimaging techniques have been used extensively in clinical data analysis. Attention deficit hyperactivity disorder (ADHD) is a psychiatric disorder, with the symptomology of inattention, impulsivity, and hyperactivity, in which early diagnosis is crucial to prevent unwelcome outcomes. This study addresses ADHD identification using functional magnetic resonance imaging (fMRI) data for the resting state brain by evaluating multiple feature extraction methods. The features of seed-based correlation (SBC), fractional amplitude of low-frequency fluctuation (fALFF), and regional homogeneity (ReHo) are comparatively applied to …


Pathway-Extended Gene Expression Signatures Integrate Novel Biomarkers That Improve Predictions Of Patient Responses To Kinase Inhibitors, Ashis Jem Bagchee-Clark, Eliseos J. Mucaki, Tyson Whitehead, Peter Rogan Nov 2020

Biochemistry Publications

No abstract provided.


Machine Learning Prediction Of Glioblastoma Patient One-Year Survival, Andrew Du '20, Warren Mcgee, Jane Y. Wu Jan 2020

Student Publications & Research

Glioblastoma (GBM) is a grade IV astrocytoma formed primarily from cancerous astrocytes and sustained by intense angiogenesis. GBM often causes non-specific symptoms, creating difficulty for diagnosis. This study aimed to utilize machine learning techniques to provide an accurate one-year survival prognosis for GBM patients using clinical and genomic data from the Chinese Glioma Genome Atlas. Logistic regression (LR), support vector machines (SVM), random forest (RF), and ensemble models were used to identify and select predictors for GBM survival and to classify patients into those with an overall survival (OS) of less than one year and one year or greater. With …


Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang Apr 2019

Biostatistics Faculty Publications

To analyze gene expression data with sophisticated grouping structures and to extract hidden patterns from such data, feature selection is of critical importance. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. If the biological knowledge contained within these pathways is taken into account, the resulting method is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart in which no biological knowledge is considered. In this article, a pathway-based feature selection is firstly divided into three major categories, namely, pathway-level selection, …


Supervised Dimension Reduction For Large-Scale "Omics" Data With Censored Survival Outcomes Under Possible Non-Proportional Hazards, Lauren Spirko-Burns, Karthik Devarajan Mar 2019

COBRA Preprint Series

The past two decades have witnessed significant advances in high-throughput "omics" technologies such as genomics, proteomics, metabolomics, transcriptomics and radiomics. These technologies have enabled simultaneous measurement of the expression levels of tens of thousands of features from individual patient samples and have generated enormous amounts of data that require analysis and interpretation. One specific area of interest has been in studying the relationship between these features and patient outcomes, such as overall and recurrence-free survival, with the goal of developing a predictive "omics" profile. Large-scale studies often suffer from the presence of a large fraction of censored observations and potential …


Cyclin C: The Story Of A Non-Cycling Cyclin, Jan Ježek, Daniel G J Smethurst, David C Stieg, Z A C Kiss, Sara E Hanley, Vidyaramanan Ganesan, Kai-Ti Chang, Katrina F Cooper, Randy Strich Jan 2019

Rowan-Virtua School of Osteopathic Medicine Faculty Scholarship

The class I cyclin family is a well-studied group of structurally conserved proteins that interact with their associated cyclin-dependent kinases (Cdks) to regulate different stages of cell cycle progression depending on their oscillating expression levels. However, the role of class II cyclins, which primarily act as transcription factors and whose expression remains constant throughout the cell cycle, is less well understood. As a classic example of a transcriptional cyclin, cyclin C forms a regulatory sub-complex with its partner kinase Cdk8 and two accessory subunits Med12 and Med13 called the Cdk8-dependent kinase module (CKM). The CKM reversibly associates with the multi-subunit …


Transcriptional Profiling Reveals Extraordinary Diversity Among Skeletal Muscle Tissues, Erin E. Terry, Xiping Zhang, Christy Hoffmann, Laura D. Hughes, Scott A. Lewis, Jiajia Li, Matthew J. Wallace, Lance A. Riley, Collin M. Douglas, Miguel A. Gutierrez-Monreal, Nicholas F. Lahens, Ming C. Gong, Francisco H. Andrade, Karyn A. Esser, Michael E. Hughes May 2018

Physiology Faculty Publications

Skeletal muscle comprises a family of diverse tissues with highly specialized functions. Many acquired diseases, including HIV and COPD, affect specific muscles while sparing others. Even monogenic muscular dystrophies selectively affect certain muscle groups. These observations suggest that factors intrinsic to muscle tissues influence their resistance to disease. Nevertheless, most studies have not addressed transcriptional diversity among skeletal muscles. Here we use RNAseq to profile mRNA expression in skeletal, smooth, and cardiac muscle tissues from mice and rats. Our data set, MuscleDB, reveals extensive transcriptional diversity, with greater than 50% of transcripts differentially expressed among skeletal muscle tissues. We detect …


Modeling And Analyzing An Optogenetic System For Photoactivatable Protein Dissociation, Anvin Thomas, James Schaff May 2018

Honors Scholar Theses

Computational modeling of cell-cell interactions can grant clues and can answer questions about an experiment, especially for observations about binding interactions and kinetics. This approach was used to investigate an interaction between a light-oxygen-voltage (LOV) domain and an engineered protein called Zdark (Zdk). The LOV domain is membrane-bound while Zdk is cytosolic. The LOV domain and Zdk bind strongly in dark (Kd 26.2 nM), and weakly upon exposure to blue light (Kd > 4 μM). Total internal reflection fluorescence (TIRF) images are acquired of Zdk, the fluorescent species bound to a mCherry tag, and the loss of fluorescence is …
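
To give a feel for what the two dissociation constants quoted above imply, the sketch below applies the standard equilibrium expression for simple 1:1 binding, fraction bound = [Zdk] / ([Zdk] + Kd). This is a textbook simplification and not the model built in the thesis; it assumes the free Zdk concentration is not depleted by binding, and the 100 nM concentration and the function name are hypothetical. Because the abstract reports only a lower bound on the lit-state Kd (> 4 μM), the lit-state result is an upper bound on occupancy.

```python
def fraction_bound(free_partner_nM, kd_nM):
    """Equilibrium fraction of sites occupied for simple 1:1 binding."""
    if free_partner_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_partner_nM / (free_partner_nM + kd_nM)


KD_DARK_NM = 26.2    # dark-state Kd quoted in the abstract
KD_LIT_NM = 4000.0   # abstract gives Kd > 4 uM; the lower bound is used here
ZDK_NM = 100.0       # hypothetical free Zdk concentration, chosen for illustration

dark = fraction_bound(ZDK_NM, KD_DARK_NM)
lit = fraction_bound(ZDK_NM, KD_LIT_NM)
print(f"dark: {dark:.3f}")                 # dark: 0.792
print(f"lit (upper bound): {lit:.3f}")     # lit (upper bound): 0.024
```

Under these assumptions, most LOV sites would be occupied in the dark and almost none under blue light, which is the qualitative behavior the abstract describes.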


Bayesian Prediction Intervals For Assessing P-Value Variability In Prospective Replication Studies, Olga A. Vsevolozhskaya, Gabriel Ruiz, Dmitri Zaykin Dec 2017

Biostatistics Faculty Publications

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics impose constraints on the reliability of conclusions drawn from data, so misuse of statistical methods is a growing concern. Hypothesis and significance testing, and the accompanying P-values are being scrutinized as representing the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of credibility for scientific hypotheses. It has also been suggested that while P-values …


In Silico Study Of Newly Synthesized Opioid Analgesics Bound To Three Opioid Receptors, Abdullah Allaoa, Mai Zahran Dec 2017

Publications and Research

Opioids are the most widely used drugs for the treatment of moderate to severe, chronic pain. They achieve antinociception by activation of mu (MOR-1), kappa (KOR-1), and delta (DOR-1) opioid receptors. Natural products found in kratom plant, Mitragyna speciosa, represent diverse chemical groups with opioid activity, providing opportunities to better understand opioid pharmacology. Pharmacology studies show that Mitragynine pseudoindoxyl is a mu agonist/delta antagonist opioid with a signaling bias for G-protein-mediated signaling pathways in vitro and which produced potent antinociception in vivo. Respiratory depression assays along with other behavioral testing also showed that some of the major problems …


Optimization Of A Genomic Editing System Using CRISPR/Cas9-Induced Site-Specific Gene Integration, Jillian L. Mccool Ms., Nick Hum, Gabriela G. Loots Aug 2016

STAR Program Research Presentations

The CRISPR-Cas system is an adaptive immune system found in bacteria which helps protect against the invasion of other microorganisms. This system induces double stranded breaks at precise genomic loci (1) in which repairs are initiated and insertions of a target are completed in the process. This mechanism can be used in eukaryotic cells in combination with sgRNAs (1) as a tool for genome editing. By using this CRISPR-Cas system, in addition to the “safe harbor locus,” ROSAβ26, the incorporation of a target gene into a site that is not susceptible to gene silencing effects can be achieved through few …


Detecting Gene-Gene Interactions Using A Permutation-Based Random Forest Method, Jing Li, James D. Malley, Angeline S. Andrew, Margaret R. Karagas, Jason H. Moore Apr 2016

Dartmouth Scholarship

Identifying gene-gene interactions is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Here, we aimed to develop a permutation-based methodology relying on a machine learning method, random forest (RF), to detect gene-gene interactions. Our approach, called permuted random forest (pRF), identified the top interacting single nucleotide polymorphism (SNP) pairs by estimating how much the power of a random forest classification model is influenced by removing pairwise interactions.


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang Feb 2016

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …
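
For readers unfamiliar with NMF, the sketch below factorizes a small non-negative matrix V into W and H using the classic Lee-Seung multiplicative updates for the squared (Frobenius) error. It is a plain-Python illustration only: it is not the toolbox's code, the abstract does not say which update rules or divergences the toolbox implements, and the function names, toy matrix, rank, and iteration count are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (n x m) as W (n x rank) times H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix whose rows are combinations of [1, 2, 3] and [1, 0, 1].
V = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Each iteration here costs several dense matrix products, which hints at why the computational cost grows quickly with matrix size and why a high-performance implementation is attractive for large data sets.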


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Leveraging Global Gene Expression Patterns To Predict Expression Of Unmeasured Genes, James Rudd, René A. Zelaya, Eugene Demidenko, Ellen L. Goode, Casey S. Greene, Jennifer A. Doherty Dec 2015

Dartmouth Scholarship

Background

Large collections of paraffin-embedded tissue represent a rich resource to test hypotheses based on gene expression patterns; however, measurement of genome-wide expression is cost-prohibitive on a large scale. Using the known expression correlation structure within a given disease type (in this case, high grade serous ovarian cancer; HGSC), we sought to identify reduced sets of directly measured (DM) genes which could accurately predict the expression of a maximized number of unmeasured genes.


Systems Level Analysis Of Systemic Sclerosis Shows A Network Of Immune And Profibrotic Pathways Connected With Genetic Polymorphisms, J. Matthew Mahoney, Jaclyn Taroni, Viktor Martyanov, Tammara A. A. Wood, Casey S. Greene, Patricia A. Pioli, Monique E. Hinchcliff, Michael L. Whitfield Jan 2015

Dartmouth Scholarship

Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6-12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes …