Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Articles 1 - 11 of 11

Full-Text Articles in Bioinformatics

Specialized Named Entity Recognition For Breast Cancer Subtyping, Griffith Scheyer Hawblitzel Jun 2022

Master's Theses

The volume of data and analysis being published and archived in the biomedical research community is more than can feasibly be sifted through manually, which limits the information an individual or small group can synthesize and integrate into their own research. This presents an opportunity for using automated methods, including Natural Language Processing (NLP), to extract important information from text on various topics. Named Entity Recognition (NER) is one way to automate knowledge extraction from raw text. NER is defined as the task of identifying named entities in text and assigning them labels such as people, dates, locations, diseases, and proteins. There …


Framework For The Evaluation Of Perturbations In The Systems Biology Landscape And Inter-Sample Similarity From Transcriptomic Datasets — A Digital Twin Perspective, Mariah Marie Hoffman Jan 2022

Dissertations and Theses

One approach to interrogating the complexities of human systems in their well-regulated and dysregulated states is through the use of digital twins. Digital twins are virtual representations of physical systems that are descriptive of an individual's state of health, an object fundamentally related to precision medicine. A key element for building a functional digital twin type for a disease or predicting the therapeutic efficacy of a potential treatment is harmonized, machine-parsable domain knowledge. Hypothesis-driven investigations are the gold standard for representing subsystems, but their results encompass a limited knowledge of the full biosystem. Multi-omics data is one rich source of …


Computational Analysis And Prediction Of Intrinsic Disorder And Intrinsic Disorder Functions In Proteins, Akila I. Katuwawala Jan 2021

Theses and Dissertations

By Akila Imesha Katuwawala

A dissertation submitted in partial fulfillment of the requirements for the degree of Engineering, Doctor of Philosophy with a concentration in Computer Science at Virginia Commonwealth University.

Virginia Commonwealth University, 2021

Director: Lukasz Kurgan, Professor, Department of Computer Science

Proteins, as a fundamental class of biomolecules, have been studied from various perspectives over the past two centuries. The traditional notion is that proteins require fixed and stable three-dimensional structures to carry out biological functions. However, there is mounting evidence regarding a “special” class …


Individualized Clinical Practice Guidelines For Pressure Injury Management: Development Of An Integrated Multi-Modal Biomedical Information Resource, Kathie M. Bogie, Guo-Qiang Zhang, Steven K. Roggenkamp, Ningzhou Zeng, Jacinta Seton, Shiqiang Tao, Arielle L. Bloostein, Jiayang Sun Sep 2018

Institute for Biomedical Informatics Faculty Publications

Background: Pressure ulcers (PU) and deep tissue injuries (DTI), collectively known as pressure injuries, are serious complications that cause staggering costs and human suffering, with over 200 reported risk factors from many domains. Primary pressure injury prevention seeks to prevent the first incidence, while secondary PU/DTI prevention aims to decrease chronic recurrence. Clinical practice guidelines (CPG) combine evidence-based practice and expert opinion to aid clinicians in the goal of achieving best practices for primary and secondary prevention. The correction of all risk factors can be both overwhelming and impractical to implement in clinical practice. There is a need to develop practical …


A Study Of Scalability And Cost-Effectiveness Of Large-Scale Scientific Applications Over Heterogeneous Computing Environment, Arghya K. Das Jun 2018

LSU Doctoral Dissertations

Recent advances in large-scale experimental facilities have ushered in an era of data-driven science. These large-scale data increase the opportunity to answer many fundamental questions in basic science. However, they also pose new challenges to the scientific community in terms of their optimal processing and transfer. Consequently, scientists are in dire need of robust high performance computing (HPC) solutions that can scale with terabytes of data.

In this thesis, I address the challenges in three major aspects of scientific big data processing as follows: 1) Developing scalable software and algorithms for data- and compute-intensive scientific applications. 2) Proposing new cluster architectures …


Efficient Alignment Algorithms For DNA Sequencing Data, Nilesh Vinod Khiste Jan 2018

Electronic Thesis and Dissertation Repository

DNA Next Generation Sequencing (NGS) technologies produce data at low cost, enabling their application to many ambitious fields such as cancer research, disease control, and personalized medicine. However, even after a decade of research, modern aligners and assemblers are far from providing efficient and error-free genome alignments and assemblies, respectively. This is due to the inherent nature of the genome alignment and assembly problem, which involves many complexities. Many algorithms to address this problem have been proposed over the years, but there is still considerable scope for improvement in this research space.

Many new genome …


Identification Of Prognostic Cancer Biomarkers Through The Application Of RNA-Seq Technologies And Bioinformatics, Nathan Wong Dec 2017

McKelvey School of Engineering Theses & Dissertations

MicroRNAs (miRNAs) are short single-stranded RNAs that function as the guide sequence of the post-transcriptional regulatory process known as the RNA-induced silencing complex (RISC), which targets mRNA sequences for degradation through complementary binding to the guide miRNA. Changes in miRNA expression have been reported as correlated with numerous biological processes, including embryonic development, cellular differentiation, and disease manifestation. In the latter case, dysregulation has been observed in response to infection by human papillomavirus (HPV), which has been established both as oncogenic in cervical and oropharyngeal cancers and as favorable for overall patient survival after tumor formation. The identification of …


Collaborative Research: North East Cyberinfrastructure Consortium, Michael Eckardt, Vicki Nemeth, Carolyn Mattingly May 2014

University of Maine Office of Research Administration: Grant Reports

The North East Cyberinfrastructure Consortium has finished its third year of Track-2 funding. In this report we summarize our overall progress and progress for Year 3.

In 2006, we began to organize as the five North Eastern EPSCoR states (ME, NH, VT, RI, DE) around cyberinfrastructure. The box below describes the state of cyberinfrastructure in 2008, by which time we had developed the North East Cyberinfrastructure Consortium to position ourselves for grant opportunities that would help us to address our cyber deficits.

The Track-2 collaborative proposal submitted in January 2009 was designed to address these barriers in order to enable our …


Towards The Prediction Of Mutations In Genomic Sequences, Juan Carlos Martinez Nov 2013

FIU Electronic Theses and Dissertations

Bio-systems are inherently complex information processing systems. Furthermore, the physiological complexities of biological systems limit both the formation of hypotheses about their behavior and the ability to test those hypotheses. More importantly, the identification and classification of mutations in patients are central topics in today’s cancer research.

Next generation sequencing (NGS) technologies can provide genome-wide coverage at a single nucleotide resolution and at reasonable speed and cost. The unprecedented molecular characterization provided by NGS offers the potential for an individualized approach to treatment. These advances in cancer genomics have enabled scientists to interrogate cancer-specific genomic variants and compare them with the …


Classification Of Genomic Sequences By Latent Semantic Analysis, Samuel F. Way Aug 2012

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Evolutionary distance measures provide a means of identifying and organizing related organisms by comparing their genomic sequences. As such, techniques that quantify the level of similarity between DNA sequences are essential in our efforts to decipher the genetic code in which they are written.

Traditional methods for estimating the evolutionary distance separating two genomic sequences often require that the sequences first be aligned before they are compared. Unfortunately, this preliminary step imposes a great computational burden, making this class of techniques impractical for applications involving a large number of sequences. Instead, we desire new methods for differentiating genomic sequences that eliminate …


Computational Genomic Signatures And Metagenomics, Ozkan U. Nalbantoglu Apr 2011

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Mathematical characterizations of biological sequences form one of the main elements of bioinformatics. In this work, a class of DNA sequence characterizations, namely computational genomic signatures, which capture global features of these sequences, is used to address emerging computational biology challenges. Because of the species specificity and pervasiveness of genome signatures, it is possible to use these signatures to characterize and identify a genome or a taxonomic unit using a short genome fragment from that source. However, the identification accuracy is generally poor when the sequence model and the sequence distance measure are not selected carefully. We show that the …