Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 23 of 23

Full-Text Articles in Entire DC Network

A Method For Identifying Ancient Introgression Between Caballine And Non-Caballine Equids Using Whole Genome High Throughput Data., Kalpani De Silva Dec 2021

Electronic Theses and Dissertations

Introgression is one of the main mechanisms that transfer adapted alleles between species. Advantageous variants are positively selected and retained in the recipient population, while the rest of the variants undergo negative selection. When analyzing the horse genome, two alleles were found in the CXCL16 gene, one associated with susceptibility and one with resistance to developing persistent shedding of the Equine Arteritis Virus. The two alleles differ by 4 non-synonymous variants in exon 1 of the gene. Comparison with 3 non-caballine equids (zebras, asses and hemiones) revealed that one haplotype was almost identical to the haplotype found in non-caballines while the …


Deciphering The Perpetual Fight Between Virus And Host: Utilizing Bioinformatics To Elucidate The Host's Genetic Mechanisms That Influence Jc Polyomavirus Infection, Michael P. Wilczek Aug 2021

Electronic Theses and Dissertations

JC polyomavirus (JCPyV) is a human-specific pathogen that infects 50-80% of the population and can cause a deadly demyelinating disease known as progressive multifocal leukoencephalopathy (PML). In most of the population, JCPyV persistently infects the kidneys, but during immunosuppression it can reactivate and spread to the central nervous system (CNS), causing PML. In the CNS, JCPyV targets two cell types: astrocytes and oligodendrocytes. Due to the hallmark pathology of oligodendrocyte lysis observed in disease, oligodendrocytes were thought to be the main cell type involved during JCPyV infection. However, recent evidence suggests that astrocytes are targeted by the virus and act …


Unveiling Global Roles Of G-Quadruplexes And G4-22 In Human Genetics, Ruth Barros De Paula Aug 2021

Dissertations & Theses (Open Access)

G-quadruplexes are non-B DNA structures formed by four or more runs of repeated guanines that confer unique features to the genomes of living organisms. These sequences are enriched in regulatory regions, such as promoters and 5’ UTRs, and have distinct regulatory roles in both health and disease states. Even though previous studies showed the impact of G4 on gene expression, none of them summarized the location-specific effect of G4. Also, there is no broad understanding of the most common G4 repeat in the human genome, named here as G4-22, and how it links to the evolution of mammals and their biology. In …
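
As an illustration of the "runs of repeated guanines" description above, the sketch below scans a DNA string for putative G-quadruplex motifs. It assumes the commonly used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; that pattern, the function name find_g4_motifs, and the example sequence are assumptions made for illustration and are not taken from the dissertation.

```python
import re

# Assumed motif: four G-runs (>= 3 guanines each) separated by loops of 1-7 nt.
# This is a widely used heuristic, not necessarily the definition used in the thesis.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_g4_motifs(sequence):
    """Return (start, end, text) for each non-overlapping putative G4 motif."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Hypothetical example sequence containing exactly four GGG runs.
example = "ATGGGAGGGTTGGGACGGGTA"
hits = find_g4_motifs(example)
for start, end, text in hits:
    print(start, end, text)
```

The scan covers only the given strand; motifs on the reverse strand would appear as runs of cytosines and would need a second pass over the reverse complement.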


Comparative Genomics Methods And Applications, Emily N. Alden Jul 2021

Biomedical Sciences ETDs

Virtually all fields of biology have benefited from the advancements in comparative genomics technologies, specifically in the study of evolution. In this dissertation I develop and use comparative genomic technologies to investigate the novel SARS-CoV-2 virus, assemble the first genome of the black lace domestic angelfish, and identify germline genetic variants associated with altered breast cancer-specific survival. Our genome tiling array for the novel coronavirus presents a rapid and cost-effective method to sequence the entire viral genome and can be used to track the rapid evolution of viral variants in the population. The domestic angelfish is a member of the …


Biol 4010w/7190g/Cisc2810w: Macromolecular Structure And Bioinformatics, Shaneen Singh Jul 2021

Open Educational Resources

No abstract provided.


Characterization Of Iron-Sulfur Cluster Biogenesis In Methanogenic Archaea, Thomas Modlin Deere Jul 2021

Graduate Theses and Dissertations

Iron-sulfur (Fe-S) clusters are among the oldest cofactors on the planet, used by proteins in almost all forms of life on Earth to carry out processes ranging from energy transfer to DNA replication. Among the organisms believed to use these Fe-S proteins more extensively than almost any others are the methanogens, an ancient lineage of archaeal microbes that produce methane as a required product of their metabolism. Methane, the primary component of commercial natural gas, is both a potent greenhouse gas and an important fossil fuel. It can also be renewably produced as a biofuel. Biogenic methane is almost entirely …


Biomedical Informatics Colloquium, Bio 4050, Course Outline, Eugenia G. Giannopoulou May 2021

Open Educational Resources

A seminar-based course that exposes students to current research topics in the fields of Bioinformatics and Medical Informatics. Weekly presentations by invited speakers and/or faculty introduce students to the broad diversity of research areas in both fields, and engage them in critical thinking and writing. Online lectures and reading activities will be given periodically.


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features but a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …


Simulation Of The Interaction Between Striated Muscle Unc-45 And Transcription Factor Gata-4, Drake Alexander Duncan May 2021

Electronic Theses and Dissertations

Striated Muscle UNC-45, also known as UNC-45b, is an important protein that acts as a chaperone for myosin in cardiac and skeletal muscles, binding to myosin at its C-terminal UCS domain and regulating its assembly into thick filaments and sarcomeric structures. The UCS domain contains a large loop that is believed to be the first point of interaction between myosin and UNC-45b. GATA-4 is an essential transcription factor that facilitates transcription of several genes in cardiac development, particularly alpha-heavy chain myosin in heart tissue. Recent studies have shown that GATA-4 interacts with UNC-45b and that GATA-4 binds …


Trunctrimmer: A First Step Towards Automating Standard Bioinformatic Analysis, Z. Gunner Lawless, Dana Dittoe, Dale R. Thompson, Steven C. Ricke May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

Bioinformatic analysis is a time-consuming process for labs performing research on various microbiomes. Researchers use tools like Qiime2 to help standardize the bioinformatic analysis methods, but even large, extensible platforms like Qiime2 have drawbacks due to the attention required by researchers. In this project, we propose to automate additional standard lab bioinformatic procedures by eliminating the existing manual process of determining the trim and truncate locations for paired end 2 sequences. We introduce a new Qiime2 plugin called TruncTrimmer to automate the process that usually requires the researcher to make a decision on where to trim and truncate manually after …


Visualization And Interpretation Of Protein Interactions, Dipanjan Chatterjee Apr 2021

Electronic Thesis and Dissertation Repository

Visualization and interpretation of deep learning models' predictions is currently a very important area of research in machine learning. Researchers are focused not only on generating a model with good performance, but also on being able to trust the model. Our aim in this thesis is to adapt existing interpretation methods to a protein-protein binding site prediction problem to visualize and understand the model's prediction and learning pattern.

We present three deep learning-based interpretation methods (sensitivity analysis, saliency maps, and integrated gradients) to analyze the amino acid residues which create positive and negative relevance to the deep learning models' prediction. As …


A Test Of Rad Capture Sequencing On Ethanol-Preserved Centennial And Contemporary Specimens Of Philippine Fishes, Madeleine I. Kenton Apr 2021

Biological Sciences Theses & Dissertations

Understanding the relationship between ecological characteristics and genetic change in natural populations across different time scales can reveal how anthropogenic stressors affect natural populations and can improve the success of conservation strategies. The purpose of the Philippines Partnerships for International Research and Education (PIRE) project is to examine levels of genetic change between historical fish samples collected by the USS Albatross expedition in the early 1900s in the Philippines and contemporary populations collected at the same localities. This study tests genetic protocols to process historical and contemporary DNA for simultaneous comparison. Two DNA library preparation methods, single digest RADseq (“un-baited” …


A Novel Approach To Teaching Hidden Markov Models To A Diverse Undergraduate Population, Philip Heller, Pratyusha Pogaru Mar 2021

Faculty Research, Scholarly, and Creative Activity

Hidden Markov Models (HMMs) are an essential tool for Bioinformatic analysis, with extensive success at finding patterns (e.g. CRISPR arrays or genes of interest) in DNA or protein sequences. HMMs are conceptually intricate, and the algorithms that make use of them are complicated. Thus they present a challenge to Bioinformatics instructors at the undergraduate level, particularly when the students’ educational backgrounds are broadly diverse. At San Jose State University, many undergraduate Bioinformatics students are Biology majors with little or no prior coursework in mathematics, statistics, or programming. For this population a theory-based approach to teaching HMMs would be ineffective. To …


The Utilization And Optimization Of Omics Trait Prediction Models Within And Across Diverse Populations, Ashley Mulford Jan 2021

Master's Theses

Most cancer chemotherapeutic agents are ineffective in a subset of patients; thus, it is important to consider the role of genetic variation in drug response. Lymphoblastoid cell lines (LCLs) derived from 1000 Genomes Project populations of diverse ancestries are a useful model for determining how genetic factors impact variation in cytotoxicity. In our study, LCLs from three 1000 Genomes Project populations of diverse ancestries were previously treated with increasing concentrations of eight chemotherapeutic drugs and cell growth inhibition was measured at each dose with half-maximal inhibitory concentration (IC50) or area under the dose-response curve (AUC) as our phenotype for each …
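
To make the AUC phenotype mentioned above concrete, the following is a minimal sketch that integrates a dose-response curve with the trapezoidal rule. The abstract does not say how the AUC was computed, so the integration method, the function name dose_response_auc, and the example values are all assumptions for illustration only.

```python
def dose_response_auc(doses, responses):
    """Trapezoidal area under a dose-response curve.

    doses and responses are equal-length sequences of numbers; points are
    sorted by dose before integrating. This is an assumed method, not the
    procedure reported in the thesis.
    """
    if len(doses) != len(responses) or len(doses) < 2:
        raise ValueError("need at least two (dose, response) points")
    points = sorted(zip(doses, responses))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Area of the trapezoid between two neighbouring dose points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical example: relative cell growth at three increasing doses.
example_auc = dose_response_auc([0.0, 1.0, 2.0], [1.0, 0.5, 0.0])
print(example_auc)
```

In practice doses are often log-transformed before integration; that choice changes the numeric value and would need to match whatever convention the study used.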


Elucidating The Role Of The Tyrosine Phosphatase, Shp-2, In Regulation Of Pd-L1 Expression In Non-Small Lung Cancer Using Both Biochemical Analyses And Real-World Genomic Information, Keller Toral Jan 2021

Theses and Dissertations--Pharmacy

Immune checkpoint inhibitors (ICIs), especially those that target programmed cell death protein 1 (PD-1) and programmed cell death ligand-1 (PD-L1), have been shown to provide substantial clinical benefit in many patients with non-small cell lung cancer (NSCLC). While these therapeutic agents can be highly effective in the correct context, the biological systems that malignant cells draft from normal activities of the cell are poorly characterized. Tumor cell-specific expression of PD-L1 is likely important for clinical benefit from PD-1 and PD-L1 inhibitors. It is known that PD-L1 is inappropriately expressed in many cancers harboring mutations in the RAS family of genes. …


Machine Learning And Bioinformatic Insights Into Key Enzymes For A Bio-Based Circular Economy, Japheth E. Gado Jan 2021

Theses and Dissertations--Chemical and Materials Engineering

The world is presently faced with a sustainability crisis; it is becoming increasingly difficult to meet the energy and material needs of a growing global population without depleting and polluting our planet. Greenhouse gases released from the continuous combustion of fossil fuels engender accelerated climate change, and plastic waste accumulates in the environment. There is a need for a circular economy, where energy and materials are renewably derived from waste items, rather than by consuming limited resources. Deconstruction of the recalcitrant linkages in natural and synthetic polymers is crucial for a circular economy, as deconstructed monomers can be used to manufacture …


The Role Of Software Engineering In Bioinformatics, Brendan Sean Lawlor Jan 2021

Theses

This thesis proposes that by applying state-of-the-art software engineering tools, techniques and frameworks to currently recognised challenges in bioinformatics, improved outcomes can be attained in that field. It begins by decomposing software engineering into two categories, namely process and architecture, and choosing two key challenges in the practice of bioinformatics: reproducibility and scalability. The body of the thesis is an exploration of the intersection between these two software engineering categories and these two bioinformatics challenges. The question is asked: Can best practices in professional software engineering be applied to address key issues in the bioinformatics domain, creating positive outcomes? And …


An Automated Method To Enrich And Expand Consumer Health Vocabularies Using Glove Word Embeddings, Mohammed Ibrahim Jan 2021

Graduate Theses and Dissertations

Clear language makes communication easier between any two parties. However, a layman may have difficulty communicating with a professional due to not understanding the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical jargon, which can lead to poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map lay medical terms to professional medical terms and vice versa. Many of the presented vocabularies are built manually or semi-automatically, requiring large investments of time and human effort and consequently the slow …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


Computational Analysis And Prediction Of Intrinsic Disorder And Intrinsic Disorder Functions In Proteins, Akila I. Katuwawala Jan 2021

Theses and Dissertations

By Akila Imesha Katuwawala

A dissertation submitted in partial fulfillment of the requirements for the degree of Engineering, Doctor of Philosophy with a concentration in Computer Science at Virginia Commonwealth University.

Virginia Commonwealth University, 2021

Director: Lukasz Kurgan, Professor, Department of Computer Science

Proteins, as a fundamental class of biomolecules, have been studied from various perspectives over the past two centuries. The traditional notion is that proteins require fixed and stable three-dimensional structures to carry out biological functions. However, there is mounting evidence regarding a “special” class …


Distribution And Diversity Of Heliothine And Other Lepidopteran Nudiviruses, Emrah Ozel Jan 2021

Theses and Dissertations--Entomology

Helicoverpa zea nudivirus 2 (HzNV-2) is the only known sterilizing and sexually-transmitted insect virus and causes pathological symptoms in H. zea reproductive tissues. HzNV-2 has features that make it a candidate as a H. zea (corn earworm) control agent, such as the ability to cause asymptomatic (latent) and symptomatic (lytic) infections and the ability to influence mating behavior of its host to favor virus spread. HzNV pathology has been studied and its genome sequenced; however, its prevalence in natural populations is largely unknown. In this study, we developed and used a low-cost PCR-based molecular survey to investigate HzNV-2 prevalence and …


Soda: An Open-Source Library For Visualizing Biological Sequence Annotation, Jack W. Roddy, Travis J. Wheeler Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Genome annotation is the process of identifying and labeling known genetic sequences or features within a genome. Across the various subfields within modern molecular biology, there is a common need for the visualization of such annotations. Genomic data is often visualized on web browser platforms, providing users with easy access to visualization tools without the need for installing any software or, in many cases, underlying datasets. While there exists a broad range of web-based visualization tools, there is, to my knowledge, no lightweight, modern library tailored towards the visualization of genomic data. Instead, developers charged with the task of producing …


Composition And Homology In The Taxonomic Classification Of Escherichia Coli, Tanya Irani Jan 2021

Theses and Dissertations (Comprehensive)

As new techniques have been introduced, specifically the possibility of complete genome sequencing, better methods of defining bacterial species have also been proposed. One of the most recently proposed methods, using bioinformatic techniques, is to calculate the average nucleotide identity (ANI) between the homologous genome segments of different isolates. Another method for species discrimination that has been tested successfully is the similarity of DNA compositional signatures. However, in a recent update, DNA signatures split the available Escherichia coli complete genomes into three groups. To check if this result was consistent with such genomes belonging to different species, we tested methods …
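
The idea of average nucleotide identity (ANI) described above can be sketched as follows. The sketch assumes the homologous segments have already been identified and pairwise aligned (gaps written as "-"), and it takes an unweighted mean over segments. Published ANI tools apply additional filters and weighting, and the thesis's own procedure is not shown here, so the function names and the simplifications are assumptions.

```python
def segment_identity(a, b):
    """Fraction of identical bases between two aligned, equal-length segments.

    Columns containing a gap character '-' in either segment are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned segments must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in segment pair")
    return matches / compared


def average_nucleotide_identity(aligned_pairs):
    """Unweighted mean identity over pre-aligned homologous segment pairs."""
    identities = [segment_identity(a, b) for a, b in aligned_pairs]
    if not identities:
        raise ValueError("no segment pairs given")
    return sum(identities) / len(identities)


# Hypothetical aligned segments from two isolates.
example_pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA")]
example_ani = average_nucleotide_identity(example_pairs)
print(example_ani)
```

A length-weighted mean would be a reasonable alternative when segments differ greatly in size; the unweighted form is used here only to keep the example short.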