Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Chemistry

University of New Mexico

Theses/Dissertations

Bioinformatics

Full-Text Articles in Physical Sciences and Mathematics

An Integrated Bioinformatic/Experimental Approach For Discovering Novel Type II Polyketides Encoded In Actinobacterial Genomes, Wubin Gao Jul 2017

Chemistry and Chemical Biology ETDs

Discovery of new natural products (NPs) is critical for both disease treatment and crop protection. The numerous NP biosynthetic gene clusters (BGCs) in sequenced microbial genomes allow identification of new NPs through genome mining. An integrated bioinformatic/experimental approach for discovering novel type II polyketides (PK-IIs) facilitates efficient, systematic investigation of this family of NPs. Here, we developed an approach to analyze ketosynthase α/β (KSα/β) gene sequences to predict PK-II core structures, allowing us to target novel PK-II BGCs either from isolated genomic DNA or genomes from the NCBI databank, and to isolate novel PK-IIs produced by these …


Mining Public Databases For Discovery Of Structure And Function Within The Hotdog-Fold Thioesterase And HAD Phosphatase Enzyme Families, Sarah Toews Keating Sep 2015

Chemistry and Chemical Biology ETDs

For my doctoral work, I have developed strategies to mine public databases for data that can be used to infer structural and functional information for the hotdog-fold and HADSF superfamilies. For the hotdog-fold superfamily, I used curated and automatically applied annotations of structure, taxonomic lineage, function, and subfamily membership from the UniProtKB, gene context and taxonomic information from the NCBI, and the results of several in-depth explorations of subfamily/function and structural class membership. Based on the distribution of the aforementioned annotations mapped onto a sequence similarity network (SSN), I applied structural assignments to sequences and/or specific function/subfamily assignments to ~143,000 …