Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 2 of 2
Full-Text Articles in Physical Sciences and Mathematics
An Integrated Bioinformatic/Experimental Approach For Discovering Novel Type II Polyketides Encoded In Actinobacterial Genomes, Wubin Gao
Chemistry and Chemical Biology ETDs
Discovery of new natural products (NPs) is critical for both disease treatment and crop protection. The numerous NP biosynthetic gene clusters (BGCs) in sequenced microbial genomes allow identification of new NPs through genome mining. Developing an integrated bioinformatic/experimental approach for discovering novel type II polyketides (PK-IIs) facilitates investigation of this family of NPs in an efficient, systematic way. Here, we developed an approach to analyze ketosynthase α/β (KSα/β) gene sequences to predict PK-II core structures, allowing us to target novel PK-II BGCs either from isolated genomic DNA or genomes from the NCBI databank, and to isolate novel PK-IIs produced by these …
Mining Public Databases For Discovery Of Structure And Function Within The Hotdog-Fold Thioesterase And HAD Phosphatase Enzyme Families, Sarah Toews Keating
Chemistry and Chemical Biology ETDs
For my doctoral work, I have developed strategies to mine public databases for data that can be used to infer structural and functional information for the hotdog-fold and HADSF superfamilies. For the hotdog-fold superfamily, I used curated and automatically applied annotations of structure, taxonomic lineage, function, and subfamily membership from the UniProtKB, gene context and taxonomic information from the NCBI, and the results of several in-depth explorations of subfamily/function and structural class membership. Based on the distribution of the aforementioned annotations mapped onto a sequence similarity network (SSN), I applied structural assignments to sequences and/or specific function/subfamily assignments to ~143,000 …