Open Access. Powered by Scholars. Published by Universities.®

Amino Acids, Peptides, and Proteins Commons

Articles 1 - 2 of 2

Full-Text Articles in Amino Acids, Peptides, and Proteins

Enhancement Of Deep Learning Protein Structure Prediction, Ruoming Shen Apr 2023

Modeling, Simulation and Visualization Student Capstone Conference

Protein modeling is a rapidly expanding field with valuable applications in the pharmaceutical industry. Accurate protein structure prediction facilitates drug design, as extensive knowledge about the atomic structure of a given protein enables scientists to target that protein in the human body. However, protein structure identification in certain types of protein images remains challenging, with medium-resolution cryogenic electron microscopy (cryo-EM) protein density maps particularly difficult to analyze. Recent advancements in computational methods, namely deep learning, have improved protein modeling. To maximize its accuracy, a deep learning model requires copious amounts of up-to-date training data.

This project explores DeepSSETracer, a …


An Approach To Developing Benchmark Datasets For Protein Secondary Structure Segmentation From Cryo-EM Density Maps, Thu Nguyen, Yongcheng Mu, Jiangwen Sun, Jing He Jan 2023

Computer Science Faculty Publications

A growing number of deep learning approaches have been proposed to segment secondary structures from cryo-electron density maps in the medium resolution range (5–10 Å). Although these approaches show great potential, only a few small experimental data sets have been used to test them. There is limited understanding of the factors in the data that affect segmentation performance. We propose an approach to generate data sets with desired specifications in three potential factors: protein sequence identity, structural content, and data quality. The approach was implemented and has generated a test set and various training sets to study …