Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Bioinformatics

University of Tennessee, Knoxville

Doctoral Dissertations

Machine Learning

Full-Text Articles in Entire DC Network

Computational Analysis Of Microbial Sequence Data Using Statistics And Machine Learning, Zhixiu Lu May 2023

Doctoral Dissertations

Since the discovery of the double helix of DNA in 1953, modern molecular biology has opened the door to a better understanding of how genes control chemical processes within cells, including protein synthesis. Although we are still far from claiming a complete understanding, recent advances in sequencing technologies, increased computational capacity, and more sophisticated computational methods have allowed the development of various new applications that provide further insight into DNA sequence data and how the information they encode impacts living organisms and their environment. Sequencing data can now be used to start identifying the relationships between microorganisms, where they live, …


Better Understanding Genomic Architecture With The Use Of Applied Statistics And Explainable Artificial Intelligence, Jonathon C. Romero Aug 2022

Doctoral Dissertations

With continuous improvements in biological data collection, new techniques are needed to better understand the complex relationships in genomic and other biological data sets. Explainable Artificial Intelligence (X-AI) techniques like Iterative Random Forest (iRF) excel at finding interactions within data, such as genomic epistasis. Here, new methods for mining these complex interactions are introduced and demonstrated in a variety of scenarios. The application of iRF as a method for Genome-Wide Epistasis Studies shows that the method is robust in finding interacting sets of features in synthetic data, without requiring the exponentially increasing computation time of many …