Open Access. Powered by Scholars. Published by Universities.®
Health Information Technology Commons™
Full-Text Articles in Health Information Technology
Data Mining Of Pancreatic Cancer Protein Databases, Peter Revesz, Christopher Assi
CSE Conference and Workshop Papers
Data mining of protein databases poses special challenges because many protein databases are non-relational, whereas most data mining and machine learning algorithms assume the input data to be a type of relational database that is also representable as an ARFF file. We developed a method to restructure protein databases so that they become amenable to various data mining and machine learning tools. Our restructuring method enabled us to apply both decision tree and support vector machine classifiers to a pancreatic protein database. The SVM classifier that used both GO term and PFAM families to characterize proteins gave …
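As a rough illustration of the kind of restructuring the abstract describes, the sketch below flattens multi-valued protein annotations into one binary attribute per term and writes the result as ARFF text. It is not the authors' method, which the abstract does not detail. The function name `to_arff`, the record layout (`terms`, `label`), and the sample identifiers and class labels are all hypothetical and chosen only for this example.

```python
def to_arff(relation, proteins, class_attr="label"):
    """Flatten set-valued annotations into one 0/1 attribute per term (assumed layout)."""
    # Collect every distinct annotation term (e.g. GO terms, PFAM families), sorted.
    terms = sorted({t for p in proteins for t in p["terms"]})
    # Collect the distinct class labels for the nominal class attribute.
    labels = sorted({p[class_attr] for p in proteins})

    lines = [f"@RELATION {relation}", ""]
    for t in terms:
        # Quote the name because identifiers such as GO:0005515 contain a colon.
        lines.append(f"@ATTRIBUTE '{t}' {{0,1}}")
    lines.append(f"@ATTRIBUTE {class_attr} {{{','.join(labels)}}}")
    lines += ["", "@DATA"]

    for p in proteins:
        # One row per protein: 1 if the protein carries the term, else 0.
        row = ["1" if t in p["terms"] else "0" for t in terms]
        row.append(p[class_attr])
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical records; the identifiers and labels are placeholders, not real data.
proteins = [
    {"id": "P1", "terms": {"GO:0005515", "PF00069"}, "label": "yes"},
    {"id": "P2", "terms": {"GO:0005515"}, "label": "no"},
]
arff = to_arff("proteins", proteins)
print(arff)
```

Each protein becomes a fixed-width row, so a variable number of annotations per protein no longer prevents loading the data into tools that expect a single flat table.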