Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

University of South Florida

Full-Text Articles in Physical Sciences and Mathematics

Combining Natural Language Processing And Statistical Text Mining: A Study Of Specialized Versus Common Languages, Jay Jarman, Jan 2011

USF Tampa Graduate Theses and Dissertations

This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms, such as association rule mining and decision tree induction, are used to discover classification rules for specific targets. This multi-stage pipeline approach is contrasted with traditional statistical text mining (STM) methods based on term counts and term-by-document frequencies. The aim is to create effective text analytic processes by adapting and combining individual …
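The contrast drawn in the abstract, between term-count features and concepts extracted against a controlled vocabulary, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the vocabulary, concept IDs, sample sentences, and function names below are all hypothetical, and phrase matching stands in for a real NLP parser. The sketch stops at feature extraction; the rule-discovery stage (association rules or decision trees) is not shown.

```python
from collections import Counter
import re

# Hypothetical controlled vocabulary: surface phrase -> concept ID.
# Both the phrases and the IDs are invented for illustration.
VOCAB = {
    "shortness of breath": "C_DYSPNEA",
    "dyspnea": "C_DYSPNEA",
    "chest pain": "C_CHEST_PAIN",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """STM-style features: one row of raw term counts per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    terms = sorted(set().union(*counts))
    return terms, [[c[t] for t in terms] for c in counts]


def extract_concepts(text):
    """Concept-style features: vocabulary phrases found in the text,
    mapped to concept IDs. Simple phrase matching only; it does not
    handle negation ("no chest pain"), which a real NLP stage would."""
    normalized = " ".join(tokenize(text))
    return {
        concept_id
        for phrase, concept_id in VOCAB.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized)
    }


# Made-up example notes, not data from the study.
docs = [
    "Patient reports shortness of breath.",
    "Dyspnea noted on exertion; no chest pain.",
]

terms, matrix = term_document_matrix(docs)
concepts = [extract_concepts(d) for d in docs]

print(terms)
print(matrix)
print(concepts)
```

In this toy input the two notes share no terms, so their term-count rows do not overlap, yet both map to the same hypothetical concept ID. Either representation could then be passed to a classifier of the kind the abstract mentions.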