Open Access. Powered by Scholars. Published by Universities.®

Library and Information Science Commons

Full-Text Articles in Library and Information Science

Using The Web 1T 5-Gram Database For Attribute Selection In Formal Concept Analysis To Correct Overstemmed Clusters, Guymon Hall, May 2014

UNLV Theses, Dissertations, Professional Papers, and Capstones

Information retrieval is the process of finding information in an unstructured collection of data. It typically involves building an index, commonly called an inverted file. When building the inverted file, information retrieval algorithms often stem words, that is, reduce each document term to a common root. There are many ways to stem a word; affix removal and successor variety are two common categories of stemmers. The Porter Stemming Algorithm is a suffix removal stemmer that operates as a rule-based process on English words. We can think of stemming as a way to cluster …
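
To make the ideas in the abstract concrete, the following is a minimal Python sketch of an inverted file built over stemmed terms. It is an illustration only, not the thesis's method: `toy_stem`, its `SUFFIXES` list, `build_inverted_file`, `stem_clusters`, and the sample documents are all hypothetical, and the stemmer is a deliberately crude suffix stripper, not the Porter Stemming Algorithm. It shows how stemming groups words into clusters, and how an overly aggressive rule can place unrelated words (here "corn" and "corner") in the same cluster, which is what overstemming refers to.

```python
from collections import defaultdict

# Hypothetical suffix list, checked in order (longer suffixes first).
# This is NOT the Porter rule set, only a toy stand-in for suffix removal.
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a root of at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_inverted_file(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each stemmed term to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return dict(index)


def stem_clusters(words: list[str]) -> dict[str, set[str]]:
    """Group words by their stem; each group is one stem cluster."""
    clusters: dict[str, set[str]] = defaultdict(set)
    for word in words:
        clusters[toy_stem(word)].add(word.lower())
    return dict(clusters)


# Made-up sample documents, for illustration only.
docs = {
    1: "indexing large collections",
    2: "the indexer indexed one collection",
    3: "corn grows in the corner",
}

index = build_inverted_file(docs)
clusters = stem_clusters(["indexing", "indexed", "indexer", "corn", "corner"])

print(index["index"])    # documents 1 and 2 share the stem "index"
print(clusters["corn"])  # "corn" and "corner" are conflated: an overstemmed cluster
```

In this sketch, a query for "indexed" would be stemmed to "index" and matched against documents 1 and 2, which is the intended benefit of stemming. The "corn" cluster shows the cost: a query for "corner" would also retrieve documents about corn.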