Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Theses/Dissertations

Document retrieval

Full-Text Articles in Physical Sciences and Mathematics

Suffix Trees For Document Retrieval, Ryan Reck, Jun 2012

Master's Theses

This thesis examines the suitability of suffix trees for full-text indexing and retrieval. Typically, suffix trees are built at the character level, where the tree records which characters follow each other character. When suffix trees are instead built on words rather than characters, the resulting tree effectively indexes every word or sequence of words that occurs in any of the documents. Ukkonen's algorithm is adapted to build word-level suffix trees, but the primary focus is on developing algorithms for searching the suffix tree for exact and approximate, or fuzzy, matches to arbitrary query strings. A …
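The word-level idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's method: it builds an uncompressed suffix trie by naive insertion (quadratic in document length) rather than adapting Ukkonen's linear-time construction, and it covers exact phrase lookup only. The names `Node`, `build_word_suffix_trie`, and `find_phrase` are hypothetical, and splitting on whitespace is an assumed tokenization.

```python
class Node:
    """One trie node; each edge out of it is labeled with a whole word."""

    def __init__(self):
        self.children = {}   # word -> Node
        self.hits = set()    # (doc_id, start_word_index) of suffixes passing through


def build_word_suffix_trie(docs):
    """Insert every word-level suffix of every document (naive, not Ukkonen)."""
    root = Node()
    for doc_id, text in docs.items():
        words = text.lower().split()  # assumed tokenization: lowercase, whitespace
        for start in range(len(words)):
            node = root
            for word in words[start:]:
                node = node.children.setdefault(word, Node())
                node.hits.add((doc_id, start))
    return root


def find_phrase(root, phrase):
    """Return sorted (doc_id, start_word_index) pairs where the exact phrase occurs."""
    node = root
    for word in phrase.lower().split():
        node = node.children.get(word)
        if node is None:
            return []
    return sorted(node.hits)


docs = {"a": "the quick brown fox", "b": "a quick brown dog"}
root = build_word_suffix_trie(docs)
matches = find_phrase(root, "quick brown")
```

Because every suffix is inserted word by word, any sequence of consecutive words from any document is a path from the root, which is what makes phrase lookup a simple walk. A compressed suffix tree would store the same information in far less space.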


Knowledge-Based Document Retrieval With Application To Texpros, Fang Sheng, May 2001

Dissertations

Document retrieval in an information system is most often accomplished through keyword search. The common technique behind keyword search is indexing. The major drawback of such a search technique is its lack of effectiveness and accuracy. A typical keyword search over the Internet commonly identifies hundreds or even thousands of records as potentially desired records, yet often only a few of them are relevant to users' interests.

This dissertation presents a knowledge-based document retrieval architecture with application to TEXPROS. The architecture is based on a dual document model that consists of a document type hierarchy and a …