Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Brigham Young University

Information retrieval

Articles 1 - 3 of 3

Full-Text Articles in Physical Sciences and Mathematics

Spamed: A Spam Email Detection Approach Based On Phrase Similarity, Yiu-Kai D. Ng, Maria Soledad Pera Feb 2009

Faculty Publications

Email is unquestionably one of the most popular communication media these days. Not only is it fast and reliable, but it is also generally free. Unfortunately, a significant number of the emails that users receive each day are spam. This is annoying, since users waste time reviewing and deleting spam emails. In addition, spam emails consume resources such as storage, bandwidth, and computer processing time. Many attempts have been made to eradicate spam emails; however, none has proved highly effective. In this paper, we propose a spam-email detection …


Synthesizing Correlated RSS News Articles Based On A Fuzzy Equivalence Relation, Yiu-Kai D. Ng, Maria Soledad Pera Jan 2009

Faculty Publications

Tens of thousands of news articles are posted online each day, covering topics from politics to science to current events. To better cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must filter through many articles within the same or different RSS feeds to locate articles pertaining to their particular interests. Because individual RSS feeds contain so many news articles, the articles need further organization to help users quickly locate non-redundant, informative, and related articles of interest. In this paper, we …


Using Vagueness Measures To Re-Rank Documents Retrieved By A Fuzzy Set Information Retrieval Model, Stephen Lynn, Yiu-Kai D. Ng Oct 2008

Faculty Publications

Traditional information retrieval (IR) systems evaluate user queries and retrieve and rank documents by matching keywords in user queries with words in documents. These exact word-matching and ranking approaches miss many relevant documents that do not contain the exact keywords specified in a user query. Instead of adopting these traditional approaches, we propose to retrieve documents using a fuzzy set IR model and to rank the retrieved documents for any vague query using the “vagueness score” of the documents, based on the word senses defined in WordNet. Using the vagueness scores, we rank the most “relevant” documents of a …