Physical Sciences and Mathematics | Open Access Articles

Retrieval Models Based On Linguistic Features Of Verbose Queries, Jae Hyun Park Nov 2014

Doctoral Dissertations

Natural language expressions are more familiar to users than choosing keywords for queries. Given that, people can use natural language expressions to represent their sophisticated information needs. Instead of listing keywords, verbose queries are expressed in a grammatically well-formed phrase or sentence in which terms are used together to represent the more specific meanings of a concept, and the relationships of these concepts are expressed by function words. The goal of this thesis is to investigate methods of using the semantic and syntactic features of natural language queries to maximize the effectiveness of search. For this purpose, we propose the …

Entity-Based Enrichment For Information Extraction And Retrieval, Jeffrey Dalton Aug 2014

Doctoral Dissertations

The goal of this work is to leverage cross-document entity relationships for improved understanding of queries and documents. We define an entity to be a thing or concept that exists in the world, such as a politician, a battle, a film, or a color. Entity-based enrichment (EBE) is a new expansion model for both queries and documents using features from similar entity mentions in the document collection and external knowledge resources. It uses task-specific features from entities beyond words that include: name aliases, fine-grained entity types, categories, and relationships to other entities. EBE addresses the problem of sparse or noisy local …

Indexing Proximity-Based Dependencies For Information Retrieval, Samuel Huston Apr 2014

Doctoral Dissertations

Research into term dependencies for information retrieval has demonstrated that dependency retrieval models are able to consistently improve retrieval effectiveness over bag-of-words models. However, the computation of term dependency statistics is a major efficiency bottleneck in the execution of these retrieval models. This thesis investigates the problem of improving the efficiency of dependency retrieval models without compromising the effectiveness benefits of the term dependency features. Despite the large number of published comparisons between dependency models and bag-of-words approaches, there has been a lack of direct comparisons between alternate dependency models. We provide this comparison and investigate different types of proximity …

Retrieval And Evaluation Techniques For Personal Information, Jinyoung Kim Sep 2012

Open Access Dissertations

Providing an effective mechanism for personal information retrieval is important for many applications, and requires different techniques than have been developed for general web search. This thesis focuses on developing retrieval models and representations for personal search, and on designing evaluation frameworks that can be used to demonstrate retrieval effectiveness in a personal environment.

From the retrieval model perspective, personal information can be viewed as a collection of multiple document types each of which has unique metadata. Based on this perspective, we propose a retrieval model that exploits document metadata and multi-type structure. Proposed retrieval models were found to be …

Query-Dependent Selection Of Retrieval Alternatives, Niranjan Balasubramanian Sep 2011

Open Access Dissertations

The main goal of this thesis is to investigate query-dependent selection of retrieval alternatives for Information Retrieval (IR) systems. Retrieval alternatives include choices in representing queries (query representations), and choices in methods used for scoring documents. For example, an IR system can represent a user query without any modification, automatically expand it to include more terms, or reduce it by dropping some terms. The main motivation for this work is that no single query representation or retrieval model performs the best for all queries. This suggests that selecting the best representation or retrieval model for each query can yield improved …

Discovering And Using Implicit Data For Information Retrieval, Xing Yi Sep 2011

Open Access Dissertations

In real-world information retrieval (IR) tasks, the searched items and/or the users' queries often have implicit information associated with them -- information that describes unspecified aspects of the items or queries. For example, in web search tasks, web pages are often pointed to by hyperlinks (known as anchors) from other pages, and thus have human-generated succinct descriptions of their content (anchor text) associated with them. This indirectly available information has been shown to improve search effectiveness for different retrieval tasks. However, in many real-world IR challenges this information is sparse in the data; i.e., it is incomplete or missing in …

A Framework To Predict The Quality Of Answers With Nontextual, University Of Massachusetts Amherst Aug 2006

Computer Science Department Faculty Publication Series

New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this paper, we present a framework to use non-textual features to predict the quality of documents. We also show our quality measure can be successfully incorporated into the language modeling-based retrieval model. We test our approach on a collection of question and answer pairs gathered from a community-based question answering service where people ask and answer questions. Experimental results using our quality measure show a significant improvement over our baseline.

Finding Similar Questions In Large Question And Answer Archives, Jiwoon Jeon

Computer Science Department Faculty Publication Series

There has recently been a significant increase in the number of community-based question and answer services on the Web where people answer other people’s questions. These services rapidly build up large archives of questions and answers, and these archives are a valuable linguistic resource. One of the major tasks in a question and answer service is to find questions in the archive that are semantically similar to a user’s question. This enables high-quality answers from the archive to be retrieved and removes the time lag associated with a community-based system. In this paper, we discuss methods for question retrieval …
