Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

University of Massachusetts Amherst

Evaluation

Full-Text Articles in Physical Sciences and Mathematics

Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna Nov 2023

Doctoral Dissertations

Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to …


Probabilistic Commonsense Knowledge, Xiang Li Oct 2022

Doctoral Dissertations

Commonsense knowledge is critical to achieving artificial general intelligence. This shared background knowledge is implicit in all human communication, facilitating efficient information exchange and understanding. But commonsense research is hampered by the immense quantity of such knowledge, which makes an explicit categorization impossible. Furthermore, a plumber could repair a sink in a kitchen or a bathroom, indicating that common sense reveals a probable assumption rather than a definitive answer. To align fundamentally with these properties of common sense, we want not only to model but also to evaluate such knowledge in a human-like way, using abstractions and probabilistic principles. Traditional combinatorial probabilistic models, e.g., probabilistic …


Improving Evaluation Methods For Causal Modeling, Amanda Gentzel Jun 2021

Doctoral Dissertations

Causal modeling is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. Active communities of researchers in machine learning, statistics, social science, and other fields develop and enhance algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from the experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for expanding …


Exploiting Social Media Sources For Search, Fusion And Evaluation, Chia-Jung Lee Nov 2015

Doctoral Dissertations

The web contains heterogeneous information that is generated with different characteristics and is presented via different media. Social media, as one of the largest content carriers, has generated information from millions of users worldwide, rapidly creating material in forms of all types, such as comments, images, tags, videos, and ratings. In social applications, the formation of online communities contributes to conversations of substantially broader aspects, as well as unfiltered opinions about subjects that are rarely covered in public media. Information accrued on social platforms, therefore, presents a unique opportunity to augment web sources such as Wikipedia or news pages, …


Adaptive Step-Sizes For Reinforcement Learning, William C. Dabney Nov 2014

Doctoral Dissertations

The central theme motivating this dissertation is the desire to develop reinforcement learning algorithms that “just work” regardless of the domain in which they are applied. The largest impediment to this goal is the sensitivity of reinforcement learning algorithms to the step-size parameter used to rescale incremental updates. Adaptive step-size algorithms attempt to reduce this sensitivity or eliminate the step-size parameter entirely by automatically adjusting the step size throughout the learning process. Such algorithms provide an alternative to the standard “guess-and-check” methods used to find parameters, known as parameter tuning. However, the problems with parameter tuning are currently masked by …


Indexing Proximity-Based Dependencies For Information Retrieval, Samuel Huston Apr 2014

Doctoral Dissertations

Research into term dependencies for information retrieval has demonstrated that dependency retrieval models are able to consistently improve retrieval effectiveness over bag-of-words models. However, the computation of term dependency statistics is a major efficiency bottleneck in the execution of these retrieval models. This thesis investigates the problem of improving the efficiency of dependency retrieval models without compromising the effectiveness benefits of the term dependency features. Despite the large number of published comparisons between dependency models and bag-of-words approaches, there has been a lack of direct comparisons between alternate dependency models. We provide this comparison and investigate different types of proximity …


Incremental Test Collections, Ben Carterette, James Allan Jan 2005

Computer Science Department Faculty Publication Series

Corpora and topics are readily available for information retrieval research. Relevance judgments, which are necessary for system evaluation, are expensive; the cost of obtaining them prohibits in-house evaluation of retrieval systems on new corpora or new topics. We present an algorithm for cheaply constructing sets of relevance judgments. Our method intelligently selects documents to be judged and decides when to stop in such a way that with very little work there can be a high degree of confidence in the result of the evaluation. We demonstrate the algorithm's effectiveness by showing that it produces small sets of relevance judgments that …


Strategy-Based Interactive Cluster Visualization For Information Retrieval, Anton Leuski, James Allan Jan 1999

Computer Science Department Faculty Publication Series

In this paper we investigate a general-purpose interactive information organization system. The system organizes documents by placing them into 1-, 2-, or 3-dimensional space based on their similarity and a spring-embedding algorithm. We begin by developing a method for estimating the quality of the organization when it is applied to a set of documents returned in response to a query. We show how the relevant documents tend to clump together in space. We proceed by presenting a method for measuring the amount of structure in the organization and explain how this knowledge can be used to refine the system. …