Full-Text Articles in Artificial Intelligence and Robotics

Learning With Aggregate Data, Tao Sun Mar 2019

Doctoral Dissertations

Various real-world applications involve dealing directly with aggregate data. In this work, we study Learning with Aggregate Data from several perspectives and try to address the combinatorial challenges it poses. First, we study the problem of learning in Collective Graphical Models (CGMs), where only noisy aggregate observations are available. Inference in CGMs is NP-hard, and we propose an approximate inference algorithm. Solving the inference problem enables us to build large-scale bird migration models and models of human mobility under the differential privacy setting. Second, we consider problems given bags of instances and bag-level aggregate supervision. Specifically, we study …


Epistemological Databases For Probabilistic Knowledge Base Construction, Michael Louis Wick Mar 2015

Doctoral Dissertations

Knowledge bases (KBs) facilitate real-world decision making by providing access to structured relational information that enables pattern discovery and semantic queries. Although a large amount of data is available for populating a KB, the data must first be gathered and assembled. Traditionally, this integration is performed automatically by storing the output of an information extraction pipeline directly into a database as if this prediction were the "truth." However, the resulting KB is often not reliable because (a) errors accumulate in the integration pipeline, and (b) they persist in the KB even after new information arrives that could rectify …


Learning With Joint Inference And Latent Linguistic Structure In Graphical Models, Jason Narad Mar 2015

Doctoral Dissertations

Constructing end-to-end NLP systems requires the processing of many types of linguistic information prior to solving the desired end task. A common approach to this problem is to construct a pipeline, one component for each task, with each system's output becoming input for the next. This approach poses two problems. First, errors propagate, and, much like the childhood game of "telephone", combining systems in this manner can lead to unintelligible outcomes. Second, each component task requires annotated training data to act as supervision for training the model. These annotations are often expensive and time-consuming to produce, may differ from each …


Causal Discovery For Relational Domains: Representation, Reasoning, And Learning, Marc Maier Nov 2014

Doctoral Dissertations

Many domains are currently experiencing a growing trend toward recording and analyzing massive observational data sets of increasing complexity. A commonly made claim is that these data sets hold the potential to transform their corresponding domains by providing previously unknown or unexpected explanations and enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks. Most methods for traditional causal discovery (the development of algorithms that learn causal structure from observational data) are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively …


Scaling MCMC Inference And Belief Propagation To Large, Dense Graphical Models, Sameer Singh Aug 2014

Doctoral Dissertations

With the physical constraints of semiconductor-based electronics becoming increasingly limiting in the past decade, single-core CPUs have given way to multi-core and distributed computing platforms. At the same time, access to large data collections is becoming commonplace due to the falling cost of storage and bandwidth. Traditional machine learning paradigms that were designed to operate sequentially on single-processor architectures seem destined to become obsolete in this world of multi-core, multi-node systems and massive data sets. Inference for graphical models is one such example, for which most existing algorithms are sequential in nature and are difficult to scale …