Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 30 of 42

Full-Text Articles in Entire DC Network

Automated Identification And Mapping Of Interesting Mineral Spectra In Crism Images, Arun M. Saranathan Mar 2024

Automated Identification And Mapping Of Interesting Mineral Spectra In Crism Images, Arun M. Saranathan

Doctoral Dissertations

The Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) has proven to be an invaluable tool for the mineralogical analysis of the Martian surface. It has been crucial in identifying and mapping the spatial extents of various minerals. Primarily, the identification and mapping of these mineral spectral-shapes have been performed manually. Given the size of the CRISM image dataset, manual analysis of the full dataset would be arduous/infeasible. This dissertation attempts to address this issue by describing an (machine learning based) automated processing pipeline for CRISM data that can be used to identify and map the unique mineral signatures present in …


Data To Science With Ai And Human-In-The-Loop, Gustavo Perez Sarabia Mar 2024

Data To Science With Ai And Human-In-The-Loop, Gustavo Perez Sarabia

Doctoral Dissertations

AI has the potential to accelerate scientific discovery by enabling scientists to analyze vast datasets more efficiently than traditional methods. For example, this thesis considers the detection of star clusters in high-resolution images of galaxies taken from space telescopes, as well as studying bird migration from RADAR images. In these applications, the goal is to make measurements to answer scientific questions, such as how the star formation rate is affected by mass, or how the phenology of bird migration is influenced by climate change. However, current computer vision systems are far from perfect for conducting these measurements directly. They may …


Exact Models, Heuristics, And Supervised Learning Approaches For Vehicle Routing Problems, Zefeng Lyu Dec 2023

Exact Models, Heuristics, And Supervised Learning Approaches For Vehicle Routing Problems, Zefeng Lyu

Doctoral Dissertations

This dissertation presents contributions to the field of vehicle routing problems by utilizing exact methods, heuristic approaches, and the integration of machine learning with traditional algorithms. The research is organized into three main chapters, each dedicated to a specific routing problem and a unique methodology. The first chapter addresses the Pickup and Delivery Problem with Transshipments and Time Windows, a variant that permits product transfers between vehicles to enhance logistics flexibility and reduce costs. To solve this problem, we propose an efficient mixed-integer linear programming model that has been shown to outperform existing ones. The second chapter discusses a practical …


Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna Nov 2023

Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna

Doctoral Dissertations

Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to …


Quantifying And Enhancing The Security Of Federated Learning, Virat Vishnu Shejwalkar Nov 2023

Quantifying And Enhancing The Security Of Federated Learning, Virat Vishnu Shejwalkar

Doctoral Dissertations

Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning …


Incremental Non-Greedy Clustering At Scale, Nicholas Monath Mar 2022

Incremental Non-Greedy Clustering At Scale, Nicholas Monath

Doctoral Dissertations

Clustering is the task of organizing data into meaningful groups. Modern clustering applications such as entity resolution put several demands on clustering algorithms: (1) scalability to massive numbers of points as well as clusters, (2) incremental additions of data, (3) support for any user-specified similarity functions. Hierarchical clusterings are often desired as they represent multiple alternative flat clusterings (e.g., at different granularity levels). These tree-structured clusterings provide for both fine-grained clusters as well as uncertainty in the presence of newly arriving data. Previous work on hierarchical clustering does not fully address all three of the aforementioned desiderata. Work on incremental …


3d Shape Understanding And Generation, Matheus Gadelha Oct 2021

3d Shape Understanding And Generation, Matheus Gadelha

Doctoral Dissertations

In recent years, Machine Learning techniques have revolutionized solutions to longstanding image-based problems, like image classification, generation, semantic segmentation, object detection and many others. However, if we want to be able to build agents that can successfully interact with the real world, those techniques need to be capable of reasoning about the world as it truly is: a tridimensional space. There are two main challenges while handling 3D information in machine learning models. First, it is not clear what is the best 3D representation. For images, convolutional neural networks (CNNs) operating on raster images yield the best results in virtually …


Improving Evaluation Methods For Causal Modeling, Amanda Gentzel Jun 2021

Improving Evaluation Methods For Causal Modeling, Amanda Gentzel

Doctoral Dissertations

Causal modeling is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. Active communities of researchers in machine learning, statistics, social science, and other fields develop and enhance algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from the experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for expanding …


Utilizing Graph Structure For Machine Learning, Stefan Dernbach Apr 2021

Utilizing Graph Structure For Machine Learning, Stefan Dernbach

Doctoral Dissertations

The information age has led to an explosion in the size and availability of data. This data often exhibits graph-structure that is either explicitly defined, as in the web of a social network, or is implicitly defined and can be determined by measuring similarity between objects. Utilizing this graph-structure allows for the design of machine learning algorithms that reflect not only the attributes of individual objects but their relationships to every other object in the domain as well. This thesis investigates three machine learning problems and proposes novel methods that leverage the graph-structure inherent in the tasks. Quantum walk neural …


Reasoning About User Feedback Under Identity Uncertainty In Knowledge Base Construction, Ariel Kobren Dec 2020

Reasoning About User Feedback Under Identity Uncertainty In Knowledge Base Construction, Ariel Kobren

Doctoral Dissertations

Intelligent, automated systems that are intertwined with everyday life---such as Google Search and virtual assistants like Amazon’s Alexa or Apple’s Siri---are often powered in part by knowledge bases (KBs), i.e., structured data repositories of entities, their attributes, and the relationships among them. Despite a wealth of research focused on automated KB construction methods, KBs are inevitably imperfect, with errors stemming from various points in the construction pipeline. Making matters more challenging, new data is created daily and must be integrated with existing KBs so that they remain up-to-date. As the primary consumers of KBs, human users have tremendous potential to …


Bayesian Topological Machine Learning, Christopher A. Oballe Aug 2020

Bayesian Topological Machine Learning, Christopher A. Oballe

Doctoral Dissertations

Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …


A Framework For Performance-Based Facade Design: Approach For Automated And Multi-Objective Simulation And Optimization, Mahsa Minaei Jul 2020

A Framework For Performance-Based Facade Design: Approach For Automated And Multi-Objective Simulation And Optimization, Mahsa Minaei

Doctoral Dissertations

Buildings have a considerable impact on the environment, and it is crucial to consider environmental and energy performance in building design. Buildings account for about 40% of the global energy consumption and contribute over 30% of the CO2 emissions. A large proportion of this energy is used for meeting occupants’ thermal comfort in buildings, followed by lighting. The building facade forms a barrier between the exterior and interior environments; therefore, it has a crucial role in improving energy efficiency and building performance. In this regard, decision-makers are required to establish an optimal solution, considering multi-objective problems that are usually competitive …


The Limits Of Location Privacy In Mobile Devices, Keen Yuun Sung Jul 2020

The Limits Of Location Privacy In Mobile Devices, Keen Yuun Sung

Doctoral Dissertations

Mobile phones are widely adopted by users across the world today. However, the privacy implications of persistent connectivity are not well understood. This dissertation focuses on one important concern of mobile phone users: location privacy. I approach this problem from the perspective of three adversaries that users are exposed to via smartphone apps: the mobile advertiser, the app developer, and the cellular service provider. First, I quantify the proportion of mobile users who use location permissive apps and are able to be tracked through their advertising identifier, and demonstrate a mark and recapture attack that allows continued tracking of users …


Learning Latent Characteristics Of Data And Models Using Item Response Theory, John P. Lalor Mar 2020

Learning Latent Characteristics Of Data And Models Using Item Response Theory, John P. Lalor

Doctoral Dissertations

A supervised machine learning model is trained with a large set of labeled training data, and evaluated on a smaller but still large set of test data. Especially with deep neural networks (DNNs), the complexity of the model requires that an extremely large data set is collected to prevent overfitting. It is often the case that these models do not take into account specific attributes of the training set examples, but instead treat each equally in the process of model training. This is due to the fact that it is difficult to model latent traits of individual examples at the …


Neural Models For Information Retrieval Without Labeled Data, Hamed Zamani Oct 2019

Neural Models For Information Retrieval Without Labeled Data, Hamed Zamani

Doctoral Dissertations

Recent developments of machine learning models, and in particular deep neural networks, have yielded significant improvements on several computer vision, natural language processing, and speech recognition tasks. Progress with information retrieval (IR) tasks has been slower, however, due to the lack of large-scale training data as well as neural network models specifically designed for effective information retrieval. In this dissertation, we address these two issues by introducing task-specific neural network architectures for a set of IR tasks and proposing novel unsupervised or \emph{weakly supervised} solutions for training the models. The proposed learning solutions do not require labeled training data. Instead, …


Extracting And Representing Entities, Types, And Relations, Patrick Verga Oct 2019

Extracting And Representing Entities, Types, And Relations, Patrick Verga

Doctoral Dissertations

Making complex decisions in areas like science, government policy, finance, and clinical treatments all require integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another. Knowledge graphs (KGs) provide a natural approach for addressing this type of problem: they can serve as long-term stores of abstracted knowledge organized around concepts and their relationships, and can be populated from heterogeneous sources including databases and text. KGs can facilitate higher level reasoning, influence the interpretation of new data, and …


Adaptive Feature Engineering Modeling For Ultrasound Image Classification For Decision Support, Hatwib Mugasa Oct 2019

Adaptive Feature Engineering Modeling For Ultrasound Image Classification For Decision Support, Hatwib Mugasa

Doctoral Dissertations

Ultrasonography is considered a relatively safe option for the diagnosis of benign and malignant cancer lesions due to the low-energy sound waves used. However, the visual interpretation of the ultrasound images is time-consuming and usually has high false alerts due to speckle noise. Improved methods of collection image-based data have been proposed to reduce noise in the images; however, this has proved not to solve the problem due to the complex nature of images and the exponential growth of biomedical datasets. Secondly, the target class in real-world biomedical datasets, that is the focus of interest of a biopsy, is usually …


From Optimization To Equilibration: Understanding An Emerging Paradigm In Artificial Intelligence And Machine Learning, Ian Gemp Jul 2019

From Optimization To Equilibration: Understanding An Emerging Paradigm In Artificial Intelligence And Machine Learning, Ian Gemp

Doctoral Dissertations

Many existing machine learning (ML) algorithms cannot be viewed as gradient descent on some single objective. The solution trajectories taken by these algorithms naturally exhibit rotation, sometimes forming cycles, a behavior that is not expected with (full-batch) gradient descent. However, these algorithms can be viewed more generally as solving for the equilibrium of a game with possibly multiple competing objectives. Moreover, some recent ML models, specifically generative adversarial networks (GANs) and its variants, are now explicitly formulated as equilibrium problems. Equilibrium problems present challenges beyond those encountered in optimization such as limit-cycles and chaotic attractors and are able to abstract …


Learning With Aggregate Data, Tao Sun Mar 2019

Learning With Aggregate Data, Tao Sun

Doctoral Dissertations

Various real-world applications involve directly dealing with aggregate data. In this work, we study Learning with Aggregate Data from several perspectives and try to address their combinatorial challenges. At first, we study the problem of learning in Collective Graphical Models (CGMs), where only noisy aggregate observations are available. Inference in CGMs is NP- hard and we proposed an approximate inference algorithm. By solving the inference problems, we are empowered to build large-scale bird migration models, and models for human mobility under the differential privacy setting. Secondly, we consider problems given bags of instances and bag-level aggregate supervisions. Specifically, we study …


Applications Of Machine Learning In Nuclear Imaging And Radiation Detection, Shaikat Mahmood Galib Jan 2019

Applications Of Machine Learning In Nuclear Imaging And Radiation Detection, Shaikat Mahmood Galib

Doctoral Dissertations

"The main focus of this work is to use machine learning and data mining techniques to address some challenging problems that arise from nuclear data. Specifically, two problem areas are discussed: nuclear imaging and radiation detection. The techniques to approach these problems are primarily based on a variant of Artificial Neural Network (ANN) called Convolutional Neural Network (CNN), which is one of the most popular forms of 'deep learning' technique.

The first problem is about interpreting and analyzing 3D medical radiation images automatically. A method is developed to identify and quantify deformable image registration (DIR) errors from lung CT scans …


Machine Learning Methods For Activity Detection In Wearable Sensor Data Streams, Roy Adams Oct 2018

Machine Learning Methods For Activity Detection In Wearable Sensor Data Streams, Roy Adams

Doctoral Dissertations

Wearable wireless sensors have the potential for transformative impact on the fields of health and behavioral science. Recent advances in wearable sensor technology have made it possible to simultaneously collect multiple streams of physiological and context data from individuals in natural environments; however, extracting reliable high-level inferences from these raw data streams remains a key data analysis challenge. In this dissertation, we address three challenges that arise when trying to perform activity detection from wearable sensor streams. First, we address the challenge of learning from small amounts of noisy data by proposing a class of conditional random field models for …


Transfer Learning With Mixtures Of Manifolds, Thomas Boucher Jul 2018

Transfer Learning With Mixtures Of Manifolds, Thomas Boucher

Doctoral Dissertations

Advances in scientific instrumentation technology have increased the speed of data acquisition and the precision of sampling, creating an abundance of high-dimensional data sets. The ability to combine these disparate data sets and to transfer information between them is critical to accurate scientific analysis. Many modern-day instruments can record data at many thousands of channels, far greater than the actual degrees of freedom in the sample data. This makes manifold learning, a class of methods that exploit the observation that high-dimensional data tend to lie on lower-dimensional manifolds, especially well-suited to this transfer learning task. Existing manifold-based transfer learning methods …


Using Latent Variable Models To Improve Causal Estimation, Huseyin Oktay Mar 2018

Using Latent Variable Models To Improve Causal Estimation, Huseyin Oktay

Doctoral Dissertations

Estimating the causal effect of a treatment from data has been a key goal for a large number of studies in many domains. Traditionally, researchers use carefully designed randomized experiments for causal inference. However, such experiments can not only be costly in terms of time and money but also infeasible for some causal questions. To overcome these challenges, causal estimation methods from observational data have been developed by researchers from diverse disciplines and increasingly studies using such methods account for a large share in empirical work. Such growing interest has also brought together two arguably separate fields: machine learning and …


Deep-Learned Generative Representations Of 3d Shape Families, Haibin Huang Nov 2017

Deep-Learned Generative Representations Of 3d Shape Families, Haibin Huang

Doctoral Dissertations

Digital representations of 3D shapes are becoming increasingly useful in several emerging applications, such as 3D printing, virtual reality and augmented reality. However, traditional modeling softwares require users to have extensive modeling experience, artistic skills and training to handle their complex interfaces and perform the necessary low-level geometric manipulation commands. Thus, there is an emerging need for computer algorithms that help novice and casual users to quickly and easily generate 3D content. In this work, I will present deep learning algorithms that are capable of automatically inferring parametric representations of shape families, which can be used to generate new 3D …


Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Deep Energy-Based Models For Structured Prediction, David Belanger

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible frame- work for structured prediction. A deep architecture is used to define an energy func- tion over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discrim- inative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design predic- tion and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …


Problems In Graph-Structured Modeling And Learning, James Atwood Jul 2017

Problems In Graph-Structured Modeling And Learning, James Atwood

Doctoral Dissertations

This thesis investigates three problems in graph-structured modeling and learning. We first present a method for efficiently generating large instances from nonlinear preferential attachment models of network structure. This is followed by a description of diffusion-convolutional neural networks, a new model for graph-structured data which is able to outperform probabilistic relational models and kernel-on-graph methods at node classification tasks. We conclude with an optimal privacy-protection method for users of online services that remains effective when users have poor knowledge of an adversary's behavior.


Learning From Pairwise Proximity Data, Hamid Dadkhahi Nov 2016

Learning From Pairwise Proximity Data, Hamid Dadkhahi

Doctoral Dissertations

In many areas of machine learning, the characterization of the input data is given by a form of proximity measure between data points. Examples of such representations are pairwise differences, pairwise distances, and pairwise comparisons. In this work, we investigate different learning problems on data represented in terms of such pairwise proximities. More specifically, we consider three problems: masking (feature selection) for dimensionality reduction, extension of the dimensionality reduction for time series, and online collaborative filtering. For each of these problems, we start with a form of pairwise proximity which is relevant in the problem at hand. We evaluate the …


Algorithms For First-Order Sparse Reinforcement Learning, Bo Liu Mar 2016

Algorithms For First-Order Sparse Reinforcement Learning, Bo Liu

Doctoral Dissertations

This thesis presents a general framework for first-order temporal difference learning algorithms with an in-depth theoretical analysis. The main contribution of the thesis is the development and design of a family of first-order regularized temporal-difference (TD) algorithms using stochastic approximation and stochastic optimization. To scale up TD algorithms to large-scale problems, we use first-order optimization to explore regularized TD methods using linear value function approximation. Previous regularized TD methods often use matrix inversion, which requires cubic time and quadratic memory complexity. We propose two algorithms, sparse-Q and RO-TD, for on-policy and off-policy learning, respectively. These two algorithms exhibit linear computational …


Neuroscience-Inspired Dynamic Architectures, Catherine Dorothy Schuman May 2015

Neuroscience-Inspired Dynamic Architectures, Catherine Dorothy Schuman

Doctoral Dissertations

Biological brains are some of the most powerful computational devices on Earth. Computer scientists have long drawn inspiration from neuroscience to produce computational tools. This work introduces neuroscience-inspired dynamic architectures (NIDA), spiking neural networks embedded in a geometric space that exhibit dynamic behavior. A neuromorphic hardware implementation based on NIDA networks, Dynamic Adaptive Neural Network Array (DANNA), is discussed. Neuromorphic implementations are one alternative/complement to traditional von Neumann computation. A method for designing/training NIDA networks, based on evolutionary optimization, is introduced. We demonstrate the utility of NIDA networks on classification tasks, a control task, and an anomaly detection task. There …


Epistemological Databases For Probabilistic Knowledge Base Construction, Michael Louis Wick Mar 2015

Epistemological Databases For Probabilistic Knowledge Base Construction, Michael Louis Wick

Doctoral Dissertations

Knowledge bases (KB) facilitate real world decision making by providing access to structured relational information that enables pattern discovery and semantic queries. Although there is a large amount of data available for populating a KB; the data must first be gathered and assembled. Traditionally, this integration is performed automatically by storing the output of an information extraction pipeline directly into a database as if this prediction were the ``truth.'' However, the resulting KB is often not reliable because (a) errors accumulate in the integration pipeline, and (b) they persist in the KB even after new information arrives that could rectify …