Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 42

Full-Text Articles in Entire DC Network

Automated Identification And Mapping Of Interesting Mineral Spectra In CRISM Images, Arun M. Saranathan Mar 2024

Doctoral Dissertations

The Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) has proven to be an invaluable tool for the mineralogical analysis of the Martian surface. It has been crucial in identifying and mapping the spatial extents of various minerals. Primarily, the identification and mapping of these mineral spectral shapes have been performed manually. Given the size of the CRISM image dataset, manual analysis of the full dataset would be arduous, if not infeasible. This dissertation attempts to address this issue by describing an automated, machine-learning-based processing pipeline for CRISM data that can be used to identify and map the unique mineral signatures present in …


Data To Science With AI And Human-In-The-Loop, Gustavo Perez Sarabia Mar 2024

Doctoral Dissertations

AI has the potential to accelerate scientific discovery by enabling scientists to analyze vast datasets more efficiently than traditional methods. For example, this thesis considers the detection of star clusters in high-resolution images of galaxies taken from space telescopes, as well as studying bird migration from RADAR images. In these applications, the goal is to make measurements to answer scientific questions, such as how the star formation rate is affected by mass, or how the phenology of bird migration is influenced by climate change. However, current computer vision systems are far from perfect for conducting these measurements directly. They may …


Enabling Privacy And Trust In Edge AI Systems, Akanksha Atrey Mar 2024

Doctoral Dissertations

Recent advances in mobile computing and the Internet of Things (IoT) enable the global integration of heterogeneous smart devices via wireless networks. A common characteristic across these modern day systems is their ability to collect and communicate streaming data, making machine learning (ML) appealing for processing, reasoning, and predicting about the environment. More recently, low network latency requirements have made offloading intelligence to the cloud undesirable. These novel requirements have led to the emergence of edge computing, an approach that brings computation closer to the device with low latency, high throughput, and enhanced reliability. Together, they enable ML-powered information processing …


Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna Nov 2023

Doctoral Dissertations

Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to …


Quantifying And Enhancing The Security Of Federated Learning, Virat Vishnu Shejwalkar Nov 2023

Doctoral Dissertations

Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning …


Automating The Formal Verification Of Software, Emily First Aug 2023

Doctoral Dissertations

Formally verified correctness is one of the most desirable properties of software systems. Despite great progress made toward verification via interactive proof assistants, such as Coq and Isabelle/HOL, such verification remains one of the most effort-intensive (and often prohibitively difficult) software development activities. Recent work has created tools that automatically synthesize proofs either through reasoning using precomputed facts or using machine learning to model proofs and then perform biased search through the proof space. However, models in existing tools fail to capture the richness present in proofs, such as the information the programmer has access to when writing proofs and …


Predicting Water Quality Vulnerability Under Climate Change With Machine Learning, Khanh Thi Nhu Nguyen Oct 2022

Doctoral Dissertations

Water quality deterioration is a global and pervasive issue due to pollution caused by industrialization, urbanization, agriculturalization, and human population growth in the modern era. This issue is even more challenging in the context of climate change due to warming temperatures and the intensification of precipitation. Therefore, assessing the potential impacts of climate change on water quality is a concern. Assessment is necessary so that planners can prepare for and reduce the negative impacts on water quality. At present, climate change impact assessment frameworks are relatively adolescent. Most studies rely on climate projections from General Circulation Models for simulations of …


Data Parallel Frameworks For Training Machine Learning Models, Guoyi Zhao Jun 2022

Doctoral Dissertations

Machine learning is the study of computer algorithms that focuses on analyzing and interpreting patterns and structures in data. It has been successfully applied to many areas in computer science and achieved state-of-the-art results to enable learning, reasoning, and decision-making without human interaction. This research aims to develop innovative data parallel frameworks to accommodate the computing resources to parallelize different machine learning and deep learning algorithms and speed up the training. To achieve that, we explore three interesting frameworks in this dissertation: (1) Sync-on-the-fly framework for gradient descent algorithms on transient resources; (2) Asynchronous Proactive Data Parallel framework for both …


Models And Machine Learning Techniques For Improving The Planning And Operation Of Electricity Systems In Developing Regions, Santiago Correa Cardona Jun 2022

Doctoral Dissertations

The enormous innovation in computational intelligence has disrupted the traditional ways we solve the main problems of our society and allowed us to make more data-informed decisions. Energy systems and the ways we deliver electricity are not exceptions to this trend: cheap and pervasive sensing systems and new communication technologies have enabled the collection of large amounts of data that are being used to monitor and predict in real-time the behavior of this infrastructure. Bringing intelligence to the power grid creates many opportunities to integrate new renewable energy sources more efficiently, facilitate grid planning and expansion, improve reliability, optimize electricity …


Incremental Non-Greedy Clustering At Scale, Nicholas Monath Mar 2022

Doctoral Dissertations

Clustering is the task of organizing data into meaningful groups. Modern clustering applications such as entity resolution put several demands on clustering algorithms: (1) scalability to massive numbers of points as well as clusters, (2) incremental additions of data, (3) support for any user-specified similarity functions. Hierarchical clusterings are often desired as they represent multiple alternative flat clusterings (e.g., at different granularity levels). These tree-structured clusterings provide for both fine-grained clusters as well as uncertainty in the presence of newly arriving data. Previous work on hierarchical clustering does not fully address all three of the aforementioned desiderata. Work on incremental …


High-Dimensional Feature Selection And Multi-Level Causal Mediation Analysis With Applications To Human Aging And Cluster-Based Intervention Studies, Hachem Saddiki Oct 2021

Doctoral Dissertations

Many questions in public health and medicine are fundamentally causal in that our objective is to learn the effect of some exposure, randomized or not, on an outcome of interest. As a result, causal inference frameworks and methodologies have gained interest as a promising tool to reliably answer scientific questions. However, the tasks of identifying and efficiently estimating causal effects from observed data still pose significant challenges under complex data generating scenarios. We focus on (1) high-dimensional settings where the number of variables is orders of magnitude higher than the number of observations; and (2) multi-level settings, where study participants …


Care Work In Chile’s Segregated Cities, Manuel Garcia Oct 2021

Doctoral Dissertations

This project combines diverse theoretical and methodological tools to examine the relationship between space and care work in Chile. The chapters are stand-alone articles that come together to tell a single story. The social production of urban space has marginalized thousands of female caregivers from the labor market as Chile’s care system unravels. I argue that community caregiving could simultaneously improve the conditions of caregivers and dependents. Chapter 1 examines the role of residential segregation in reproducing Chile’s meager female labor market participation rates. I use spatial and econometric analysis to show that the social forces that segregate Santiago create …


3D Shape Understanding And Generation, Matheus Gadelha Oct 2021

Doctoral Dissertations

In recent years, machine learning techniques have revolutionized solutions to longstanding image-based problems, like image classification, generation, semantic segmentation, object detection and many others. However, if we want to be able to build agents that can successfully interact with the real world, those techniques need to be capable of reasoning about the world as it truly is: a three-dimensional space. There are two main challenges in handling 3D information in machine learning models. First, it is not clear what the best 3D representation is. For images, convolutional neural networks (CNNs) operating on raster images yield the best results in virtually …


Improving Evaluation Methods For Causal Modeling, Amanda Gentzel Jun 2021

Doctoral Dissertations

Causal modeling is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. Active communities of researchers in machine learning, statistics, social science, and other fields develop and enhance algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from the experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for expanding …


Utilizing Graph Structure For Machine Learning, Stefan Dernbach Apr 2021

Doctoral Dissertations

The information age has led to an explosion in the size and availability of data. This data often exhibits graph-structure that is either explicitly defined, as in the web of a social network, or is implicitly defined and can be determined by measuring similarity between objects. Utilizing this graph-structure allows for the design of machine learning algorithms that reflect not only the attributes of individual objects but their relationships to every other object in the domain as well. This thesis investigates three machine learning problems and proposes novel methods that leverage the graph-structure inherent in the tasks. Quantum walk neural …


Reasoning About User Feedback Under Identity Uncertainty In Knowledge Base Construction, Ariel Kobren Dec 2020

Doctoral Dissertations

Intelligent, automated systems that are intertwined with everyday life (such as Google Search and virtual assistants like Amazon’s Alexa or Apple’s Siri) are often powered in part by knowledge bases (KBs), i.e., structured data repositories of entities, their attributes, and the relationships among them. Despite a wealth of research focused on automated KB construction methods, KBs are inevitably imperfect, with errors stemming from various points in the construction pipeline. Making matters more challenging, new data is created daily and must be integrated with existing KBs so that they remain up-to-date. As the primary consumers of KBs, human users have tremendous potential to …


A Framework For Performance-Based Facade Design: Approach For Automated And Multi-Objective Simulation And Optimization, Mahsa Minaei Jul 2020

Doctoral Dissertations

Buildings have a considerable impact on the environment, and it is crucial to consider environmental and energy performance in building design. Buildings account for about 40% of the global energy consumption and contribute over 30% of the CO2 emissions. A large proportion of this energy is used for meeting occupants’ thermal comfort in buildings, followed by lighting. The building facade forms a barrier between the exterior and interior environments; therefore, it has a crucial role in improving energy efficiency and building performance. In this regard, decision-makers are required to establish an optimal solution, considering multi-objective problems that are usually competitive …


The Limits Of Location Privacy In Mobile Devices, Keen Yuun Sung Jul 2020

Doctoral Dissertations

Mobile phones are widely adopted by users across the world today. However, the privacy implications of persistent connectivity are not well understood. This dissertation focuses on one important concern of mobile phone users: location privacy. I approach this problem from the perspective of three adversaries that users are exposed to via smartphone apps: the mobile advertiser, the app developer, and the cellular service provider. First, I quantify the proportion of mobile users who use location-permissive apps and can be tracked through their advertising identifier, and demonstrate a mark and recapture attack that allows continued tracking of users …


Learning Latent Characteristics Of Data And Models Using Item Response Theory, John P. Lalor Mar 2020

Doctoral Dissertations

A supervised machine learning model is trained with a large set of labeled training data, and evaluated on a smaller but still large set of test data. Especially with deep neural networks (DNNs), the complexity of the model requires that an extremely large data set is collected to prevent overfitting. It is often the case that these models do not take into account specific attributes of the training set examples, but instead treat each equally in the process of model training. This is due to the fact that it is difficult to model latent traits of individual examples at the …


Noise-Aware Inference For Differential Privacy, Garrett Bernstein Mar 2020

Doctoral Dissertations

Domains involving sensitive human data, such as health care, human mobility, and online activity, are becoming increasingly dependent upon machine learning algorithms. This leads to scenarios in which data owners wish to protect the privacy of individuals comprising the sensitive data, while at the same time data modelers wish to analyze and draw conclusions from the data. Thus there is a growing demand to develop effective private inference methods that can marry the needs of both parties. For this we turn to differential privacy, which provides a framework for executing algorithms in a private fashion by injecting specifically-designed randomization at …


Neural Models For Information Retrieval Without Labeled Data, Hamed Zamani Oct 2019

Doctoral Dissertations

Recent developments of machine learning models, and in particular deep neural networks, have yielded significant improvements on several computer vision, natural language processing, and speech recognition tasks. Progress with information retrieval (IR) tasks has been slower, however, due to the lack of large-scale training data as well as neural network models specifically designed for effective information retrieval. In this dissertation, we address these two issues by introducing task-specific neural network architectures for a set of IR tasks and proposing novel unsupervised or weakly supervised solutions for training the models. The proposed learning solutions do not require labeled training data. Instead, …


Extracting And Representing Entities, Types, And Relations, Patrick Verga Oct 2019

Doctoral Dissertations

Making complex decisions in areas like science, government policy, finance, and clinical treatments all require integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another. Knowledge graphs (KGs) provide a natural approach for addressing this type of problem: they can serve as long-term stores of abstracted knowledge organized around concepts and their relationships, and can be populated from heterogeneous sources including databases and text. KGs can facilitate higher level reasoning, influence the interpretation of new data, and …


Essays On The Minimum Wage, Immigration, And Privatization, Doruk Cengiz Oct 2019

Doctoral Dissertations

This dissertation empirically examines the effects of the minimum wage, immigration, and privatization, three of the most crucial policies that impact workers worldwide, using recent advances in statistics and econometrics to provide causally interpretable results and to reconcile controversies in the literature. In the first chapter, titled “Seeing Beyond the Trees: Using machine learning to estimate the impact of minimum wages on affected individuals”, I identify minimum wage workers prior to estimating its effects using machine learning tools, and provide highly representative demographically-based groups that capture as much as 73.4% of all likely minimum wage workers. I find that there is …


From Optimization To Equilibration: Understanding An Emerging Paradigm In Artificial Intelligence And Machine Learning, Ian Gemp Jul 2019

Doctoral Dissertations

Many existing machine learning (ML) algorithms cannot be viewed as gradient descent on some single objective. The solution trajectories taken by these algorithms naturally exhibit rotation, sometimes forming cycles, a behavior that is not expected with (full-batch) gradient descent. However, these algorithms can be viewed more generally as solving for the equilibrium of a game with possibly multiple competing objectives. Moreover, some recent ML models, specifically generative adversarial networks (GANs) and their variants, are now explicitly formulated as equilibrium problems. Equilibrium problems present challenges beyond those encountered in optimization such as limit-cycles and chaotic attractors and are able to abstract …


Learning With Aggregate Data, Tao Sun Mar 2019

Doctoral Dissertations

Various real-world applications involve directly dealing with aggregate data. In this work, we study Learning with Aggregate Data from several perspectives and try to address their combinatorial challenges. First, we study the problem of learning in Collective Graphical Models (CGMs), where only noisy aggregate observations are available. Inference in CGMs is NP-hard, and we propose an approximate inference algorithm. By solving the inference problems, we are empowered to build large-scale bird migration models, and models for human mobility under the differential privacy setting. Secondly, we consider problems given bags of instances and bag-level aggregate supervision. Specifically, we study …


Machine Learning Methods For Activity Detection In Wearable Sensor Data Streams, Roy Adams Oct 2018

Doctoral Dissertations

Wearable wireless sensors have the potential for transformative impact on the fields of health and behavioral science. Recent advances in wearable sensor technology have made it possible to simultaneously collect multiple streams of physiological and context data from individuals in natural environments; however, extracting reliable high-level inferences from these raw data streams remains a key data analysis challenge. In this dissertation, we address three challenges that arise when trying to perform activity detection from wearable sensor streams. First, we address the challenge of learning from small amounts of noisy data by proposing a class of conditional random field models for …


Transfer Learning With Mixtures Of Manifolds, Thomas Boucher Jul 2018

Doctoral Dissertations

Advances in scientific instrumentation technology have increased the speed of data acquisition and the precision of sampling, creating an abundance of high-dimensional data sets. The ability to combine these disparate data sets and to transfer information between them is critical to accurate scientific analysis. Many modern-day instruments can record data at many thousands of channels, far greater than the actual degrees of freedom in the sample data. This makes manifold learning, a class of methods that exploit the observation that high-dimensional data tend to lie on lower-dimensional manifolds, especially well-suited to this transfer learning task. Existing manifold-based transfer learning methods …


Using Latent Variable Models To Improve Causal Estimation, Huseyin Oktay Mar 2018

Doctoral Dissertations

Estimating the causal effect of a treatment from data has been a key goal for a large number of studies in many domains. Traditionally, researchers use carefully designed randomized experiments for causal inference. However, such experiments can be not only costly in terms of time and money but also infeasible for some causal questions. To overcome these challenges, researchers from diverse disciplines have developed causal estimation methods for observational data, and studies using such methods increasingly account for a large share of empirical work. Such growing interest has also brought together two arguably separate fields: machine learning and …


Deep-Learned Generative Representations Of 3D Shape Families, Haibin Huang Nov 2017

Doctoral Dissertations

Digital representations of 3D shapes are becoming increasingly useful in several emerging applications, such as 3D printing, virtual reality and augmented reality. However, traditional modeling software requires users to have extensive modeling experience, artistic skills and training to handle its complex interfaces and perform the necessary low-level geometric manipulation commands. Thus, there is an emerging need for computer algorithms that help novice and casual users to quickly and easily generate 3D content. In this work, I will present deep learning algorithms that are capable of automatically inferring parametric representations of shape families, which can be used to generate new 3D …


Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discriminative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design prediction and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …