Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 16 of 16

Full-Text Articles in Entire DC Network

Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna Nov 2023

Doctoral Dissertations

Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to …


Human-Centered Technologies For Inclusive Collection And Analysis Of Public-Generated Data, Mahmood Jasim Nov 2023

Doctoral Dissertations

The meteoric rise in the popularity of public engagement platforms such as social media, customer review websites, and public input solicitation efforts strives for establishing an inclusive environment for the public to share their thoughts, ideas, opinions, and experiences. Many decisions made at a personal, local, or national scale are often fueled by data generated by the public. As such, inclusive collection, analysis, sensemaking, and utilization of public-generated data are crucial to support the exercise of successful decision-making processes. However, people often struggle to engage, participate, and share their opinions due to inaccessibility, the rigidity of traditional public engagement methods, …


Quantifying And Enhancing The Security Of Federated Learning, Virat Vishnu Shejwalkar Nov 2023

Doctoral Dissertations

Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning …


Learning To See With Minimal Human Supervision, Zezhou Cheng Nov 2023

Doctoral Dissertations

Deep learning has significantly advanced computer vision in the past decade, paving the way for practical applications such as facial recognition and autonomous driving. However, current techniques depend heavily on human supervision, limiting their broader deployment. This dissertation tackles this problem by introducing algorithms and theories to minimize human supervision in three key areas: data, annotations, and neural network architectures, in the context of various visual understanding tasks such as object detection, image restoration, and 3D generation. First, we present self-supervised learning algorithms to handle in-the-wild images and videos that traditionally require time-consuming manual curation and labeling. We demonstrate that …


Foundations Of Node Representation Learning, Sudhanshu Chanpuriya Nov 2023

Doctoral Dissertations

Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred development of novel neural network-inspired methods for learning node representations which have largely surpassed classical 'spectral' embeddings in performance. Yet little work asks the central questions of this thesis: Why do these novel deep methods outperform their classical predecessors, and what are their limitations? We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show …


Bayesian Structural Causal Inference With Probabilistic Programming, Sam A. Witty Nov 2023

Doctoral Dissertations

Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in …


Machine Learning Modeling Of Polymer Coating Formulations: Benchmark Of Feature Representation Schemes, Nelson I. Evbarunegbe Nov 2023

Masters Theses

Polymer coatings offer a wide range of benefits across various industries, playing a crucial role in product protection and extension of shelf life. However, formulating them can be a non-trivial task given the multitude of variables and factors involved in the production process, rendering it a complex, high-dimensional problem. To tackle this problem, machine learning (ML) has emerged as a promising tool, showing considerable potential in enhancing various polymer and chemistry-based applications, particularly those dealing with high dimensional complexities.

Our research aims to develop a physics-guided ML approach to facilitate the formulations of polymer coatings. As the first step, this …


Effective And Efficient Transfer Learning In The Era Of Large Language Models, Tu Vu Nov 2023

Doctoral Dissertations

Substantial progress has been made in the field of natural language processing (NLP) due to the advent of large language models (LLMs)—deep neural networks with millions or billions of parameters pre-trained on large amounts of unlabeled data. However, these models have common weaknesses, including degenerate performance in data-scarce scenarios, and substantial computational resource requirements. This thesis aims to develop methods to address these limitations for improved applicability and performance of LLMs in resource-constrained settings with limited data and/or computational resources. To address the need for labeled data in data-scarce scenarios, I present two methods, in Chapter 2 and Chapter 3, …


Graph Representation Learning With Box Embeddings, Dongxu Zhang Aug 2023

Doctoral Dissertations

Graphs are ubiquitous data structures, present in many machine-learning tasks, such as link prediction of products and node classification of scientific papers. As gradient descent drives the training of most modern machine learning architectures, the ability to encode graph-structured data using a differentiable representation is essential to make use of this data. Most approaches encode graph structure in Euclidean space; however, it is non-trivial to model directed edges. The naive solution is to represent each node using a separate "source" and "target" vector; however, this can decouple the representation, making it harder for the model to capture information within longer …


Evidence Assisted Learning For Clinical Decision Support Systems, Bhanu Pratap Singh Rawat Aug 2023

Doctoral Dissertations

Clinical decision support systems (CDSS) provide intelligently filtered knowledge and patient-specific and population information to the clinicians, nursing staff and healthcare professionals. CDSS can significantly improve the quality, safety, efficiency and effectiveness of health care. Over the last decade, American hospitals have adopted electronic health records (EHRs) widely resulting in a massive collection of clinical notes such as admission notes, physician notes, nursing notes and discharge summaries. For the past couple of decades, most of the work in CDSS has been focused on developing knowledge-based systems using structured data such as medications and ICD codes. In contrast, the EHR notes …


Improving User Experience By Optimizing Cloud Services, Ishita Dasgupta Aug 2023

Doctoral Dissertations

Today, cloud services offer a myriad of applications, tailor-made for different users in fields such as weather, health, finance, and entertainment. These services fulfill varying genres of user demands over the Internet. For example, these services can be live (live weather radar, ESPN Live) or on-demand services (weather forecasting, Netflix). While these applications cater to different customer requirements, it is necessary for these services to be efficient with respect to latency, scalability, robustness and quality of experience. These systems need to constantly evolve to provide the best user experience and meet the most current demands of the customer. For instance, …


An Introspective Approach For Competence-Aware Autonomy, Connor Basich Aug 2023

Doctoral Dissertations

Building and deploying autonomous systems in the open world has long been a goal of both the artificial intelligence (AI) and robotics communities. From autonomous driving, to health care, to office assistance, these systems have the potential to transform society and alter our everyday lives. The open world, however, presents numerous challenges that question the typical assumptions made by the models and frameworks often used in contemporary AI and robotics. Systems in the open world are faced with an unconstrained and non-stationary environment with a range of heterogeneous actors that is too complex to be modeled in its entirety. Moreover, …


Data-Driven Modeling And Analytics For Greening The Energy Ecosystem, John Wamburu Apr 2023

Doctoral Dissertations

The energy ecosystem is undergoing a major transition from primarily using carbon-intensive energy sources to greener and renewable sources of energy. For instance, electric vehicles (EVs) are rapidly increasing in popularity thereby eliminating gas-based carbon emissions. Similarly, the increased adoption of solar is injecting greener energy into the grid, thus reducing the grid’s overall carbon footprint. At the same time, the proliferation of networked devices and sensors in the grid is enabling energy usage analysis at fine granularity. In this thesis, I argue that data-driven modeling and analytics applied to energy usage data can facilitate optimal carbon reduction in the …


Rigorous Experimentation For Reinforcement Learning, Scott M. Jordan Apr 2023

Doctoral Dissertations

Scientific fields make advancements by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although common, generating accurate knowledge from empirical experiments is often challenging due to inherent randomness in execution and confounding variables that can obscure the correct interpretation of the results. As such, researchers must hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge generated from experiments dubious. This dissertation proposes methods to address central issues in …


Learning From Sequential User Data: Models And Sample-Efficient Algorithms, Aritra Ghosh Apr 2023

Doctoral Dissertations

Recent advances in deep learning have made learning representation from ever-growing datasets possible in the domain of vision, natural language processing (NLP), and robotics, among others. However, deep networks are notoriously data-hungry; for example, training language models with attention mechanisms sometimes requires trillions of parameters and tokens. In contrast, we can often access a limited number of samples in many tasks. It is crucial to learn models from these 'limited' datasets. Learning with limited datasets can take several forms. In this thesis, we study how to select data samples sequentially such that downstream task performance is maximized. Moreover, we study …


Thermal Transport Across 2D/3D Van Der Waals Interfaces, Cameron Foss Apr 2023

Doctoral Dissertations

Designing improved field-effect-transistors (FETs) that are mass-producible and meet the fabrication standards set by legacy silicon CMOS manufacturing is required for pushing the microelectronics industry into further enhanced technological generations. Historically, the downscaling of feature sizes in FETs has enabled improved performance, reduced power consumption, and increased packing density in microelectronics for several decades. However, many are claiming Moore's law no longer applies as the era of silicon CMOS scaling potentially nears its end with designs approaching fundamental atomic-scale limits -- that is, the few- to sub-nanometer range. Ultrathin two-dimensional (2D) materials present a new paradigm of materials science and …