Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons

Open Access. Powered by Scholars. Published by Universities.®

Doctoral Dissertations

Theses/Dissertations

Discipline
Institution
Keyword
Publication Year

Articles 1 - 30 of 46

Full-Text Articles in Theory and Algorithms

Automated Identification And Mapping Of Interesting Mineral Spectra In Crism Images, Arun M. Saranathan Mar 2024

Automated Identification And Mapping Of Interesting Mineral Spectra In Crism Images, Arun M. Saranathan

Doctoral Dissertations

The Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) has proven to be an invaluable tool for the mineralogical analysis of the Martian surface. It has been crucial in identifying and mapping the spatial extents of various minerals. Primarily, the identification and mapping of these mineral spectral-shapes have been performed manually. Given the size of the CRISM image dataset, manual analysis of the full dataset would be arduous/infeasible. This dissertation attempts to address this issue by describing an (machine learning based) automated processing pipeline for CRISM data that can be used to identify and map the unique mineral signatures present in …


Foundations Of Node Representation Learning, Sudhanshu Chanpuriya Nov 2023

Foundations Of Node Representation Learning, Sudhanshu Chanpuriya

Doctoral Dissertations

Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred development of novel neural network-inspired methods for learning node representations which have largely surpassed classical 'spectral' embeddings in performance. Yet little work asks the central questions of this thesis: Why do these novel deep methods outperform their classical predecessors, and what are their limitations? We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show …


Graph Representation Learning With Box Embeddings, Dongxu Zhang Aug 2023

Graph Representation Learning With Box Embeddings, Dongxu Zhang

Doctoral Dissertations

Graphs are ubiquitous data structures, present in many machine-learning tasks, such as link prediction of products and node classification of scientific papers. As gradient descent drives the training of most modern machine learning architectures, the ability to encode graph-structured data using a differentiable representation is essential to make use of this data. Most approaches encode graph structure in Euclidean space, however, it is non-trivial to model directed edges. The naive solution is to represent each node using a separate "source" and "target" vector, however, this can decouple the representation, making it harder for the model to capture information within longer …


Explicit Constructions Of Canonical And Absolute Minimal Degree Lifts Of Twisted Edwards Curves, William Coleman Bitting Iv May 2023

Explicit Constructions Of Canonical And Absolute Minimal Degree Lifts Of Twisted Edwards Curves, William Coleman Bitting Iv

Doctoral Dissertations

Twisted Edwards Curves are a representation of Elliptic Curves given by the solutions of bx^2 + y^2 = 1 + ax^2y^2. Due to their simple and unified formulas for adding distinct points and doubling, Twisted Edwards Curves have found extensive applications in fields such as cryptography. In this thesis, we study the Canonical Liftings of Twisted Edwards Curves and the associated lift of points Elliptic Teichmu ̈ller Lift. The coordinate functions of the latter are proved to be polynomials, and their degrees and derivatives are computed. Moreover, an algorithm is described for explicit computations, and some properties of the general …


Hyperspectral Unmixing: A Theoretical Aspect And Applications To Crism Data Processing, Yuki Itoh Oct 2022

Hyperspectral Unmixing: A Theoretical Aspect And Applications To Crism Data Processing, Yuki Itoh

Doctoral Dissertations

Hyperspectral imaging has been deployed in earth and planetary remote sensing, and has contributed the development of new methods for monitoring the earth environment and new discoveries in planetary science. It has given scientists and engineers a new way to observe the surface of earth and planetary bodies by measuring the spectroscopic spectrum at a pixel scale. Hyperspectal images require complex processing before practical use. One of the important goals of hyperspectral imaging is to obtain the images of reflectance spectrum. A raw image obtained by hyperspectral remote sensing usually undergoes conversion to a physical quantity representing the intensity of …


Combinatorial Algorithms For Graph Discovery And Experimental Design, Raghavendra K. Addanki Oct 2022

Combinatorial Algorithms For Graph Discovery And Experimental Design, Raghavendra K. Addanki

Doctoral Dissertations

In this thesis, we study the design and analysis of algorithms for discovering the structure and properties of an unknown graph, with applications in two different domains: causal inference and sublinear graph algorithms. In both these domains, graph discovery is possible using restricted forms of experiments, and our objective is to design low-cost experiments. First, we describe efficient experimental approaches to the causal discovery problem, which in its simplest form, asks us to identify the causal relations (edges of the unknown graph) between variables (vertices of the unknown graph) of a given system. For causal discovery, we study algorithms …


Scalable Data Analytics For Relational Databases, Graphs And Videos, Fubao Wu Jun 2022

Scalable Data Analytics For Relational Databases, Graphs And Videos, Fubao Wu

Doctoral Dissertations

Data analytics is to analyze raw data and mine insights, trends, and patterns from them. Due to the dramatic increase in data volume and size in recent years with the development of big data and cloud storage, big data analytics algorithms and techniques have been faced with more challenges. Moreover, there are various types of data formats, such as relational databases, text data, audio data, and image/video data. It is challenging to generate a unified framework or algorithm for data analytics on various data formats. Different data formats still need refined and scalable algorithms. In this dissertation, we explore three …


Mixture Models In Machine Learning, Soumyabrata Pal Mar 2022

Mixture Models In Machine Learning, Soumyabrata Pal

Doctoral Dissertations

Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees of mixture models remain unknown for several important settings. In this thesis, we look at three groups of problems. The first part …


Algorithms To Exploit Data Sparsity, Larkin H. Flodin Oct 2021

Algorithms To Exploit Data Sparsity, Larkin H. Flodin

Doctoral Dissertations

While data in the real world is very high-dimensional, it generally has some underlying structure; for instance, if we think of an image as a set of pixels with associated color values, most possible settings of color values correspond to something more like random noise than what we typically think of as a picture. With an appropriate transformation of basis, this underlying structure can often be converted into "sparsity" in data, giving an equivalent representation of the data where the magnitude is large in only a few directions relative to the ambient dimension. This motivates a variety of theoretical questions …


Optimal Communication Structures For Concurrent Computing, Andrii Berdnikov May 2021

Optimal Communication Structures For Concurrent Computing, Andrii Berdnikov

Doctoral Dissertations

This research focuses on communicative solvers that run concurrently and exchange information to improve performance. This “team of solvers” enables individual algorithms to communicate information regarding their progress and intermediate solutions, and allows them to synchronize memory structures with more “successful” counterparts. The result is that fewer nodes spend computational resources on “struggling” processes. The research is focused on optimization of communication structures that maximize algorithmic efficiency using the theoretical framework of Markov chains. Existing research addressing communication between the cooperative solvers on parallel systems lacks generality: Most studies consider a limited number of communication topologies and strategies, while the …


Machine Learning With Topological Data Analysis, Ephraim Robert Love May 2021

Machine Learning With Topological Data Analysis, Ephraim Robert Love

Doctoral Dissertations

Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods of exploiting the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher dimensional features than immediately apparent in the data. A common tool for conducting TDA is persistence homology, which measures the significance of these invariants. Persistence homology has prominent realizations in methods of data visualization, statistics and machine learning. Extending ML with …


Compact Representations Of Uncertainty In Clustering, Craig Stuart Greenberg Apr 2021

Compact Representations Of Uncertainty In Clustering, Craig Stuart Greenberg

Doctoral Dissertations

Flat clustering and hierarchical clustering are two fundamental tasks, often used to discover meaningful structures in data, such as subtypes of cancer, phylogenetic relationships, taxonomies of concepts, and cascades of particle decays in particle physics. When multiple clusterings of the data are possible, it is useful to represent uncertainty in clustering through various probabilistic quantities, such as the distribution over partitions or tree structures, and the marginal probabilities of subpartitions or subtrees. Many compact representations exist for structured prediction problems, enabling the efficient computation of probability distributions, e.g., a trellis structure and corresponding Forward-Backward algorithm for Markov models that model …


Algorithms For Massive, Expensive, Or Otherwise Inconvenient Graphs, David Tench Dec 2020

Algorithms For Massive, Expensive, Or Otherwise Inconvenient Graphs, David Tench

Doctoral Dissertations

A long-standing assumption common in algorithm design is that any part of the input is accessible at any time for unit cost. However, as we work with increasingly large data sets, or as we build smaller devices, we must revisit this assumption. In this thesis, I present some of my work on graph algorithms designed for circumstances where traditional assumptions about inputs do not apply.
1. Classical graph algorithms require direct access to the input graph and this is not feasible when the graph is too large to fit in memory. For computation on massive graphs we consider the dynamic …


Benchmarks And Controls For Optimization With Quantum Annealing, Erica Kelley Grant Dec 2020

Benchmarks And Controls For Optimization With Quantum Annealing, Erica Kelley Grant

Doctoral Dissertations

Quantum annealing (QA) is a metaheuristic specialized for solving optimization problems which uses principles of adiabatic quantum computing, namely the adiabatic theorem. Some devices implement QA using quantum mechanical phenomena. These QA devices do not perfectly adhere to the adiabatic theorem because they are subject to thermal and magnetic noise. Thus, QA devices return statistical solutions with some probability of success where this probability is affected by the level of noise of the system. As these devices improve, it is believed that they will become less noisy and more accurate. However, some tuning strategies may further improve that probability of …


Bayesian Topological Machine Learning, Christopher A. Oballe Aug 2020

Bayesian Topological Machine Learning, Christopher A. Oballe

Doctoral Dissertations

Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …


A Framework For Performance-Based Facade Design: Approach For Automated And Multi-Objective Simulation And Optimization, Mahsa Minaei Jul 2020

A Framework For Performance-Based Facade Design: Approach For Automated And Multi-Objective Simulation And Optimization, Mahsa Minaei

Doctoral Dissertations

Buildings have a considerable impact on the environment, and it is crucial to consider environmental and energy performance in building design. Buildings account for about 40% of the global energy consumption and contribute over 30% of the CO2 emissions. A large proportion of this energy is used for meeting occupants’ thermal comfort in buildings, followed by lighting. The building facade forms a barrier between the exterior and interior environments; therefore, it has a crucial role in improving energy efficiency and building performance. In this regard, decision-makers are required to establish an optimal solution, considering multi-objective problems that are usually competitive …


Tools For Tutoring Theoretical Computer Science Topics, Mark Mccartin-Lim Nov 2019

Tools For Tutoring Theoretical Computer Science Topics, Mark Mccartin-Lim

Doctoral Dissertations

This thesis introduces COMPLEXITY TUTOR, a tutoring system to assist in learning abstract proof-based topics, which has been specifically targeted towards the population of computer science students studying theoretical computer science. Existing literature has shown tremendous educational benefits produced by active learning techniques, student-centered pedagogy, gamification and intelligent tutoring systems. However, previously, there had been almost no research on adapting these ideas to the domain of theoretical computer science. As a population, computer science students receive immediate feedback from compilers and debuggers, but receive no similar level of guidance for theoretical coursework. One hypothesis of this thesis is that immediate …


Software-Defined Infrastructure For Iot-Based Energy Systems, Stephen Lee Oct 2019

Software-Defined Infrastructure For Iot-Based Energy Systems, Stephen Lee

Doctoral Dissertations

Internet of Things (IoT) devices are becoming an essential part of our everyday lives. These physical devices are connected to the internet and can measure or control the environment around us. Further, IoT devices are increasingly being used to monitor buildings, farms, health, and transportation. As these connected devices become more pervasive, these devices will generate vast amounts of data that can be used to gain insights and build intelligence into the system. At the same time, large-scale deployment of these devices will raise new challenges in efficiently managing and controlling them. In this thesis, I argue that the IoT …


Function And Dissipation In Finite State Automata - From Computing To Intelligence And Back, Natesh Ganesh Oct 2019

Function And Dissipation In Finite State Automata - From Computing To Intelligence And Back, Natesh Ganesh

Doctoral Dissertations

Society has benefited from the technological revolution and the tremendous growth in computing powered by Moore's law. However, we are fast approaching the ultimate physical limits in terms of both device sizes and the associated energy dissipation. It is important to characterize these limits in a physically grounded and implementation-agnostic manner, in order to capture the fundamental energy dissipation costs associated with performing computing operations with classical information in nano-scale quantum systems. It is also necessary to identify and understand the effect of quantum in-distinguishability, noise, and device variability on these dissipation limits. Identifying these parameters is crucial to designing …


Massive Graph Analysis In The Data Stream Model, Sofya Vorotnikova Jul 2019

Massive Graph Analysis In The Data Stream Model, Sofya Vorotnikova

Doctoral Dissertations

Graphs have become an abstraction of choice in modeling highly-structured data. The need to compute graph-theoretic properties of datasets arises in many applications that involve entities and pairwise relations between them. However, in practice the datasets in question can be too large to be stored in main memory, distributed across many machines, or changing over time. Moreover, in an increasing number of applications the algorithm has to make real time decisions as the data arrives, which puts further limitations on the time and space that can realistically be used. These characteristics render classical algorithmic approaches obsolete and necessitate the development …


Data Stream Algorithms For Large Graphs And High Dimensional Data, Hoa Vu Oct 2018

Data Stream Algorithms For Large Graphs And High Dimensional Data, Hoa Vu

Doctoral Dissertations

In contrast to the traditional random access memory computational model where the entire input is available in the working memory, the data stream model only provides sequential access to the input. The data stream model is a natural framework to handle large and dynamic data. In this model, we focus on designing algorithms that use sublinear memory and a small number of passes over the stream. Other desirable properties include fast update time, query time, and post processing time. In this dissertation, we consider different problems in graph theory, combinatorial optimization, and high dimensional data processing. The first part of …


A Study Of High Performance Multiple Precision Arithmetic On Graphics Processing Units, Niall Emmart Mar 2018

A Study Of High Performance Multiple Precision Arithmetic On Graphics Processing Units, Niall Emmart

Doctoral Dissertations

Multiple precision (MP) arithmetic is a core building block of a wide variety of algorithms in computational mathematics and computer science. In mathematics MP is used in computational number theory, geometric computation, experimental mathematics, and in some random matrix problems. In computer science, MP arithmetic is primarily used in cryptographic algorithms: securing communications, digital signatures, and code breaking. In most of these application areas, the factor that limits performance is the MP arithmetic. The focus of our research is to build and analyze highly optimized libraries that allow the MP operations to be offloaded from the CPU to the GPU. …


Adaft: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu Nov 2017

Adaft: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu

Doctoral Dissertations

Cyber-physical systems frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time the physical plant is in a state that allows for a lower level of fault-tolerance. Avoiding the continuous deployment of massive fault-tolerance will greatly reduce the workload of CPSs. In this dissertation, we demonstrate a software simulation framework (AdaFT) that can automatically generate the sub-spaces within which our adaptive fault-tolerance can be applied. We also show the theoretical benefits of AdaFT, and …


The Complexity Of Resilience, Cibele Matos Freire Nov 2017

The Complexity Of Resilience, Cibele Matos Freire

Doctoral Dissertations

One focus area in data management research is to understand how changes in the data can affect the output of a view or standing query. Example applications are explaining query results and propagating updates through views. In this thesis we study the complexity of the Resilience problem, which is the problem of finding the minimum number of tuples that need to be deleted from the database in order to change the result of a query. We will see that resilience is closely related to the well-studied problems of deletion propagation and causal responsibility, and that analyzing its complexity offers important …


Problems In Graph-Structured Modeling And Learning, James Atwood Jul 2017

Problems In Graph-Structured Modeling And Learning, James Atwood

Doctoral Dissertations

This thesis investigates three problems in graph-structured modeling and learning. We first present a method for efficiently generating large instances from nonlinear preferential attachment models of network structure. This is followed by a description of diffusion-convolutional neural networks, a new model for graph-structured data which is able to outperform probabilistic relational models and kernel-on-graph methods at node classification tasks. We conclude with an optimal privacy-protection method for users of online services that remains effective when users have poor knowledge of an adversary's behavior.


Applications Of Sampling And Estimation On Networks, Fabricio Murai Ferreira Nov 2016

Applications Of Sampling And Estimation On Networks, Fabricio Murai Ferreira

Doctoral Dissertations

Networks or graphs are fundamental abstractions that allow us to study many important real systems, such as the Web, social networks and scientific collaboration. It is impossible to completely understand these systems and answer fundamental questions related to them without considering the way their components are connected, i.e., their topology. However, topology is not the only relevant aspect of networks. Nodes often have information associated with them, which can be regarded as node attributes or labels. An important problem is then how to characterize a network w.r.t. topology and node label distributions. Another important problem is how to design efficient …


Stochastic Network Design: Models And Scalable Algorithms, Xiaojian Wu Nov 2016

Stochastic Network Design: Models And Scalable Algorithms, Xiaojian Wu

Doctoral Dissertations

Many natural and social phenomena occur in networks. Examples include the spread of information, ideas, and opinions through a social network, the propagation of an infectious disease among people, and the spread of species within an interconnected habitat network. The ability to modify a phenomenon towards some desired outcomes has widely recognized benefits to our society and the economy. The outcome of a phenomenon is largely determined by the topology or properties of its underlying network. A decision maker can take management actions to modify a network and, therefore, change the outcome of the phenomenon. A management action is an …


Lattice Boltzmann Methods For Wind Energy Analysis, Stephen Lloyd Wood Aug 2016

Lattice Boltzmann Methods For Wind Energy Analysis, Stephen Lloyd Wood

Doctoral Dissertations

An estimate of the United States wind potential conducted in 2011 found that the energy available at an altitude of 80 meters is approximately triple the wind energy available 50 meters above ground. In 2012, 43% of all new electricity generation installed in the U.S. (13.1 GW) came from wind power. The majority of this power, 79%, comes from large utility scale turbines that are being manufactured at unprecedented sizes. Existing wind plants operate with a capacity factor of only approximately 30%. Measurements have shown that the turbulent wake of a turbine persists for many rotor diameters, inducing increased vibration …


Efficient Inference, Search And Evaluation For Latent Variable Models Of Text With Applications To Information Retrieval And Machine Translation, Kriste Krstovski Jul 2016

Efficient Inference, Search And Evaluation For Latent Variable Models Of Text With Applications To Information Retrieval And Machine Translation, Kriste Krstovski

Doctoral Dissertations

Latent variable models of text, such as topic models, have been explored in many areas of natural language processing, information retrieval and machine translation to aid tasks such as exploratory data analysis, automated topic clustering and finding similar documents in mono- and multilingual collections. Many additional applications of these models, however, could be enabled by more efficient techniques for processing large datasets. In this thesis, we introduce novel methods that offer efficient inference, search and evaluation for latent variable models of text. We present efficient, online inference for representing documents in several languages in a common topic space and fast …


Wind Farm Wake Modeling And Analysis Of Wake Impacts In A Wind Farm, Yujia Hao Jul 2016

Wind Farm Wake Modeling And Analysis Of Wake Impacts In A Wind Farm, Yujia Hao

Doctoral Dissertations

More and more wind turbines have been grouped in the same location during the last decades to take the advantage of profitable wind resources and reduced maintenance cost. However wind turbines located in a wind farm are subject to a wind field that is substantially modified compared to the ambient wind field due to wake effects. The wake results in a reduced power production, increased load variation on the waked turbine, and reduced wake farm efficiency. Therefore the wake has long been an important concern for the wind farm installation, maintenance, and control. Thus a wake simulation tool is required. …