Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 495

Full-Text Articles in Physical Sciences and Mathematics

Model-Based Deep Autoencoders For Clustering Single-Cell RNA Sequencing Data With Side Information, Xiang Lin Dec 2023

Dissertations

Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile the activities of tens of thousands of genes within a single cell. Thousands or tens of thousands of cells can be captured simultaneously in a typical scRNA-seq experiment. Biologists would like to cluster these cells to explore and elucidate cell types or subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet, single-cell technologies have developed so fast in the past few years that existing methods have not caught up with these rapid changes and fail to fully fulfil their potential. For instance, besides profiling transcription …


Models And Algorithms For Promoting Diverse And Fair Query Results, Md Mouinul Islam Aug 2023

Dissertations

Fairness and diversity in search results are two key concerns in compelling search and recommendation applications. This work explicitly studies these two aspects given multiple users' preferences as inputs, in an effort to create a single ranking or top-k result set that satisfies different fairness and diversity criteria. From a group fairness standpoint, it adapts demographic-parity-like group fairness criteria and proposes new models that are suitable for ranking or for producing a top-k set of results. This dissertation also studies equitable exposure of individual search results in long-tail data, a concept related to individual fairness. First, the dissertation focuses …


Quantifying Balance: Computational And Learning Frameworks For The Characterization Of Balance In Bipedal Systems, Kubra Akbas Aug 2023

Dissertations

In clinical practice and general healthcare settings, the lack of reliable and objective balance and stability assessment metrics hinders the tracking of patient performance progression during rehabilitation; the assessment of bipedal balance plays a crucial role in understanding stability and falls in humans and other bipeds, while providing clinicians important information regarding rehabilitation outcomes. Bipedal balance has often been examined through kinematic or kinetic quantities, such as the Zero Moment Point and Center of Pressure; however, analyzing balance specifically through the body's Center of Mass (COM) state offers a holistic and easily comprehensible view of balance and stability.

Building upon …


Learning Representations For Effective And Explainable Software Bug Detection And Fixing, Yi Li Aug 2023

Dissertations

Software has an integral role in modern life; hence software bugs, which undermine software quality and reliability, have substantial societal and economic implications. The advent of machine learning and deep learning in software engineering has led to major advances in bug detection and fixing approaches, yet they fall short of desired precision and recall. This shortfall arises from the absence of a 'bridge,' known as learning code representations, that can transform information from source code into a suitable representation for effective processing via machine and deep learning.

This dissertation builds such a bridge. Specifically, it presents solutions for effectively learning …


Fortifying Robustness: Unveiling The Intricacies Of Training And Inference Vulnerabilities In Centralized And Federated Neural Networks, Guanxiong Liu Aug 2023

Dissertations

Neural network (NN) classifiers have gained significant traction in diverse domains such as natural language processing, computer vision, and cybersecurity, owing to their remarkable ability to approximate complex latent distributions from data. Nevertheless, the conventional assumption of an attack-free operating environment has been challenged by the emergence of adversarial examples. These perturbed samples, which are typically imperceptible to human observers, can lead to misclassifications by the NN classifiers. Moreover, recent studies have uncovered the ability of poisoned training data to generate Trojan backdoored classifiers that exhibit misclassification behavior triggered by predefined patterns.

In recent years, significant research efforts have been …


Bacterial Motion And Spread In Porous Environments, Yasser Almoteri Aug 2023

Dissertations

Micro-swimmers are ubiquitous in nature, from soil and water to mammalian bodies, and even in many technological processes. Commonly known examples are microbes such as bacteria, micro-algae and micro-plankton, cells such as spermatozoa, and organisms such as nematodes. These swimmers live and have evolved in multiplex environments and complex flows in the presence of other swimmers and types, inert particles and fibers, interfaces and non-trivial confinements and more. Understanding the locomotion and interactions of these individual micro-swimmers in such impure viscous fluids is crucial to understanding the emergent dynamics of such complex systems, and to further enabling us to control and …


Diversification And Fairness In Top-K Ranking Algorithms, Mahsa Asadi Aug 2023

Dissertations

Given a user query, the typical user interfaces, such as search engines and recommender systems, only allow a small number of results to be returned to the user. Hence, figuring out what would be the top-k results is an important task in information retrieval, as it helps to ensure that the most relevant results are presented to the user. There exists an extensive body of research that studies how to score the records and return top-k to the user. Moreover, there exists an extensive set of criteria that researchers identify to present the user with top-k results, and result diversification …


Human-AI Complex Task Planning, Sepideh Nikookar Aug 2023

Dissertations

The process of complex task planning is ubiquitous and arises in a variety of compelling applications. A few leading examples include designing a personalized course plan or trip plan, designing music playlists/work sessions in web applications, or even planning routes of naval assets to collaboratively discover an unknown destination. For all of these aforementioned applications, creating a plan requires satisfying a basic construct, i.e., composing a sequence of sub-tasks (or items) that optimizes several criteria and satisfies constraints. For instance, in course planning, sub-tasks or items are core and elective courses, and degree requirements capture their complex dependencies as constraints. …


Program Analysis For Android Security And Reliability, Sydur Rahaman Aug 2023

Dissertations

The recent, widespread growth and adoption of mobile devices have revolutionized the way users interact with technology. As mobile apps have become increasingly prevalent, concerns regarding their security and reliability have gained significant attention. The ever-expanding mobile app ecosystem presents unique challenges in ensuring the protection of user data and maintaining app robustness. This dissertation expands the field of program analysis with techniques and abstractions tailored explicitly to enhancing Android security and reliability. This research introduces approaches for addressing critical issues related to sensitive information leakage, device and user fingerprinting, mobile medical score calculators, as well as termination-induced data loss. …


Toward Smart And Efficient Scientific Data Management, Jinzhen Wang Aug 2023

Dissertations

Scientific research generates vast amounts of data, and the scale of data has significantly increased with advancements in scientific applications. To manage this data effectively, lossy data compression techniques are necessary to reduce storage and transmission costs. Nevertheless, the use of lossy compression introduces uncertainties related to its performance. This dissertation aims to answer key questions surrounding lossy data compression, such as how the performance changes, how much reduction can be achieved, and how to optimize these techniques for modern scientific data management workflows.

One of the major challenges in adopting lossy compression techniques is the trade-off between data accuracy …


Data-Driven 2D Materials Discovery For Next-Generation Electronics, Zeyu Zhang Aug 2023

Dissertations

The development of material discovery and design has lasted centuries in human history. Since the concepts of modern chemistry and materials science were established, material discovery has relied on experiments. Such a strategy has become expensive and time-consuming as the number of materials grows. Therefore, a novel strategy that is faster and more comprehensive is urgently needed. In this dissertation, an experiment-guided material discovery strategy is developed and explained using metal-organic frameworks (MOFs) as examples. The advent of π-stacked layered MOFs, which offer electrical conductivity on top of permanent porosity and high surface area, opened up new …


Machine Learning And Network Embedding Methods For Gene Co-Expression Networks, Niloofar Aghaieabiane May 2023

Dissertations

High-throughput technologies such as DNA microarrays and RNA-seq are used to measure the expression levels of large numbers of genes simultaneously. To support the extraction of biological knowledge, individual gene expression levels are transformed into Gene Co-expression Networks (GCNs). GCNs are analyzed to discover gene modules. GCN construction and analysis has been a well-studied topic for nearly two decades. While new types of sequencing and the corresponding data are now available, the software package WGCNA and its most recent variants are still widely used, contributing to biological discovery.

The discovery of biologically significant modules of genes from raw expression data is …


Trustworthy Machine Learning Through The Lens Of Privacy And Security, Thi Kim Phung Lai May 2023

Dissertations

Nowadays, machine learning (ML) has become ubiquitous and is transforming society. However, there are still many incidents caused by ML-based systems when ML is deployed in real-world scenarios. Therefore, to allow wide adoption of ML in the real world, especially in critical applications such as healthcare, finance, etc., it is crucial to develop ML models that are not only accurate but also trustworthy (e.g., explainable, privacy-preserving, secure, and robust). Achieving trustworthy ML with different machine learning paradigms (e.g., deep learning, centralized learning, federated learning, etc.), and application domains (e.g., computer vision, natural language, human study, malware systems, etc.) is challenging, …


Mapping Programs To Equations, Hessamaldin Mohammadi May 2023

Dissertations

Extracting the function of a program from a static analysis of its source code is a valuable capability in software engineering; at a time when there is increasing talk of using AI (Artificial Intelligence) to generate software from natural language specifications, it becomes increasingly important to determine the exact function of software as written, to figure out what AI has understood the natural language specification to mean. For all its criticality, the ability to derive the domain-to-range function of a program has proved to be an elusive goal, due primarily to the difficulty of deriving the function of iterative statements. …


Deep Hybrid Modeling Of Neuronal Dynamics Using Generative Adversarial Networks, Soheil Saghafi May 2023

Dissertations

Mechanistic modeling and machine learning methods are powerful techniques for approximating biological systems and making accurate predictions from data. However, when used in isolation these approaches suffer from distinct shortcomings: model and parameter uncertainty limit mechanistic modeling, whereas machine learning methods disregard the underlying biophysical mechanisms. This dissertation constructs Deep Hybrid Models that address these shortcomings by combining deep learning with mechanistic modeling. In particular, this dissertation uses Generative Adversarial Networks (GANs) to provide an inverse mapping of data to mechanistic models and identifies the distributions of mechanistic model parameters coherent to the data.

Chapter 1 provides background information on …


A Survey On Online Matching And Ad Allocation, Ryan Lee May 2023

Theses

One of the classical problems in graph theory is matching: given an undirected graph, find a matching, which is a set of edges without common vertices. In the 1990s, Richard Karp, Umesh Vazirani, and Vijay Vazirani were the first computer scientists to use matchings for online algorithms [8]. In our domain, an online algorithm operates in the online setting where a bipartite graph is given. On one side of the graph there is a set of advertisers and on the other side there is a set of impressions. During the online phase, multiple impressions will arrive and the objective of …


Using Materialized Views For Answering Graph Pattern Queries, Michael Lan Dec 2022

Dissertations

Discovering patterns in graphs by evaluating graph pattern queries involving direct (edge-to-edge mapping) and reachability (edge-to-path mapping) relationships under homomorphisms on data graphs has been extensively studied. Previous studies have aimed to reduce the evaluation time of graph pattern queries due to the potentially numerous matches on large data graphs.

In this work, the concept of the summary graph is developed to improve the evaluation of tree pattern queries and graph pattern queries. The summary graph first filters out candidate matches which violate certain reachability constraints, and then finds local matches of query edges. This reduces redundancy in the representation …


Android Security: Analysis And Applications, Raina Samuel Dec 2022

Dissertations

The Android mobile system is home to millions of apps that offer a wide range of functionalities. Users rely on Android apps in various facets of daily life, including critical, e.g., medical, settings. Generally, users trust that apps perform their stated purpose safely and accurately. However, despite the platform’s efforts to maintain a safe environment, apps routinely manage to evade scrutiny. This dissertation analyzes Android app behavior and reveals several weaknesses: lapses in device authentication schemes, deceptive practices such as apps covering their traces, as well as behavioral and descriptive inaccuracies in medical apps. Examining a large corpus of …


Machine Learning-Based Data Analytics For Understanding Space Weather And Climate, Yasser Abduallah Dec 2022

Dissertations

This dissertation addresses multiple crucial problems in space weather and climate, presenting new machine learning-based data analytics algorithms and models for tackling the problems.

First, the dissertation presents two new approaches to predicting solar flares. One approach, called DeepSun, predicts solar flares by utilizing a machine-learning-as-a-service (MLaaS) platform. The DeepSun system provides a friendly interface for Web users and an application programming interface (API) for remote programming users. It adopts an ensemble learning method that employs several machine learning algorithms to perform multiclass flare prediction. The other approach, named SolarFlareNet, forecasts the occurrence of solar flares within the next 24 …


A Neural Analysis-Synthesis Approach To Learning Procedural Audio Models, Danzel Serrano Dec 2022

Theses

The effective sound design of environmental sounds is crucial to demonstrating an immersive experience. Classical Procedural Audio (PA) models have been developed to give the sound designer a fast way to synthesize a specific class of environmental sounds in a physically accurate and computationally efficient manner. These models are controllable due to the choice of parameters from analyzing a class of sound. However, the resulting synthesis lacks the fidelity for the preferred immersive experience; thus, the sound designer would rather search through an extensive database for real recordings of a target sound class. This thesis proposes the Procedural audio Variational …


Computation Of Risk Measures In Finance And Parallel Real-Time Scheduling, Yajuan Li Aug 2022

Dissertations

Many application areas employ various risk measures, such as a quantile, to assess risks. For example, in finance, risk managers employ a quantile to help determine appropriate levels of capital needed to be able to absorb (with high probability) large unexpected losses in credit portfolios comprising loans, bonds, and other financial instruments subject to default. This dissertation discusses the computation of risk measures in finance and parallel real-time scheduling.

Firstly, two estimation approaches are compared for one risk measure, a quantile, via randomized quasi-Monte Carlo (RQMC) in an asymptotic setting where the number of randomizations for RQMC grows large, but …


Low-Reynolds-Number Locomotion Via Reinforcement Learning, Yuexin Liu Aug 2022

Dissertations

This dissertation summarizes computational results from applying reinforcement learning and deep neural networks to the design of artificial microswimmers in the inertialess regime, where the viscous dissipation in the surrounding fluid environment dominates and the swimmer’s inertia is completely negligible. In particular, the work in this dissertation consists of four interrelated studies of the design of microswimmers for different tasks: (1) a one-dimensional microswimmer in free space that moves towards the target via translation, (2) a one-dimensional microswimmer in a periodic domain that rotates to reach the target, (3) a two-dimensional microswimmer that switches gaits to navigate to the designated targets in …


Efficient And Scalable Triangle Centrality Algorithms In The Arkouda Framework, Joseph Thomas Patchett Aug 2022

Theses

Graph data structures provide a unique challenge for both analysis and algorithm development. These data structures are irregular in that memory accesses are not known a priori and accesses to these structures tend to lack locality.

Despite these challenges, graph data structures are a natural way to represent relationships between entities and to exhibit unique features about these relationships. The network created from these relationships can create unique local structures that can describe the behavior between members of these structures. Graphs can be analyzed in a number of different ways including at a high level in community detection and at …


One-Stage Blind Source Separation Via A Sparse Autoencoder Framework, Jason Anthony Dabin May 2022

Dissertations

Blind source separation (BSS) is the process of recovering individual source transmissions from a received mixture of co-channel signals without a priori knowledge of the channel mixing matrix or transmitted source signals. The received co-channel composite signal is considered to be captured across an antenna array or sensor network and is assumed to contain sparse transmissions, as users are active and inactive aperiodically over time. An unsupervised machine learning approach using an artificial feedforward neural network sparse autoencoder with one hidden layer is formulated for blindly recovering the channel matrix and source activity of co-channel transmissions. The BSS sparse autoencoder …


Understanding The Voluntary Moderation Practices In Live Streaming Communities, Jie Cai May 2022

Dissertations

Harmful content, such as hate speech, online abuse, harassment, and cyberbullying, proliferates across various online communities. Live streaming as a novel online community provides ways for thousands of users (viewers) to entertain and engage with a broadcaster (streamer) in real-time in the chatroom. While the streamer has the camera on and the screen shared, tens of thousands of viewers are watching and messaging in real-time, resulting in concerns about harassment and cyberbullying. To regulate harmful content (toxic messages in the chatroom), streamers rely on a combination of automated tools and volunteer human moderators (mods) to block users or remove content, which …


A Self-Learning Intersection Control System For Connected And Automated Vehicles, Ardeshir Mirbakhsh May 2022

Dissertations

This study proposes a Decentralized Sparse Coordination Learning System (DSCLS) based on Deep Reinforcement Learning (DRL) to control intersections in a Connected and Automated Vehicles (CAVs) environment. In this approach, roadway sections are divided into small areas, and vehicles try to reserve their desired area ahead of time. Depending on whether a vehicle shares a desired area with other CAVs, it is in either an independent or a coordinated state. Individual CAVs are held accountable for decision-making at each step in both coordinated and independent states. In the training process, CAVs learn to minimize the overall delay at the intersection. Due to the …


Local Learning Algorithms For Stochastic Spiking Neural Networks, Bleema Rosenfeld May 2022

Dissertations

This dissertation focuses on the development of machine learning algorithms for spiking neural networks, with an emphasis on local three-factor learning rules that are in keeping with the constraints imposed by current neuromorphic hardware. Spiking neural networks (SNNs) are an alternative to artificial neural networks (ANNs) that follow a similar graphical structure but use a processing paradigm more closely modeled after the biological brain in an effort to harness its low power processing capability. SNNs use an event based processing scheme which leads to significant power savings when implemented in dedicated neuromorphic hardware such as Intel’s Loihi chip.

This work …


Optimization Opportunities In Human In The Loop Computational Paradigm, Dong Wei May 2022

Dissertations

An emerging trend is to leverage human capabilities in the computational loop at different capacities, ranging from tapping knowledge from a richly heterogeneous pool of knowledge resident in the general population to soliciting expert opinions. These practices are, in general, termed human-in-the-loop (HITL) computations.

A HITL process requires holistic treatment and optimization from multiple standpoints considering all stakeholders: a. applications, b. platforms, c. humans. In application-centric optimization, the factors of interest usually are latency (how long it takes for a set of tasks to finish), cost (the monetary or computational expenses incurred in the process), and quality of the completed …


Towards Practicalization Of Blockchain-Based Decentralized Applications, Songlin He May 2022

Dissertations

Blockchain can be defined as an immutable ledger for recording transactions, maintained in a distributed network of mutually untrusting peers. Blockchain technology has been widely applied to various fields beyond its initial usage of cryptocurrency. However, blockchain itself is insufficient to meet all the desired security or efficiency requirements for diversified application scenarios. This dissertation focuses on two core functionalities that blockchain provides, i.e., robust storage and reliable computation. Three concrete application scenarios including Internet of Things (IoT), cybersecurity management (CSM), and peer-to-peer (P2P) content delivery network (CDN) are utilized to elaborate the general design principles for these two main …


Representation Learning In Finance, Ajim Uddin May 2022

Dissertations

Finance studies often employ heterogeneous datasets from different sources with different structures and frequencies. Some data are noisy, sparse, and unbalanced with missing values; some are unstructured, containing text or networks. Traditional techniques often struggle to combine and effectively extract information from these datasets. This work explores representation learning as a proven machine learning technique in learning informative embedding from complex, noisy, and dynamic financial data. This dissertation proposes novel factorization algorithms and network modeling techniques to learn the local and global representation of data in two specific financial applications: analysts’ earnings forecasts and asset pricing.

Financial analysts’ earnings forecast …