Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 117

Full-Text Articles in Physical Sciences and Mathematics

Context In Computer Vision: A Taxonomy, Multi-Stage Integration, And A General Framework, Xuan Wang Jun 2024

Dissertations, Theses, and Capstone Projects

Contextual information has been widely used in many computer vision tasks, such as object detection, video action detection, and image classification. Recognizing a single object or action out of context can sometimes be very challenging, and contextual information may greatly improve the understanding of a scene or an event. However, existing approaches design specific contextual information mechanisms for different detection tasks.

In this research, we first present a comprehensive survey of context understanding in computer vision, with a taxonomy to describe context in different types and levels. Then we propose MultiCLU, a new multi-stage context learning and utilization framework, …


The Efficacy Of Using Machine Learning Techniques For Identifying And Classifying “Fake News”, Muhammad Islam Jun 2024

Dissertations, Theses, and Capstone Projects

In today's digital world, detecting fake news has emerged as a critical challenge, one that has significant effects on democracy and public discourse at large, both regionally and globally. This research studies how the diversity of news sources in training datasets affects how well machine learning models can classify fake vs true news. I used Linear Support Vector Classification (LinearSVC) to create and compare two classification models: one was trained on a dataset that only had real news from a single source, Reuters (Dataset 1), and the other was trained on a dataset that contained real news from Reuters, The …


Assessing Job Vulnerability And Employment Growth In The Era Of Large Language Models (LLMs), Prudence P. Brou Jun 2024

Dissertations, Theses, and Capstone Projects

This paper explores the impact of Large Language Models (LLMs) and artificial intelligence (AI) on white-collar occupations in the context of job vulnerability and employment growth. Utilizing the Kaggle dataset "Occupation Salary and Likelihood of Automation," the study employs a data-driven approach to analyze trends across states. Through interactive data visualization, the project aims to provide actionable insights for affected workers, businesses, and policymakers navigating the changing dynamics of the workforce amidst technological advancements.


Deep Learning-Based Human Action Understanding In Videos, Elahe Vahdani Feb 2024

Dissertations, Theses, and Capstone Projects

The understanding of human actions in videos holds immense potential for technological advancement and societal betterment. This thesis explores fundamental aspects of this field, including action recognition in trimmed clips and action localization in untrimmed videos. Trimmed videos contain only one action instance, with moments before or after the action excluded from the video. However, the majority of videos captured in unconstrained environments, often referred to as untrimmed videos, are naturally unsegmented. Untrimmed videos are typically lengthy and may encompass multiple action instances, along with the moments preceding or following each action, as well as transitions between actions. In the …


What Does One Billion Dollars Look Like?: Visualizing Extreme Wealth, William Mahoney Luckman Feb 2024

Dissertations, Theses, and Capstone Projects

The word “billion” is a mathematical abstraction related to “big,” but it is difficult to understand the vast difference in value between one million and one billion; even harder to understand the vast difference in purchasing power between one billion dollars, and the average U.S. yearly income. Perhaps most difficult to conceive of is what that purchasing power and huge mass of capital translates to in terms of power. This project blends design, text, facts, and figures into an interactive narrative website that helps the user better understand their position in relation to extreme wealth: https://whatdoesonebilliondollarslooklike.website/

The site incorporates …


Optimization And Application Of Graph Neural Networks, Shuo Zhang Sep 2023

Dissertations, Theses, and Capstone Projects

Graph Neural Networks (GNNs) are widely recognized for their potential in learning from graph-structured data and solving complex problems. However, optimal performance and applicability of GNNs have been an open-ended challenge. This dissertation presents a series of substantial advances addressing this problem. First, we investigate attention-based GNNs, revealing a critical shortcoming: their ignorance of cardinality information that impacts their discriminative power. To rectify this, we propose Cardinality Preserved Attention (CPA) models that can be applied to any attention-based GNNs, which exhibit a marked improvement in performance. Next, we introduce the Directional Node Pair (DNP) descriptor and the Robust Molecular Graph …


Out-Of-Distribution Generalization Of Deep Learning To Illuminate Dark Protein Functional Space, Tian Cai Sep 2023

Dissertations, Theses, and Capstone Projects

Dark protein illumination is a fundamental challenge in drug discovery, where the majority of human proteins are understudied, i.e., they have a known protein sequence but no known small-molecule binder. This is a major roadblock to shifting the drug discovery paradigm from single-targeted approaches, which seek to identify a single target and design a drug to regulate it, to multi-targeted approaches from a Systems Pharmacology perspective. Diseases such as Alzheimer's and Opioid-Use-Disorder, which plague millions of patients, call for effective multi-targeted approaches involving dark proteins. Using limited protein data to predict dark protein properties requires deep learning systems with OOD generalization capacity. Out-of-Distribution (OOD) …


Evaluating Neural Networks As Cognitive Models For Learning Quasi-Regularities In Language, Xiaomeng Ma Jun 2023

Dissertations, Theses, and Capstone Projects

Many aspects of language can be categorized as quasi-regular: the relationship between the inputs and outputs is systematic but allows many exceptions. Common domains that contain quasi-regularity include morphological inflection and grapheme-phoneme mapping. How humans process quasi-regularity has been debated for decades. This thesis implemented modern neural network models, transformer models, on two tasks: English past tense inflection and Chinese character naming, to investigate how transformer models perform quasi-regularity tasks. This thesis focuses on investigating to what extent the models' performances can represent human behavior. The results show that the transformers' performance is very similar to human behavior in many …


Witness-Authenticated Key Exchange, Kelsey G. Melissaris Sep 2022

Dissertations, Theses, and Capstone Projects

In this dissertation we investigate Witness-Authenticated Key Exchange (WAKE), a key agreement protocol in which each party is authenticated through knowledge of a witness to an arbitrary NP statement. We provide both game-based and universally composable definitions. Thereby, this thesis presents solutions for the most flexible and general method of authentication for group key exchange, providing simple constructions from (succinct) signatures of knowledge (SOK) and a two round UC-secure protocol.

After a discussion of flaws in previous definitions for WAKE we supply a new and improved game-based definition along with the first definition for witness-authenticated key exchange between groups of …


Grammar Competition Explored In Two Case Studies: The Null Subject Stage In English-Speaking Children And The Variation Observed In Old English, Soumik Dey Sep 2022

Dissertations, Theses, and Capstone Projects

Grammar competition theory postulates that variation in a speaker is the result of different grammars competing against each other. This study performs an analysis of two case studies of empirical observations attributed to possible grammar competition — subject drop in English-speaking children and variation observed in Old English.

Children in an English-speaking environment drop subjects early on during acquisition. Orfitelli and Hyams (2012) find that young English-speaking children mistakenly interpret imperative null subject utterances as declaratives. They suggest that this misinterpretation can be attributed to performance factors, which leads to grammar competition and subsequently subject drop in English children. We …


Influence Level Prediction On Social Media Through Multi-Task And Sociolinguistic User Characteristics Modeling, Denys Katerenchuk Sep 2022

Dissertations, Theses, and Capstone Projects

Prediction of a user’s influence level on social networks has attracted a lot of attention as human interactions move online. Influential users have the ability to influence others’ behavior to achieve their own agenda. As a result, predicting users’ level of influence online can help to understand social networks, forecast trends, prevent misinformation, etc. The research on user influence in social networks has attracted much attention across multiple disciplines, from social sciences to mathematics, yet it is still not well understood. One of the difficulties is that the definition of influence is specific to a particular problem or a domain, …


The Interaction Of Different Primary Producers And Physical And Chemical Dynamics Of An Urban Shallow Lake, Majid Sahin Sep 2022

Dissertations, Theses, and Capstone Projects

An artificial urban shallow lake, Prospect Park Lake (PPL), is situated on a terminal moraine in Brooklyn, New York, and supplied with municipal water treated with ortho-phosphates. The constant input of the phosphate nutrient is the primary source of eutrophication in the lake. The numerous pools along the water course house various aquatic phototrophs, which influence the water quality and the state of the system, driving conditions toward those that favor the survival of their species. In the first half of the dissertation, the focus of the project is on analyzing how the different primary producers in different regions of PPL affect …


On The Cryptographic Deniability Of The Signal Protocol, Nihal Vatandas Sep 2022

Dissertations, Theses, and Capstone Projects

Offline deniability is the ability to a posteriori deny having participated in a particular communication session. This property has been widely assumed for the Signal messaging application, yet no formal proof has appeared in the literature. In this work, we present the first formal study of the offline deniability of the Signal protocol. Our analysis shows that building a deniability proof for Signal is non-trivial and requires strong assumptions on the underlying mathematical groups where the protocol is run.

To do so, we study various implicitly authenticated key exchange protocols, including MQV, HMQV, and 3DH/X3DH, the latter being the core …


Data-Centric Machine Learning For Speech And Audio, Ali Raza Syed Sep 2022

Dissertations, Theses, and Capstone Projects

There is growing recognition of the importance of data-centric methods for building machine learning systems. Data-centric methods assume a fixed model and iterate over the data to improve system performance. This is in contrast to traditional model-centric approaches, which assume a fixed dataset and iterate over models for the same ends. Data-centric machine learning is driven by the observation that, beyond the size of the training data, model performance depends on factors such as the quality of the annotations, and whether the data are representative of conditions in which models will be deployed. This is particularly of interest in the …


Finite Gaussian Neurons: Defending Against Adversarial Attacks By Making Neural Networks Say "I Don’t Know", Felix Grezes Sep 2022

Dissertations, Theses, and Capstone Projects

In this work, I introduce the Finite Gaussian Neuron (FGN), a novel neuron architecture for artificial neural networks aimed at protecting against adversarial attacks.
Since 2014, artificial neural networks have been known to be vulnerable to adversarial attacks, which can fool the network into producing wrong or nonsensical outputs by making humanly imperceptible alterations to inputs. While defenses against adversarial attacks have been proposed, they usually involve retraining a new neural network from scratch, a costly task.

My work aims to:
- easily convert existing models to Finite Gaussian Neuron architecture,
- while preserving the existing model's behavior on real …


An Analysis Of The Friendship Paradox And Derived Sampling Methods, Yitzchak Novick Sep 2022

Dissertations, Theses, and Capstone Projects

The friendship paradox (FP) is the famous sampling-bias phenomenon that leads to the seemingly paradoxical truth that, on average, people’s friends have more friends than they do. Among the many far-reaching research findings the FP inspired is a sampling method that samples neighbors of vertices in a graph in order to acquire random vertices that are of higher expected degree than average.

Our research examines the friendship paradox on a local level. We seek to quantify the impact of the FP on an individual vertex by defining the vertex’s “friendship index”, a measure of the extent to which the phenomenon …
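The sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's method or its "friendship index"; the toy star graph and the function names `mean_degree` and `mean_neighbor_degree` are assumptions made purely for illustration. The sketch compares the average degree of a random vertex with the average degree of a random neighbor of a random vertex.

```python
from statistics import mean


def mean_degree(adj):
    """Average degree over all vertices of the graph."""
    return mean(len(neighbors) for neighbors in adj.values())


def mean_neighbor_degree(adj):
    """Expected degree of a vertex found by choosing a uniformly random
    vertex and then one of its neighbors uniformly at random."""
    return mean(
        mean(len(adj[n]) for n in neighbors)
        for neighbors in adj.values()
        if neighbors  # isolated vertices have no neighbor to sample
    )


# Hypothetical toy graph: a star with hub 0 and leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

print(mean_degree(star))           # average degree of a random vertex
print(mean_neighbor_degree(star))  # average degree of a random neighbor
```

On the star graph, most vertices are leaves whose only neighbor is the high-degree hub, so sampling a neighbor yields a higher expected degree than sampling a vertex directly.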


Towards Explaining Variation In Entrainment, Andreas Weise Sep 2022

Dissertations, Theses, and Capstone Projects

Entrainment refers to the tendency of human speakers to adapt to their interlocutors to become more similar to them. This affects various dimensions and occurs in many contexts, allowing for rich applications in human-computer interaction. However, it is not exhibited by every speaker in every conversation but varies widely across features, speakers, and contexts, hindering broad application. This variation, whose guiding principles are poorly understood even after decades of entrainment research, is the subject of this thesis. We begin with a comprehensive literature review that serves as the foundation of our own work and provides a reference to guide future …


Coded Distributed Function Computation, Pedro J. Soto Jun 2022

Dissertations, Theses, and Capstone Projects

A ubiquitous problem in computer science research is the optimization of computation on large data sets. Such computations are usually too large to be performed on one machine and therefore the task needs to be distributed amongst a network of machines. However, a common problem within distributed computing is the mitigation of delays caused by faulty machines. This can be performed by the use of coding theory to optimize the amount of redundancy needed to handle such faults. This problem differs from classical coding theory since it is concerned with the dynamic coded computation on data rather than just statically …
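As a rough illustration of how coding theory can add redundancy against slow or faulty machines, the sketch below uses a textbook-style parity scheme for matrix-vector multiplication. It is not the scheme developed in the dissertation; the worker names `w1`, `w2`, `w3` and the helper functions are hypothetical. The matrix is split into two halves plus one parity block, so the results of any two of the three workers are enough to recover the full product.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two halves and add a parity block A1 + A2.
    Assumes A has an even number of rows."""
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {"w1": A1, "w2": A2, "w3": parity}


def decode(results):
    """Recover A @ x from the outputs of any two of the three workers."""
    if "w1" in results and "w2" in results:
        y1, y2 = results["w1"], results["w2"]
    elif "w1" in results:
        y1 = results["w1"]
        y2 = [p - a for p, a in zip(results["w3"], y1)]  # (A1+A2)x - A1x
    else:
        y2 = results["w2"]
        y1 = [p - b for p, b in zip(results["w3"], y2)]  # (A1+A2)x - A2x
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
tasks = encode(A)

# Simulate worker w2 never responding: only w1 and w3 return results.
partial = {name: matvec(tasks[name], x) for name in ("w1", "w3")}
print(decode(partial) == matvec(A, x))
```

The cost of this redundancy is one extra worker; the benefit is that the computation finishes without waiting for the slowest of the three.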


Identifying, Evaluating And Applying Importance Maps For Speech, Viet Anh Trinh Feb 2022

Dissertations, Theses, and Capstone Projects

Like many machine learning systems, speech models often perform well when employed on data in the same domain as their training data. However, when the inference is on out-of-domain data, performance suffers. With a fast-growing number of applications of speech models in healthcare, education, automotive, automation, etc., it is essential to ensure that speech models can generalize to out-of-domain data, especially to noisy environments in real-world scenarios. In contrast, human listeners are quite robust to noisy environments. Thus, a thorough understanding of the differences between human listeners and speech models is urgently required to enhance speech model performance in noise. …


Usability Of Health-Related Websites By Filipino-American Adults And Nursing Informatics Experts, Kathleen Begonia Feb 2022

Dissertations, Theses, and Capstone Projects

Filipino-Americans are an understudied minority group with high prevalence and mortality from chronic conditions, such as cardiovascular disease and diabetes. Facing barriers to care and lack of culturally appropriate health resources, they frequently use the internet to obtain health information. It is unknown whether they perceive health-related websites to be useful or easy to use because there are no published usability studies involving this population. Using the Technology Acceptance Model as a theoretical framework, this study investigated the difference between website design ratings by experts and the perceptions of Filipino-American users to determine if usability guidelines influenced the perceived ease …


Representation Learning For Chemical Activity Predictions, Mohamed S. Ayed Feb 2022

Dissertations, Theses, and Capstone Projects

Computational prediction of a phenotypic response upon the chemical perturbation on a biological system plays an important role in drug discovery and many other applications. Chemical fingerprints derived from chemical structures are a widely used feature to build machine learning models. However, the fingerprints ignore the biological context, thus, they suffer from several problems such as the activity cliff and curse of dimensionality. Fundamentally, the chemical modulation of biological activities is a multi-scale process. It is the genome-wide chemical-target interactions that modulate chemical phenotypic responses. Thus, the genome-scale chemical-target interaction profile will more directly correlate with in vitro and in …


Solving Multiple Inference In Graphical Models, Cong Chen Sep 2021

Dissertations, Theses, and Capstone Projects

For inference problems in graphical models, much effort has been directed at algorithms for obtaining one single optimal prediction. In practice, the data is often noisy or incomplete, which makes one single optimal solution unreliable. To address this problem, multiple inference is proposed to find several best solutions (M-Best), where multiple hypotheses are preferred for advanced reasoning. Oracle accuracy is used as an evaluation criterion, with the expectation that at least one of the solutions has high accuracy with respect to the ground truth. It has been shown that it is beneficial for the top solutions to be diverse. Approaches for solving diverse multiple inference are proposed …


Molecular Dynamics Simulations Of Self-Assemblies In Nature And Nanotechnology, Phu Khanh Tang Sep 2021

Dissertations, Theses, and Capstone Projects

Nature usually divides complex systems into smaller building blocks specializing in a few tasks since one entity cannot achieve everything. Therefore, self-assembly is a robust tool exploited by Nature to build hierarchical systems that accomplish unique functions. The cell membrane distinguishes itself as an example of Nature’s self-assembly, defining and protecting the cell. By mimicking Nature’s designs using synthetically designed self-assemblies, researchers with advanced nanotechnological comprehension can manipulate these synthetic self-assemblies to improve many aspects of modern medicine and materials science. Understanding the competing underlying molecular interactions in self-assembly is always of interest to the academic scientific community and industry. …


Piecewise Linear Manifold Clustering, Artyom Diky Sep 2021

Dissertations, Theses, and Capstone Projects

This work studies the application of topological analysis to non-linear manifold clustering. A novel method that exploits the data clustering structure allows the generation of a topological representation of the point dataset. An analysis of topological construction under different simulated conditions is performed to explore the capabilities and limitations of the method, and demonstrates statistically significant improvements in performance. Furthermore, we introduce a new information-theoretical validation measure for clustering that exploits geometrical properties of clusters to estimate clustering compressibility, for evaluation of the clustering goodness-of-fit without any prior information about true class assignments. We show how the new validation measure, when …


Novel Hybrid Resampling Algorithms For Parallel/Distributed Particle Filters, Xudong Zhang Sep 2021

Dissertations, Theses, and Capstone Projects

Particle filters, also known as sequential Monte Carlo (SMC) methods, use Bayesian inference and stochastic sampling techniques to estimate the states of dynamic systems from given observations. Parallel/distributed particle filters were introduced to improve the performance of sequential particle filters by using multiple processing units (PUs). The classical resampling algorithm used in parallel/distributed particle filters is a centralized scheme, called centralized resampling, which needs a central unit (CU) to serve as a hub for data transfers. As a result, the centralized resampling procedures produce extra communication costs, which lower the speedup factors in parallel computing. Even though some …
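For readers unfamiliar with the resampling step mentioned in this abstract, the following is a minimal sketch of systematic resampling, a standard single-machine technique. It is not the hybrid or centralized algorithm studied in the dissertation; the function name `systematic_resample` and the example weights are assumptions for illustration. Given particle weights, it returns the indices of the particles to keep, so heavily weighted particles are duplicated and lightly weighted ones tend to be dropped.

```python
import random


def systematic_resample(weights, u=None):
    """Return particle indices chosen by systematic resampling.

    weights: non-negative particle weights (need not be normalized).
    u: a single offset in [0, 1); drawn at random when omitted.
    """
    n = len(weights)
    total = sum(weights)
    if u is None:
        u = random.random()

    indices = []
    j = 0
    cumulative = weights[0] / total  # running sum of normalized weights
    for i in range(n):
        position = (u + i) / n  # evenly spaced pointers sharing one offset
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j] / total
        indices.append(j)
    return indices


# Hypothetical weights: particle 2 carries most of the probability mass.
print(systematic_resample([0.1, 0.1, 0.7, 0.1], u=0.5))
```

In a centralized scheme of the kind the abstract describes, a step like this would run on the central unit after it gathers the weights from all processing units, which is where the extra communication cost arises.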


Logics Of Resource And Justification, Hirohiko Kushida Sep 2021

Dissertations, Theses, and Capstone Projects

It is a well-known result by Gödel in 1933 that the Intuitionistic Logic can be embedded into a system which is essentially equivalent to the modal logic S4. This can be considered to be an attempt to provide a provability semantics to the Intuitionistic Logic. This work had caused some problems: the exact arithmetical meaning of the Intuitionistic Logic and S4, the exact axiomatization of formal provability in formal arithmetic and the standard model of arithmetic. These days the arithmetical interpretation has been extended and generalized to epistemological interpretation for various modal logics, which resulted in various systems of …


Adversarial Training For Skill Learning In A Mobile Robot, Todd W. Flyr Sep 2021

Dissertations, Theses, and Capstone Projects

Machine learning in mobile robotics is sometimes hampered by the difficulties associated with the creation of a large corpus of labeled data that most neural network based learning algorithms demand. In recent years, advances in the field of machine learning have been facilitated via the creation of large collaboratively-created labeled training datasets that researchers can use as the basis for experiments to validate and improve their candidate neural network architectures. For the field of robotics, however, tasks are so disparate and the physical devices so varied that in most cases the creation of collaborative benchmark datasets is impractical. Obtaining data …


Mechanism Design And Modeling To Analyze Complex Social Systems For Public Policy, Haripriya Chakraborty Jun 2021

Dissertations, Theses, and Capstone Projects

The study of complex systems is an important area of research. Many scenarios require the ability to simulate large multi-agent systems with minimal artificial assumptions. We are currently living in a world where the adoption of artificial intelligence (AI) in various areas is increasing rapidly. This, in turn, has serious consequences from a computational and policy perspective. The focus needs to be on designing systems that are not only computationally elegant and efficient but also ethical. The goal of this thesis is to examine some of the ways AI can be used to simulate complex social systems. In addition, we …


Towards Automated Software Evolution Of Data-Intensive Applications, Yiming Tang Jun 2021

Dissertations, Theses, and Capstone Projects

Recent years have witnessed an explosion of work on Big Data. Data-intensive applications analyze and produce large volumes of data, typically terabytes or petabytes in size. Many techniques for facilitating data processing are integrated into data-intensive applications. An API is a software interface that allows two applications to communicate with each other. Streaming APIs, which can support parallel processing, are widely used in today's object-oriented programming. In this dissertation, an approach that automatically suggests whether stream code should run in parallel or sequentially is proposed. However, using streams efficiently and properly needs many subtle considerations. The use and misuse patterns for stream …


Learn Biologically Meaningful Representation With Transfer Learning, Di He Jun 2021

Dissertations, Theses, and Capstone Projects

Machine learning has made significant contributions to bioinformatics and computational biology. In particular, supervised learning approaches have been widely used in solving problems such as biomarker identification, drug response prediction, and so on. However, because of the limited availability of comprehensively labeled and clean data, constructing predictive models in supervised settings is not always desirable or possible, especially when using data-hungry, red-hot learning paradigms such as deep learning methods. Hence, there is an urgent need to develop new approaches that could leverage more readily available unlabeled data in driving successful machine learning applications in this area.

In my dissertation, …