Theory and Algorithms | Open Access Articles

Foundations Of Node Representation Learning, Sudhanshu Chanpuriya Nov 2023

Foundations Of Node Representation Learning, Sudhanshu Chanpuriya

Doctoral Dissertations

Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred development of novel neural network-inspired methods for learning node representations which have largely surpassed classical 'spectral' embeddings in performance. Yet little work asks the central questions of this thesis: Why do these novel deep methods outperform their classical predecessors, and what are their limitations? We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show …

Go to article

A Structure-Aware Generative Adversarial Network For Bilingual Lexicon Induction, Bocheng Han, Qian Tao, Lusi Li, Zhihao Xiong Jan 2023

A Structure-Aware Generative Adversarial Network For Bilingual Lexicon Induction, Bocheng Han, Qian Tao, Lusi Li, Zhihao Xiong

Computer Science Faculty Publications

Bilingual lexicon induction (BLI) is the task of inducing word translations with a learned mapping function that aligns monolingual word embedding spaces in two different languages. However, most previous methods treat word embeddings as isolated entities and fail to jointly consider both the intra-space and inter-space topological relations between words. This limitation makes it challenging to align words from embedding spaces with distinct topological structures, especially when the assumption of isomorphism may not hold. To this end, we propose a novel approach called the Structure-Aware Generative Adversarial Network (SA-GAN) model to explicitly capture multiple topological structure information to achieve accurate …

Go to article

Improving Adjacency List Storage Methods For Polypeptide Similarity Analysis, Arianna Swensen Dec 2022

Improving Adjacency List Storage Methods For Polypeptide Similarity Analysis, Arianna Swensen

Honors Theses

Protein design is a complex biomolecular and computational problem. Working on increasingly large protein folding problems requires an improvement in current analysis methods available. This work first discusses various methods of protein design, including de novo protein design, which is the primary focus of this thesis. Then, a new approach utilizing a B+ tree to effectively store and query a graph of keys and vertices is proposed in order to store the number of times two polypeptides are considered to be similar. This approach is found to have a reduction in time complexity from current mapping methods and thus provides …

Go to article

Application Of Probabilistic Ranking Systems On Women’S Junior Division Beach Volleyball, Cameron Stewart, Michael Mazel, Bivin Sadler Sep 2022

Application Of Probabilistic Ranking Systems On Women’S Junior Division Beach Volleyball, Cameron Stewart, Michael Mazel, Bivin Sadler

SMU Data Science Review

Women’s beach volleyball is one of the fastest growing collegiate sports today. The increase in popularity has come with an increase in valuable scholarship opportunities across the country. With thousands of athletes to sort through, college scouts depend on websites that aggregate tournament results and rank players nationally. This project partnered with the company Volleyball Life, who is the current market leader in the ranking space of junior beach volleyball players. Utilizing the tournament information provided by Volleyball Life, this study explored replacements to the current ranking systems, which are designed to aggregate player points from recent tournament placements. Three …

Go to article

Unsupervised Structural Graph Node Representation Learning, Mikel Joaristi Dec 2020

Unsupervised Structural Graph Node Representation Learning, Mikel Joaristi

Boise State University Theses and Dissertations

Unsupervised Graph Representation Learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as social sciences, biology or communication networks. These methods are particularly interesting because they facilitate the direct use of standard Machine Learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode, methods preserving the nodes connectivity information, and methods preserving nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, …

Go to article

Towards Distributed Node Similarity Search On Graphs, Tianming Zhang, Yunjun Gao, Baihua Zheng, Lu Chen, Shiting Wen, Wei Guo Jun 2020

Towards Distributed Node Similarity Search On Graphs, Tianming Zhang, Yunjun Gao, Baihua Zheng, Lu Chen, Shiting Wen, Wei Guo

Research Collection School Of Computing and Information Systems

Node similarity search on graphs has wide applications in recommendation, link prediction, to name just a few. However, existing studies are insufficient due to two reasons: (i) the scale of the real-world graph is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficiently distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of preprocessing stage and query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly …

Go to article

A Mathematical Analysis Of The Game Of Santorini, Carson Clyde Geissler Jan 2020

A Mathematical Analysis Of The Game Of Santorini, Carson Clyde Geissler

Senior Independent Study Theses

Santorini is a two player combinatorial board game. Santorini bears resemblance to the graph theory game of Geography, a game of moving and deleting vertices on a graph. We explore Santorini with game theory, complexity theory, and artiﬁcial intelligence. We present David Lichtenstein’s proof that Geography is PSPACE-hard and adapt the proof for generalized forms of Santorini. Last, we discuss the development of an AI built for a software implementation of Santorini and present a number of improvements to that AI.

Go to article

Efficient Distributed Reachability Querying Of Massive Temporal Graphs, Tianming Zhang, Yunjun Gao, Chen Lu, Wei Guo, Shiliang Pu, Baihua Zheng, Christian S. Jensen Sep 2019

Efficient Distributed Reachability Querying Of Massive Temporal Graphs, Tianming Zhang, Yunjun Gao, Chen Lu, Wei Guo, Shiliang Pu, Baihua Zheng, Christian S. Jensen

Research Collection School Of Computing and Information Systems

Reachability computation is a fundamental graph functionality with a wide range of applications. In spite of this, little work has as yet been done on efficient reachability queries over temporal graphs, which are used extensively to model time-varying networks, such as communication networks, social networks, and transportation schedule networks. Moreover, we are faced with increasingly large real-world temporal networks that may be distributed across multiple data centers. This state of affairs motivates the paper's study of efficient reachability queries on distributed temporal graphs. We propose an efficient index, called Temporal Vertex Labeling (TVL), which is a labeling scheme for distributed …

Go to article

Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras Apr 2019

Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras

Research Collection School Of Computing and Information Systems

An information dissemination campaign is often multifaceted, involving several facets or pieces of information disseminating from different sources. The question then arises, how should we assign such pieces to eligible sources so as to achieve the best viral dissemination results? Past research has studied the problem of Influence Maximization (IM), which is to select a set of k promoters that maximizes the expected reach of a message over a network. However, in this classical IM problem, each promoter spreads out the same unitary piece of information. In this paper, we propose the Optimal Influential Pieces Assignment (OIPA) problem, which is …

Go to article

@Yourlocation: A Spatial Analysis Of Geotagged Tweets In The Us, Ocean Mckinney Jan 2019

@Yourlocation: A Spatial Analysis Of Geotagged Tweets In The Us, Ocean Mckinney

CMC Senior Theses

This project examines the spatial network properties observable from geo-located tweet data. Conventional exploration examines characteristics of a variety of network attributes, but few employ spatial edge correlations in their analysis. Recent studies have demonstrated the improvements that these correlations contribute to drawing conclusions about network structure. This thesis expands upon social network research utilizing spatial edge correlations and presents processing and formatting techniques for JSON (JavaScript Object Notation) data.

Go to article

Variance Of Clusterings On Graphs, Thomas Vlado Mulc Apr 2016

Variance Of Clusterings On Graphs, Thomas Vlado Mulc

Mathematical Sciences Technical Reports (MSTR)

Graphs that represent data often have structures or characteristics that can represent some relationships in the data. One of these structures is clusters or community structures. Most clustering algorithms for graphs are deterministic, which means they will output the same clustering each time. We investigated a few stochastic algorithms, and look into the consistency of their clusterings.

Go to article

A Dynamic Programming Algorithm For Finding The Optimal Placement Of A Secondary Structure Topology In Cryo-Em Data, Abhishek Biswas, Desh Ranjan, Mohammad Zubair, Jing He Jan 2015

A Dynamic Programming Algorithm For Finding The Optimal Placement Of A Secondary Structure Topology In Cryo-Em Data, Abhishek Biswas, Desh Ranjan, Mohammad Zubair, Jing He

Computer Science Faculty Publications

The determination of secondary structure topology is a critical step in deriving the atomic structures from the protein density maps obtained from electron cryomicroscopy technique. This step often relies on matching the secondary structure traces detected from the protein density map to the secondary structure sequence segments predicted from the amino acid sequence. Due to inaccuracies in both sources of information, a pool of possible secondary structure positions needs to be sampled. One way to approach the problem is to first derive a small number of possible topologies using existing matching algorithms, and then find the optimal placement for each …

Go to article

Theory and Algorithms Commons^™

Full-Text Articles in Theory and Algorithms

Foundations Of Node Representation Learning, Sudhanshu Chanpuriya

Doctoral Dissertations

A Structure-Aware Generative Adversarial Network For Bilingual Lexicon Induction, Bocheng Han, Qian Tao, Lusi Li, Zhihao Xiong

Computer Science Faculty Publications

Improving Adjacency List Storage Methods For Polypeptide Similarity Analysis, Arianna Swensen

Honors Theses

Application Of Probabilistic Ranking Systems On Women’S Junior Division Beach Volleyball, Cameron Stewart, Michael Mazel, Bivin Sadler

SMU Data Science Review

Unsupervised Structural Graph Node Representation Learning, Mikel Joaristi

Boise State University Theses and Dissertations

Towards Distributed Node Similarity Search On Graphs, Tianming Zhang, Yunjun Gao, Baihua Zheng, Lu Chen, Shiting Wen, Wei Guo

Research Collection School Of Computing and Information Systems

A Mathematical Analysis Of The Game Of Santorini, Carson Clyde Geissler

Senior Independent Study Theses

Efficient Distributed Reachability Querying Of Massive Temporal Graphs, Tianming Zhang, Yunjun Gao, Chen Lu, Wei Guo, Shiliang Pu, Baihua Zheng, Christian S. Jensen

Research Collection School Of Computing and Information Systems

Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras

Research Collection School Of Computing and Information Systems

@Yourlocation: A Spatial Analysis Of Geotagged Tweets In The Us, Ocean Mckinney

CMC Senior Theses

Variance Of Clusterings On Graphs, Thomas Vlado Mulc

Mathematical Sciences Technical Reports (MSTR)

A Dynamic Programming Algorithm For Finding The Optimal Placement Of A Secondary Structure Topology In Cryo-Em Data, Abhishek Biswas, Desh Ranjan, Mohammad Zubair, Jing He

Computer Science Faculty Publications