Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 13 of 13

Full-Text Articles in Computer Engineering

Thunderrw: An In-Memory Graph Random Walk Engine, Shixuan Sun, Yuhang Chen, Shengliang Lu, Bingsheng He, Yuchen Li Aug 2021

Thunderrw: An In-Memory Graph Random Walk Engine, Shixuan Sun, Yuhang Chen, Shengliang Lu, Bingsheng He, Yuchen Li

Research Collection School Of Computing and Information Systems

As random walk is a powerful tool in many graph processing, mining and learning applications, this paper proposes an efficient inmemory random walk engine named ThunderRW. Compared with existing parallel systems on improving the performance of a single graph operation, ThunderRW supports massive parallel random walks. The core design of ThunderRW is motivated by our profiling results: common RW algorithms have as high as 73.1% CPU pipeline slots stalled due to irregular memory access, which suffers significantly more memory stalls than the conventional graph workloads such as BFS and SSSP. To improve the memory efficiency, we first design a generic …


Context-Aware Outstanding Fact Mining From Knowledge Graphs, Yueji Yang, Yuchen Li, Panagiotis Karras, Anthony Tung Aug 2021

Context-Aware Outstanding Fact Mining From Knowledge Graphs, Yueji Yang, Yuchen Li, Panagiotis Karras, Anthony Tung

Research Collection School Of Computing and Information Systems

An Outstanding Fact (OF) is an attribute that makes a target entity stand out from its peers. The mining of OFs has important applications, especially in Computational Journalism, such as news promotion, fact-checking, and news story finding. However, existing approaches to OF mining: (i) disregard the context in which the target entity appears, hence may report facts irrelevant to that context; and (ii) require relational data, which are often unavailable or incomplete in many application domains. In this paper, we introduce the novel problem of mining Contextaware Outstanding Facts (COFs) for a target entity under a given context specified by …


Minimum Coresets For Maxima Representation Of Multidimensional Data, Yanhao Wang, Michael Mathioudakis, Yuchen Li, Kian-Lee Tan Jun 2021

Minimum Coresets For Maxima Representation Of Multidimensional Data, Yanhao Wang, Michael Mathioudakis, Yuchen Li, Kian-Lee Tan

Research Collection School Of Computing and Information Systems

Coresets are succinct summaries of large datasets such that, for a given problem, the solution obtained from a coreset is provably competitive with the solution obtained from the full dataset. As such, coreset-based data summarization techniques have been successfully applied to various problems, e.g., geometric optimization, clustering, and approximate query processing, for scaling them up to massive data. In this paper, we study coresets for the maxima representation of multidimensional data: Given a set �� of points in R �� , where �� is a small constant, and an error parameter �� ∈ (0, 1), a subset �� ⊆ �� …


On M-Impact Regions And Standing Top-K Influence Problems, Bo Tang, Kyriakos Mouratidis, Mingji Han Jun 2021

On M-Impact Regions And Standing Top-K Influence Problems, Bo Tang, Kyriakos Mouratidis, Mingji Han

Research Collection School Of Computing and Information Systems

In this paper, we study the ��-impact region problem (mIR). In a context where users look for available products with top-�� queries, mIR identifies the part of the product space that attracts the most user attention. Specifically, mIR determines the kind of attribute values that lead a (new or existing) product to the top-�� result for at least a fraction of the user population. mIR has several applications, ranging from effective marketing to product improvement. Importantly, it also leads to (exact and efficient) solutions for standing top-�� impact problems, which were previously solved heuristically only, or whose current solutions face …


Hierarchical Reinforcement Learning: A Comprehensive Survey, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek Jun 2021

Hierarchical Reinforcement Learning: A Comprehensive Survey, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek

Research Collection School Of Computing and Information Systems

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate the future …


Efficient Retrieval Of Matrix Factorization-Based Top-K Recommendations: A Survey Of Recent Approaches, Duy Dung Le, Hady W. Lauw Apr 2021

Efficient Retrieval Of Matrix Factorization-Based Top-K Recommendations: A Survey Of Recent Approaches, Duy Dung Le, Hady W. Lauw

Research Collection School Of Computing and Information Systems

Top-k recommendation seeks to deliver a personalized list of k items to each individual user. An established methodology in the literature based on matrix factorization (MF), which usually represents users and items as vectors in low-dimensional space, is an effective approach to recommender systems, thanks to its superior performance in terms of recommendation quality and scalability. A typical matrix factorization recommender system has two main phases: preference elicitation and recommendation retrieval. The former analyzes user-generated data to learn user preferences and item characteristics in the form of latent feature vectors, whereas the latter ranks the candidate items based on the …


Dbl: Efficient Reachability Queries On Dynamic Graphs, Qiuyi Lyu, Yuchen Li, Bingsheng He, Bin Gong Apr 2021

Dbl: Efficient Reachability Queries On Dynamic Graphs, Qiuyi Lyu, Yuchen Li, Bingsheng He, Bin Gong

Research Collection School Of Computing and Information Systems

Reachability query is a fundamental problem on graphs, which has been extensively studied in academia and industry. Since graphs are subject to frequent updates in many applications, it is essential to support efficient graph updates while offering good performance in reachability queries. Existing solutions compress the original graph with the Directed Acyclic Graph (DAG) and propose efficient query processing and index update techniques. However, they focus on optimizing the scenarios where the Strong Connected Components (SCCs) remain unchanged and have overlooked the prohibitively high cost of the DAG maintenance when SCCs are updated. In this paper, we propose DBL, an …


Boundary Precedence Image Inpainting Method Based On Self-Organizing Maps, Haibo Pen, Quan Wang, Zhaoxia Wang Apr 2021

Boundary Precedence Image Inpainting Method Based On Self-Organizing Maps, Haibo Pen, Quan Wang, Zhaoxia Wang

Research Collection School Of Computing and Information Systems

In addition to text data analysis, image analysis is an area that has increasingly gained importance in recent years because more and more image data have spread throughout the internet and real life. As an important segment of image analysis techniques, image restoration has been attracting a lot of researchers’ attention. As one of AI methodologies, Self-organizing Maps (SOMs) have been applied to a great number of useful applications. However, it has rarely been applied to the domain of image restoration. In this paper, we propose a novel image restoration method by leveraging the capability of SOMs, and we name …


Newslink: Empowering Intuitive News Search With Knowledge Graphs, Yueji Yang, Yuchen Li, Anthony Tung Apr 2021

Newslink: Empowering Intuitive News Search With Knowledge Graphs, Yueji Yang, Yuchen Li, Anthony Tung

Research Collection School Of Computing and Information Systems

News search tools help end users to identify relevant news stories. However, existing search approaches often carry out in a "black-box" process. There is little intuition that helps users understand how the results are related to the query. In this paper, we propose a novel news search framework, called NEWSLINK, to empower intuitive news search by using relationship paths discovered from open Knowledge Graphs (KGs). Specifically, NEWSLINK embeds both a query and news documents to subgraphs, called subgraph embeddings, in the KG. Their embeddings' overlap induces relationship paths between the involving entities. Two major advantages are obtained by incorporating subgraph …


Dram Failure Prediction In Aiops: Empirical Evaluation, Challenges And Opportunities, Zhiyue Wu, Hongzuo Xu, Guansong Pang, Fengyuan Yu, Yijie Wang, Songlei Jian, Yongjun Wang Apr 2021

Dram Failure Prediction In Aiops: Empirical Evaluation, Challenges And Opportunities, Zhiyue Wu, Hongzuo Xu, Guansong Pang, Fengyuan Yu, Yijie Wang, Songlei Jian, Yongjun Wang

Research Collection School Of Computing and Information Systems

DRAM failure prediction is a vital task in AIOps, which is crucial to maintain the reliability and sustainable service of large-scale data centers. However, limited work has been done on DRAM failure prediction mainly due to the lack of public available datasets. This paper presents a comprehensive empirical evaluation of diverse machine learning techniques for DRAM failure prediction using a large-scale multisource dataset, including more than three millions of records of kernel, address, and mcelog data, provided by Alibaba Cloud through PAKDD 2021 competition. Particularly, we first formulate the problem as a multiclass classification task and exhaustively evaluate seven popular/stateof-the-art …


Towards Efficient Motif-Based Graph Partitioning: An Adaptive Sampling Approach, Shixun Huang, Yuchen Li, Zhifeng Bao, Zhao Li Apr 2021

Towards Efficient Motif-Based Graph Partitioning: An Adaptive Sampling Approach, Shixun Huang, Yuchen Li, Zhifeng Bao, Zhao Li

Research Collection School Of Computing and Information Systems

In this paper, we study the problem of efficient motif-based graph partitioning (MGP). We observe that existing methods require to enumerate all motif instances to compute the exact edge weights for partitioning. However, the enumeration is prohibitively expensive against large graphs. We thus propose a sampling-based MGP (SMGP) framework that employs an unbiased sampling mechanism to efficiently estimate the edge weights while trying to preserve the partitioning quality. To further improve the effectiveness, we propose a novel adaptive sampling framework called SMGP+. SMGP+ iteratively partitions the input graph based on up-to-date estimated edge weights, and adaptively adjusts the sampling distribution …


Dycuckoo: Dynamic Hash Tables On Gpus, Yuchen Li, Qiwei Zhu, Zheng Lyu, Zhongdong Huang, Jianling Sun Apr 2021

Dycuckoo: Dynamic Hash Tables On Gpus, Yuchen Li, Qiwei Zhu, Zheng Lyu, Zhongdong Huang, Jianling Sun

Research Collection School Of Computing and Information Systems

The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large GPU memory to maximize the insertion efficiency. In many cases, data stored in hash tables get updated dynamically, and existing approaches use unnecessarily large memory resources. One naïve solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach renders significant overheads for rehashing. In this paper, we propose a novel dynamic cuckoo hash table …


Learning Adl Daily Routines With Spatiotemporal Neural Networks, Shan Gao, Ah-Hwee Tan, Rossi Setchi Jan 2021

Learning Adl Daily Routines With Spatiotemporal Neural Networks, Shan Gao, Ah-Hwee Tan, Rossi Setchi

Research Collection School Of Computing and Information Systems

The activities of daily living (ADLs) refer to the activities performed by individuals on a daily basis and are the indicators of a person’s habits, lifestyle, and wellbeing. Learning an individual’s ADL daily routines has significant value in the healthcare domain. Specifically, ADL recognition and inter-ADL pattern learning problems have been studied extensively in the past couple of decades. However, discovering the patterns performed in a day and clustering them into ADL daily routines has been a relatively unexplored research area. In this paper, a self-organizing neural network model, called the Spatiotemporal ADL Adaptive Resonance Theory (STADLART), is proposed for …