Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 268

Full-Text Articles in Databases and Information Systems

Immersive Japanese Language Learning Web Application Using Spaced Repetition, Active Recall, And An Artificial Intelligent Conversational Chat Agent Both In Voice And In Text, Marc Butler Apr 2024

MS in Computer Science Project Reports

In the last two decades various human language learning applications, spaced repetition software, online dictionaries, and artificial intelligent chat agents have been developed. However, there is no solution to cohesively combine these technologies into a comprehensive language learning application including skills such as speaking, typing, listening, and reading. Our contribution is to provide an immersive language learning web application to the end user which combines spaced repetition, a study technique used to review information at systematic intervals, and active recall, the process of purposely retrieving information from memory during a review session, with an artificial intelligent conversational chat agent both …
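The abstract describes spaced repetition as reviewing information at systematic intervals and active recall as retrieving it from memory during a review. A minimal sketch of how the two can fit together is shown below. The interval-doubling rule, the `Card` class, and the `review` and `due_cards` helpers are illustrative assumptions, not the scheduling method used in the project.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    """A hypothetical flashcard; field names are assumptions for illustration."""
    prompt: str
    answer: str
    interval_days: int = 1   # gap between the last review and the next one
    due: date = date.min     # date.min means the card is due immediately


def review(card: Card, recalled: bool, today: date) -> Card:
    """Update the schedule from one active-recall attempt.

    Assumed rule: double the interval after a successful recall,
    reset it to one day after a failed recall.
    """
    if recalled:
        card.interval_days *= 2
    else:
        card.interval_days = 1
    card.due = today + timedelta(days=card.interval_days)
    return card


def due_cards(cards: list[Card], today: date) -> list[Card]:
    """Return the cards whose next review date has arrived."""
    return [c for c in cards if c.due <= today]
```

A card that is recalled successfully is pushed further into the future each time, while a missed card returns to the review queue the next day.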


A New Cache Replacement Policy In Named Data Network Based On FIB Table Information, Mehran Hosseinzadeh, Neda Moghim, Samira Taheri, Nasrin Gholami Jan 2024

VMASC Publications

Named Data Network (NDN) is proposed for the Internet as an information-centric architecture. Content storing in the router’s cache plays a significant role in NDN. When a router’s cache becomes full, a cache replacement policy determines which content should be discarded for the new content storage. This paper proposes a new cache replacement policy called Discard of Fast Retrievable Content (DFRC). In DFRC, the retrieval time of the content is evaluated using the FIB table information, and the content with less retrieval time receives more discard priority. An impact weight is also used to involve both the grade of retrieval …
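The core rule stated in the abstract, that content with a shorter retrieval time gets higher discard priority, can be sketched as follows. This is only an illustration of that one rule: the class name `FastRetrievableFirstCache` is hypothetical, the retrieval time is passed in as a plain number rather than derived from FIB table information, and the impact weight mentioned in the abstract is omitted because the excerpt does not define it.

```python
class FastRetrievableFirstCache:
    """Toy content store that evicts the entry that is cheapest to re-fetch."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # content name -> (data, estimated retrieval time); the estimate is
        # an assumed input, not computed from a real FIB here
        self.store: dict[str, tuple[bytes, float]] = {}

    def insert(self, name: str, data: bytes, retrieval_time: float):
        """Store content and return the evicted name, or None if nothing was evicted."""
        evicted = None
        if name not in self.store and len(self.store) >= self.capacity:
            # Discard the content with the smallest estimated retrieval time.
            evicted = min(self.store, key=lambda n: self.store[n][1])
            del self.store[evicted]
        self.store[name] = (data, retrieval_time)
        return evicted

    def get(self, name: str):
        """Return cached data, or None on a cache miss."""
        entry = self.store.get(name)
        return None if entry is None else entry[0]
```

The intuition behind the design is that content which can be fetched again quickly costs little to lose, so it is the first candidate for eviction when the cache is full.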


Data Provenance Via Differential Auditing, Xin Mu, Ming Pang, Feida Zhu Nov 2023

Research Collection School Of Computing and Information Systems

With the rising awareness of data assets, data governance, which is to understand where data comes from, how it is collected, and how it is used, has been assuming ever-growing importance. One critical component of data governance gaining increasing attention is auditing machine learning models to determine if specific data has been used for training. Existing auditing techniques, like shadow auditing methods, have shown feasibility under specific conditions such as having access to label information and knowledge of training protocols. However, these conditions are often not met in most real-world applications. In this paper, we introduce a practical framework for …


A Dynamic Online Dashboard For Tracking The Performance Of Division 1 Basketball Athletic Performance, Erica Juliano, Chelsea Thakkar, Christopher B. Taber, Mehul S. Raval, Kaya Tolga, Samah Senbel Oct 2023

School of Computer Science & Engineering Undergraduate Publications

Using data analytics is a vital part of sport performance enhancement. We collect data from the Division 1 women's basketball athletes and coaches at our university for use in analysis and prediction. Several data sources are used daily and weekly: WHOOP straps, weekly surveys, Polar straps, jump analysis, and training session information. In this paper, we present an online dashboard to visually present the data to the athletes and coaches. R Shiny was used to develop the platform, with the data stored on the cloud for instant updates of the dashboard as the data becomes available. The performance of athletes …


When Routing Meets Recommendation: Solving Dynamic Order Recommendations Problem In Peer-To-Peer Logistics Platforms, Zhiqin Zhang, Waldy Joe, Yuyang Er, Hoong Chuin Lau Sep 2023

Research Collection School Of Computing and Information Systems

Peer-to-Peer (P2P) logistics platforms, unlike traditional last-mile logistics providers, do not have dedicated delivery resources (both vehicles and drivers). Thus, the efficiency of such an operating model lies in the successful matching of demand and supply, i.e., how to match the delivery tasks with suitable drivers that will result in successful assignment and completion of the tasks. We consider a Same-Day Delivery Problem (SDDP) involving a P2P logistics platform where new orders arrive dynamically and the platform operator needs to generate a list of recommended orders to the crowdsourced drivers. We formulate this problem as a Dynamic Order Recommendations Problem (DORP). …


NICHE: A Curated Dataset Of Engineered Machine Learning Projects In Python, Ratnadira Widyasari, Zhou Yang, Ferdian Thung, Sheng Qin Sim, Fiona Wee, Camellia Lok, Jack Phan, Haodi Qi, Constance Tan, David Lo May 2023

Research Collection School Of Computing and Information Systems

Machine learning (ML) has gained much attention and has been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts in filtering those projects to curate ML projects of high quality. The limited availability of such a high-quality dataset poses an obstacle to understanding ML projects. To help clear this obstacle, we present NICHE, a manually labelled dataset consisting of 572 ML projects. Based on the evidence of good software engineering practices, we label 441 of these projects as engineered and 131 as non-engineered. This …


NFTDisk: Visual Detection Of Wash Trading In NFT Markets, Xiaolin Wen, Yong Wang, Xuanwu Yue, Feida Zhu, Min Zhu Apr 2023

Research Collection School Of Computing and Information Systems

With the growing popularity of Non-Fungible Tokens (NFT), a new type of digital assets, various fraudulent activities have appeared in NFT markets. Among them, wash trading has become one of the most common frauds in NFT markets, which attempts to mislead investors by creating fake trading volumes. Due to the sophisticated patterns of wash trading, only a subset of them can be detected by automatic algorithms, and manual inspection is usually required. We propose NFTDisk, a novel visualization for investors to identify wash trading activities in NFT markets, where two linked visualization modules are presented: a radial visualization module with …


ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: The purpose of this research paper is to explore ChatGPT’s potential as an innovative design tool for the future development of artificial intelligence. Specifically, this conceptual investigation aims to analyze ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Learning Relation Prototype From Unlabeled Texts For Long-Tail Relation Extraction, Yixin Cao, Jun Kuang, Ming Gao, Aoying Zhou, Yonggang Wen, Tat-Seng Chua Feb 2023

Research Collection School Of Computing and Information Systems

Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts. However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lack of sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypes from unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient training data. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well …


Mitigating Popularity Bias In Recommendation With Unbalanced Interactions: A Gradient Perspective, Weijieying Ren, Lei Wang, Kunpeng Liu, Ruocheng Guo, Ee-Peng Lim, Yanjie Fu Dec 2022

Research Collection School Of Computing and Information Systems

Recommender systems learn from historical user-item interactions to identify preferred items for target users. These observed interactions are usually unbalanced, following a long-tailed distribution. Such long-tailed data lead to popularity bias, where popular but not personalized items are recommended to users. We present a gradient perspective to understand two negative impacts of popularity bias in recommendation model optimization: (i) the gradient direction of popular item embeddings is closer to that of positive interactions, and (ii) the magnitude of the positive gradient for popular items is much greater than that of unpopular items. To address these issues, we propose a simple yet efficient …


An Attribute-Aware Attentive GCN Model For Attribute Missing In Recommendation, Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie Sep 2022

Research Collection School Of Computing and Information Systems

As important side information, attributes have been widely exploited in the existing recommender system for better performance. However, in the real-world scenarios, it is common that some attributes of items/users are missing (e.g., some movies miss the genre data). Prior studies usually use a default value (i.e., "other") to represent the missing attribute, resulting in sub-optimal performance. To address this problem, in this paper, we present an attribute-aware attentive graph convolution network (A(2)-GCN). In particular, we first construct a graph, where users, items, and attributes are three types of nodes and their associations are edges. Thereafter, we leverage the graph …


Automated Identification Of Astronauts On Board The International Space Station: A Case Study In Space Archaeology, Rao Hamza Ali, Amir Kanan Kashefi, Alice C. Gorman, Justin St. P. Walsh, Erik J. Linstead Aug 2022

Art Faculty Articles and Research

We develop and apply a deep learning-based computer vision pipeline to automatically identify crew members in archival photographic imagery taken on-board the International Space Station. Our approach is able to quickly tag thousands of images from public and private photo repositories without human supervision with high degrees of accuracy, including photographs where crew faces are partially obscured. Using the results of our pipeline, we carry out a large-scale network analysis of the crew, using the imagery data to provide novel insights into the social interactions among crew during their missions.


Smart Manufacturing—Theories, Methods, And Applications, Zhuming Bi, Lida Xu, Puren Ouyang Aug 2022

Information Technology & Decision Sciences Faculty Publications

(First paragraph) Smart manufacturing (SM) distinguishes itself from other system paradigms by introducing ‘smartness’ as a measure to a manufacturing system; however, researchers in different domains have different expectations of system smartness from their own perspectives. In this Special Issue (SI), SM refers to a system paradigm where digital technologies are deployed to enhance system smartness by (1) empowering physical resources in production, (2) utilizing virtual and dynamic assets over the internet to expand system capabilities, (3) supporting data-driven decision making at all domains and levels of businesses, or (4) reconfiguring systems to adapt to changes and uncertainties in dynamic environments. …


Interpreting Trajectories From Multiple Views: A Hierarchical Self-Attention Network For Estimating The Time Of Arrival, Zebin Chen, Xiaolin Xiao, Yue-Jiao Gong, Jun Fang, Nan Ma, Hua Chai, Zhiguang Cao Aug 2022

Research Collection School Of Computing and Information Systems

Estimating the time of arrival is a crucial task in intelligent transportation systems. Although considerable efforts have been made to solve this problem, most of them decompose a trajectory into several segments and then compute the travel time by integrating the attributes from all segments. The segment view, though being able to depict the local traffic conditions straightforwardly, is insufficient to embody the intrinsic structure of trajectories on the road network. To overcome the limitation, this study proposes multi-view trajectory representation that comprehensively interprets a trajectory from the segment-, link-, and intersection-views. To fulfill the purpose, we design a hierarchical …


Harnessing Confidence For Report Aggregation In Crowdsourcing Environments, Hadeel Alhosaini, Xianzhi Wang, Lina Yao, Zhong Yang, Farookh Hussain, Ee-Peng Lim Jul 2022

Research Collection School Of Computing and Information Systems

Crowdsourcing is an effective means of accomplishing human intelligence tasks by leveraging the collective wisdom of crowds. Given reports of various accuracy degrees from workers, it is important to make wise use of these reports to derive accurate task results. Intuitively, a task result derived from a sufficient number of reports bears lower uncertainty, and higher uncertainty otherwise. Existing report aggregation research, however, has largely neglected the above uncertainty issue. In this regard, we propose a novel report aggregation framework that defines and incorporates a new confidence measure to quantify the uncertainty associated with tasks and workers, thereby enhancing result …


Multi-Agent Reinforcement Learning For Traffic Signal Control Through Universal Communication Method, Qize Jiang, Minhao Qin, Shengmin Shi, Weiwei Sun, Baihua Zheng Jul 2022

Research Collection School Of Computing and Information Systems

How to coordinate the communication among intersections effectively in real complex traffic scenarios with multi-intersection is challenging. Existing approaches only enable the communication in a heuristic manner without considering the content/importance of information to be shared. In this paper, we propose a universal communication form UniComm between intersections. UniComm embeds massive observations collected at one agent into crucial predictions of their impact on its neighbors, which improves the communication efficiency and is universal across existing methods. We also propose a concise network UniLight to make full use of communications enabled by UniComm. Experimental results on real datasets demonstrate that UniComm …


Analyzing Offline Social Engagements: An Empirical Study Of Meetup Events Related To Software Development, Abhishek Sharma, Gede Artha Azriadi Prana, Anamika Sawhney, Nachiappan Nagappan, David Lo Mar 2022

Research Collection School Of Computing and Information Systems

Software developers use a variety of social media channels and tools in order to keep themselves up to date, collaborate with other developers, and find projects to contribute to. Meetup is one of such social media used by software developers to organize community gatherings. We in this work, investigate the dynamics of Meetup groups and events related to software development. Our work is different from previous work as we focus on the actual event and group data that was collected using Meetup API. In this work, we performed an empirical study of events and groups present on Meetup which are related to software development. First, we identified 6,327 Meetup groups related …


Applications Of Unsupervised Machine Learning In Autism Spectrum Disorder Research: A Review, Chelsea Parlett-Pelleriti, Elizabeth Stevens, Dennis R. Dixon, Erik J. Linstead Jan 2022

Engineering Faculty Articles and Research

Large amounts of autism spectrum disorder (ASD) data are created through hospitals, therapy centers, and mobile applications; however, much of this rich data does not have pre-existing classes or labels. Large amounts of data—both genetic and behavioral—that are collected as part of scientific studies or as part of treatment can provide a deeper, more nuanced insight into both diagnosis and treatment of ASD. This paper reviews 43 papers using unsupervised machine learning in ASD, including k-means clustering, hierarchical clustering, model-based clustering, and self-organizing maps. The aim of this review is to provide a survey of the current uses of …


An Empirical Study On The Impact Of Deep Parameters On Mobile App Energy Usage, Qiang Xu, James C. Davis, Y Charlie Hu, Abhilash Jindal Jan 2022

Department of Electrical and Computer Engineering Faculty Publications

Improving software performance through configuration parameter tuning is a common activity during software maintenance. Beyond traditional performance metrics like latency, mobile app developers are interested in reducing app energy usage. Some mobile apps have centralized locations for parameter tuning, similar to databases and operating systems, but it is common for mobile apps to have hundreds of parameters scattered around the source code. The correlation between these "deep" parameters and app energy usage is unclear. Researchers have studied the energy effects of deep parameters in specific modules, but we lack a systematic understanding of the energy impact of mobile deep parameters. …


The State Of The Art Of Information Integration In Space Applications, Zhuming Bi, K. L. Yung, Andrew W.H. Ip., Yuk Ming Tang, Chris W.J. Zhang, Li Da Xu Jan 2022

Information Technology & Decision Sciences Faculty Publications

This paper aims to present a comprehensive survey on information integration (II) in space informatics. With an ever-increasing scale and dynamics of complex space systems, II has become essential in dealing with the complexity, changes, dynamics, and uncertainties of space systems. The applications of space II (SII) require addressing some distinctive functional requirements (FRs) of heterogeneity, networking, communication, security, latency, and resilience; while limited works are available to examine recent advances of SII thoroughly. This survey helps to gain the understanding of the state of the art of SII in the sense that (1) technical drivers for SII are discussed and …


Non-Parametric Stochastic Autoencoder Model For Anomaly Detection, Raphael B. Alampay, Patricia Angela R. Abu Jan 2022

Department of Information Systems & Computer Science Faculty Publications

Anomaly detection is a widely studied field in computer science with applications ranging from intrusion detection and fraud detection to medical diagnosis and quality assurance in manufacturing. The underlying premise is that an anomaly is an observation that does not conform to what is considered to be normal. This study addresses two major problems in the field. First, anomalies are defined in a local context, that is, being able to give quantitative measures as to how anomalies are categorized within their own problem domain and cannot be generalized to other domains. Commonly, anomalies are measured according to statistical probabilities relative to the …


Machine Learning In Requirements Elicitation: A Literature Review, Cheligeer Cheligeer, Jingwei Huang, Guosong Wu, Nadia Bhuiyan, Yuan Xu, Yong Zeng Jan 2022

Engineering Management & Systems Engineering Faculty Publications

A growing trend in requirements elicitation is the use of machine learning (ML) techniques to automate the cumbersome requirement handling process. This literature review summarizes and analyzes studies that incorporate ML and natural language processing (NLP) into demand elicitation. We answer the following research questions: (1) What requirement elicitation activities are supported by ML? (2) What data sources are used to build ML-based requirement solutions? (3) What technologies, algorithms, and tools are used to build ML-based requirement elicitation? (4) How to construct an ML-based requirements elicitation method? (5) What are the available tools to support ML-based requirements elicitation methodology? Keywords …


Evaluating Technology-Mediated Collaborative Workflows For Telehealth, Christopher Bondy Ph.D., Pengcheng Shi, Pamela Grover MD, Vicki Hanson, Linlin Chen, Rui Li Dec 2021

Articles

Goals: This paper discusses the need for a predictable method to evaluate gains and gaps of collaborative technology-mediated workflows and introduces an evaluation framework to address this need. Methods: The Collaborative Space Analysis Framework (CS-AF), introduced in this research, is a cross-disciplinary evaluation method designed to evaluate technology-mediated collaborative workflows. The 5-step CS-AF approach includes: (1) current-state workflow definition, (2) current-state (baseline) workflow assessment, (3) technology-mediated workflow development and deployment, (4) technology-mediated workflow assessment, (5) analysis, and conclusions. For this research, a comprehensive, empirical study of hypertension exam workflow for telehealth was conducted using the CS-AF approach. Results: The CS-AF …


Messiness: Automating IoT Data Streaming Spatial Analysis, Christopher White, Atilio Barreda II Dec 2021

Publications and Research

The spaces we live in go through many transformations over the course of a year, a month, or a day; my room has seen tremendous clutter and pristine order within the span of a few hours. My goal is to discover patterns within my space and formulate an understanding of the changes that occur. This insight will provide actionable direction for maintaining a cleaner environment, as well as provide some information about the optimal times for productivity and energy preservation.

Using a Raspberry Pi, I will set up automated image capture in a room in my home. These images will …


Crest Or Trough? How Research Libraries Used Emerging Technologies To Survive The Pandemic, So Far, Scout Calvert Oct 2021

UNL Libraries: Faculty Publications

Introduction

In the first months of the COVID-19 pandemic, it was impossible to tell if we were at the crest of a wave of new transmissions, or a trough of a much larger wave, still yet to peak. As of this writing, as colleges and universities prepare for mostly in-person fall 2021 semesters, case counts in the United States are increasing again after a decline that coincided with easier access to the COVID vaccine. Plans for a return to campus made with confidence this spring may be in doubt, as we climb the curve of what is already the second …


Digitally Reporting Trail Obstructions In Forest Park, Colton S. Maybee Aug 2021

REU Final Reports

The inclusion of technology on the trail can lead to better experiences for everyone involved in the hobby. Hikers can play a more prominent role in the maintenance of the trails by being able to provide better reports of obstructions while directly on the trail. This paper goes into the project of revamping the obstruction report system applied at Forest Park in Portland, Oregon. Most of my contributions to the project focus on mobile app development with some research into path planning algorithms related to the continuations of this project.


Forest Park Trail Monitoring, Adan Robles, Colton S. Maybee, Erin Dougherty Aug 2021

REU Final Reports

Forest Park is one of the largest public parks in the United States, with over 40 trails to pick from when planning a hiking trip. One of the main problems this park has is that there are too many trails, and a lot of the trails extend over 3 miles. Due to these circumstances, trails are not checked frequently and hikers are forced to hike trails in the area with no warnings of potential hazards they can encounter. In this paper I researched how Forest Park currently monitors its trails and then set up a goal to solve the problem. We …


ThunderRW: An In-Memory Graph Random Walk Engine, Shixuan Sun, Yuhang Chen, Shengliang Lu, Bingsheng He, Yuchen Li Aug 2021

Research Collection School Of Computing and Information Systems

As random walk is a powerful tool in many graph processing, mining and learning applications, this paper proposes an efficient in-memory random walk engine named ThunderRW. Compared with existing parallel systems on improving the performance of a single graph operation, ThunderRW supports massive parallel random walks. The core design of ThunderRW is motivated by our profiling results: common RW algorithms have as high as 73.1% CPU pipeline slots stalled due to irregular memory access, which suffers significantly more memory stalls than the conventional graph workloads such as BFS and SSSP. To improve the memory efficiency, we first design a generic …
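For readers unfamiliar with the workload, a basic unbiased random walk over an in-memory adjacency list can be sketched as below. This is a generic textbook version, not ThunderRW's design; the `random_walk` function and its parameters are illustrative assumptions. Each step jumps to an arbitrary neighbor's adjacency list, which hints at why such walks produce the irregular memory access the abstract describes.

```python
import random


def random_walk(adj: dict[int, list[int]], start: int, length: int,
                rng: random.Random) -> list[int]:
    """Take up to `length` uniform random steps from `start`.

    `adj` maps each vertex to its list of out-neighbors. The walk stops
    early if it reaches a vertex with no out-neighbors.
    """
    walk = [start]
    current = start
    for _ in range(length):
        neighbors = adj.get(current, [])
        if not neighbors:
            break  # dead end: no outgoing edge to follow
        # Each step lands on an unpredictable vertex, so the next
        # adjacency-list lookup is effectively a random memory access.
        current = rng.choice(neighbors)
        walk.append(current)
    return walk
```

Passing an explicit `random.Random` instance keeps walks reproducible when a fixed seed is used.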


Vehicle Routing: Review Of Benchmark Datasets, Aldy Gunawan, Graham Kendall, Barry McCollum, Hsin-Vonn Seow, Lai Soon Lee Aug 2021

Research Collection School Of Computing and Information Systems

The Vehicle Routing Problem (VRP) was formally presented to the scientific literature in 1959 by Dantzig and Ramser (DOI:10.1287/mnsc.6.1.80). Sixty years on, the problem is still heavily researched, with hundreds of papers having been published addressing this problem and the many variants that now exist. Many datasets have been proposed to enable researchers to compare their algorithms using the same problem instances where either the best known solution is known or, in some cases, the optimal solution is known. In this survey paper, we provide a list of Vehicle Routing Problem datasets, categorized to enable researchers to have easy access …


Context-Aware Outstanding Fact Mining From Knowledge Graphs, Yueji Yang, Yuchen Li, Panagiotis Karras, Anthony Tung Aug 2021

Research Collection School Of Computing and Information Systems

An Outstanding Fact (OF) is an attribute that makes a target entity stand out from its peers. The mining of OFs has important applications, especially in Computational Journalism, such as news promotion, fact-checking, and news story finding. However, existing approaches to OF mining: (i) disregard the context in which the target entity appears, hence may report facts irrelevant to that context; and (ii) require relational data, which are often unavailable or incomplete in many application domains. In this paper, we introduce the novel problem of mining Context-aware Outstanding Facts (COFs) for a target entity under a given context specified by …