Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 19 of 19

Full-Text Articles in Databases and Information Systems

Messiness: Automating IoT Data Streaming Spatial Analysis, Christopher White, Atilio Barreda II Dec 2021

Publications and Research

The spaces we live in go through many transformations over the course of a year, a month, or a day; my room has seen tremendous clutter and pristine order within the span of a few hours. My goal is to discover patterns within my space and formulate an understanding of the changes that occur. This insight will provide actionable direction for maintaining a cleaner environment, as well as information about the optimal times for productivity and energy preservation.

Using a Raspberry Pi, I will set up automated image capture in a room in my home. These images will …


Multilateration Index, Chip Lynch Aug 2021

Electronic Theses and Dissertations

We present an alternative method for pre-processing and storing point data, particularly for Geospatial points, by storing multilateration distances to fixed points rather than coordinates such as Latitude and Longitude. We explore the use of this data to improve query performance for some distance related queries such as nearest neighbor and query-within-radius (i.e. “find all points in a set P within distance d of query point q”). Further, we discuss the problem of “Network Adequacy” common to medical and communications businesses, to analyze questions such as “are at least 90% of patients living within 50 miles of a covered emergency …
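The query-within-radius idea can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes planar Euclidean distance rather than geodesic distance, and the anchor positions and the function names `build_index` and `within_radius` are hypothetical. Each point stores its distances to a few fixed anchors; by the triangle inequality, a point within distance d of q must have every stored anchor distance within d of q's distance to the same anchor, so points failing that test can be skipped without an exact distance computation.

```python
import math

def dist(a, b):
    """Planar Euclidean distance (an assumption; real geospatial data would use a geodesic)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def build_index(points, anchors):
    """Store, for each point, its distance to every fixed anchor."""
    return [tuple(dist(p, a) for a in anchors) for p in points]

def within_radius(points, index, anchors, q, d):
    """Return all points within distance d of q, pruning with anchor distances."""
    q_dists = [dist(q, a) for a in anchors]
    hits = []
    for p, p_dists in zip(points, index):
        # Triangle inequality: if |dist(p, a) - dist(q, a)| > d for any anchor a,
        # then dist(p, q) > d and the point cannot qualify.
        if any(abs(pd - qd) > d for pd, qd in zip(p_dists, q_dists)):
            continue
        # Survivors still need an exact check, since the filter admits false positives.
        if dist(p, q) <= d:
            hits.append(p)
    return hits

# Hypothetical anchors and data, for illustration only.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(1.0, 1.0), (2.0, 2.0), (8.0, 8.0), (5.0, 5.0)]
index = build_index(points, anchors)
result = within_radius(points, index, anchors, (1.5, 1.5), 1.0)
```

The filter never discards a true match; it only reduces how many exact distance computations are needed. A practical index would also sort or range-search the stored distances instead of scanning every point.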


ThunderRW: An In-Memory Graph Random Walk Engine, Shixuan Sun, Yuhang Chen, Shengliang Lu, Bingsheng He, Yuchen Li Aug 2021

Research Collection School Of Computing and Information Systems

As random walk is a powerful tool in many graph processing, mining, and learning applications, this paper proposes an efficient in-memory random walk engine named ThunderRW. Unlike existing parallel systems, which focus on improving the performance of a single graph operation, ThunderRW supports massively parallel random walks. The core design of ThunderRW is motivated by our profiling results: common RW algorithms have as many as 73.1% of CPU pipeline slots stalled due to irregular memory access, and so suffer significantly more memory stalls than conventional graph workloads such as BFS and SSSP. To improve the memory efficiency, we first design a generic …


Context-Aware Outstanding Fact Mining From Knowledge Graphs, Yueji Yang, Yuchen Li, Panagiotis Karras, Anthony Tung Aug 2021

Research Collection School Of Computing and Information Systems

An Outstanding Fact (OF) is an attribute that makes a target entity stand out from its peers. The mining of OFs has important applications, especially in Computational Journalism, such as news promotion, fact-checking, and news story finding. However, existing approaches to OF mining: (i) disregard the context in which the target entity appears, hence may report facts irrelevant to that context; and (ii) require relational data, which are often unavailable or incomplete in many application domains. In this paper, we introduce the novel problem of mining Context-aware Outstanding Facts (COFs) for a target entity under a given context specified by …


Privacy-Preserving Cloud-Assisted Data Analytics, Wei Bao Jul 2021

Graduate Theses and Dissertations

Nowadays, industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our lives. Data analytics (e.g., via machine learning) has been extensively applied to make important decisions in various real-world applications. However, it is challenging for resource-limited clients to analyze their data efficiently when its scale is large. Additionally, data resources are increasingly distributed among different owners, and users' data may contain private information that needs to be protected.

Cloud computing has become more and more popular in …


On M-Impact Regions And Standing Top-K Influence Problems, Bo Tang, Kyriakos Mouratidis, Mingji Han Jun 2021

Research Collection School Of Computing and Information Systems

In this paper, we study the m-impact region problem (mIR). In a context where users look for available products with top-k queries, mIR identifies the part of the product space that attracts the most user attention. Specifically, mIR determines the kind of attribute values that lead a (new or existing) product to the top-k result for at least a fraction of the user population. mIR has several applications, ranging from effective marketing to product improvement. Importantly, it also leads to (exact and efficient) solutions for standing top-k impact problems, which were previously solved heuristically only, or whose current solutions face …


Hierarchical Reinforcement Learning: A Comprehensive Survey, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek Jun 2021

Research Collection School Of Computing and Information Systems

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate the future …


Minimum Coresets For Maxima Representation Of Multidimensional Data, Yanhao Wang, Michael Mathioudakis, Yuchen Li, Kian-Lee Tan Jun 2021

Research Collection School Of Computing and Information Systems

Coresets are succinct summaries of large datasets such that, for a given problem, the solution obtained from a coreset is provably competitive with the solution obtained from the full dataset. As such, coreset-based data summarization techniques have been successfully applied to various problems, e.g., geometric optimization, clustering, and approximate query processing, for scaling them up to massive data. In this paper, we study coresets for the maxima representation of multidimensional data: Given a set P of points in R^d, where d is a small constant, and an error parameter ε ∈ (0, 1), a subset Q ⊆ P …


Securing Fog Federation From Behavior Of Rogue Nodes, Mohammed Saleh H. Alshehri May 2021

Graduate Theses and Dissertations

As the technological revolution advanced, information security evolved with an increased need for confidential data protection on the internet. Individuals and organizations typically prefer outsourcing their confidential data to the cloud for processing and storage. As promising as the cloud computing paradigm is, it creates challenges, ranging from data security to latency issues with data computation and delivery to end-users. In response to these challenges, Cisco introduced the fog computing paradigm in 2012. The intent was to overcome issues such as latency and communication overhead and to bring computing and storage resources close to the ground and the …


Mapping Renewal: How An Unexpected Interdisciplinary Collaboration Transformed A Digital Humanities Project, Elise Tanner, Geoffrey Joseph Apr 2021

Digital Initiatives Symposium

Funded by a National Endowment for the Humanities (NEH) Humanities Collections and Reference Resources Foundations Grant, the UA Little Rock Center for Arkansas History and Culture’s “Mapping Renewal” pilot project focused on creating access to and providing spatial context for archival materials related to racial segregation and urban renewal in the city of Little Rock, Arkansas, from 1954 to 1989. An unplanned interdisciplinary collaboration with the UA Little Rock Arkansas Economic Development Institute (AEDI) has proven to be an invaluable partnership. One team member from each department will demonstrate the Mapping Renewal website and discuss how the collaborative process has changed and shaped …


Lecture 11: The Road To Exascale And Legacy Software For Dense Linear Algebra, Jack Dongarra Apr 2021

Mathematical Sciences Spring Lecture Series

In this talk, we will look at the current state of high performance computing and look at the next stage of extreme computing. With extreme computing, there will be fundamental changes in the character of floating point arithmetic and data movement. In this talk, we will look at how extreme-scale computing has caused algorithm and software developers to change their way of thinking on implementing and program-specific applications.


DyCuckoo: Dynamic Hash Tables On GPUs, Yuchen Li, Qiwei Zhu, Zheng Lyu, Zhongdong Huang, Jianling Sun Apr 2021

Research Collection School Of Computing and Information Systems

The hash table is a fundamental structure that has been implemented on graphics processing units (GPUs) to accelerate a wide range of analytics workloads. Most existing works have focused on static scenarios and occupy large amounts of GPU memory to maximize insertion efficiency. In many cases, data stored in hash tables are updated dynamically, and existing approaches use unnecessarily large memory resources. One naïve solution is to rebuild a hash table (known as rehashing) whenever it is either filled or mostly empty. However, this approach incurs significant rehashing overhead. In this paper, we propose a novel dynamic cuckoo hash table …


Towards Efficient Motif-Based Graph Partitioning: An Adaptive Sampling Approach, Shixun Huang, Yuchen Li, Zhifeng Bao, Zhao Li Apr 2021

Research Collection School Of Computing and Information Systems

In this paper, we study the problem of efficient motif-based graph partitioning (MGP). We observe that existing methods must enumerate all motif instances to compute the exact edge weights for partitioning. However, the enumeration is prohibitively expensive on large graphs. We thus propose a sampling-based MGP (SMGP) framework that employs an unbiased sampling mechanism to efficiently estimate the edge weights while trying to preserve the partitioning quality. To further improve the effectiveness, we propose a novel adaptive sampling framework called SMGP+. SMGP+ iteratively partitions the input graph based on up-to-date estimated edge weights, and adaptively adjusts the sampling distribution …


DRAM Failure Prediction In AIOps: Empirical Evaluation, Challenges And Opportunities, Zhiyue Wu, Hongzuo Xu, Guansong Pang, Fengyuan Yu, Yijie Wang, Songlei Jian, Yongjun Wang Apr 2021

Research Collection School Of Computing and Information Systems

DRAM failure prediction is a vital task in AIOps, crucial to maintaining the reliability and sustainable service of large-scale data centers. However, limited work has been done on DRAM failure prediction, mainly due to the lack of publicly available datasets. This paper presents a comprehensive empirical evaluation of diverse machine learning techniques for DRAM failure prediction using a large-scale multisource dataset, including more than three million records of kernel, address, and mcelog data, provided by Alibaba Cloud through the PAKDD 2021 competition. Particularly, we first formulate the problem as a multiclass classification task and exhaustively evaluate seven popular/state-of-the-art …


Efficient Retrieval Of Matrix Factorization-Based Top-K Recommendations: A Survey Of Recent Approaches, Duy Dung Le, Hady W. Lauw Apr 2021

Research Collection School Of Computing and Information Systems

Top-k recommendation seeks to deliver a personalized list of k items to each individual user. An established methodology in the literature based on matrix factorization (MF), which usually represents users and items as vectors in low-dimensional space, is an effective approach to recommender systems, thanks to its superior performance in terms of recommendation quality and scalability. A typical matrix factorization recommender system has two main phases: preference elicitation and recommendation retrieval. The former analyzes user-generated data to learn user preferences and item characteristics in the form of latent feature vectors, whereas the latter ranks the candidate items based on the …
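The retrieval phase can be illustrated with a small sketch. The abstract is truncated before it names the ranking criterion, so this example assumes the common matrix factorization convention of scoring an item by the inner product of the user and item latent vectors; the vectors, item ids, and the function name `top_k_items` are hypothetical. It performs an exhaustive scan, which is the baseline that efficient retrieval methods aim to improve on.

```python
import heapq

def top_k_items(user_vec, item_vecs, k):
    """Return the ids of the k items with the largest inner-product score.

    Assumes (not stated in the abstract) that score(user, item) is the dot
    product of their latent feature vectors.
    """
    scored = (
        (sum(u * v for u, v in zip(user_vec, vec)), item_id)
        for item_id, vec in item_vecs.items()
    )
    # heapq.nlargest keeps only k candidates in memory while scanning all items.
    return [item_id for _, item_id in heapq.nlargest(k, scored)]

# Hypothetical latent vectors, for illustration only.
user = [1.0, 0.5]
items = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
    "d": [-1.0, 0.0],
}
recommended = top_k_items(user, items, 2)
```

The scan costs time proportional to the number of items for every user, which is why indexing and approximate search are attractive at scale.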


DBL: Efficient Reachability Queries On Dynamic Graphs, Qiuyi Lyu, Yuchen Li, Bingsheng He, Bin Gong Apr 2021

Research Collection School Of Computing and Information Systems

The reachability query is a fundamental problem on graphs that has been extensively studied in academia and industry. Since graphs are subject to frequent updates in many applications, it is essential to support efficient graph updates while offering good performance in reachability queries. Existing solutions compress the original graph into a Directed Acyclic Graph (DAG) and propose efficient query processing and index update techniques. However, they focus on optimizing the scenarios where the Strongly Connected Components (SCCs) remain unchanged and have overlooked the prohibitively high cost of DAG maintenance when SCCs are updated. In this paper, we propose DBL, an …


Newslink: Empowering Intuitive News Search With Knowledge Graphs, Yueji Yang, Yuchen Li, Anthony Tung Apr 2021

Research Collection School Of Computing and Information Systems

News search tools help end users identify relevant news stories. However, existing search approaches often operate as a "black-box" process, offering little intuition to help users understand how the results are related to the query. In this paper, we propose a novel news search framework, called NEWSLINK, to empower intuitive news search by using relationship paths discovered from open Knowledge Graphs (KGs). Specifically, NEWSLINK embeds both a query and news documents to subgraphs, called subgraph embeddings, in the KG. Their embeddings' overlap induces relationship paths between the entities involved. Two major advantages are obtained by incorporating subgraph …


Boundary Precedence Image Inpainting Method Based On Self-Organizing Maps, Haibo Pen, Quan Wang, Zhaoxia Wang Apr 2021

Research Collection School Of Computing and Information Systems

In addition to text data analysis, image analysis is an area that has gained increasing importance in recent years because more and more image data have spread throughout the internet and real life. As an important segment of image analysis techniques, image restoration has been attracting considerable attention from researchers. As an AI methodology, Self-organizing Maps (SOMs) have been applied to a great number of useful applications. However, they have rarely been applied to the domain of image restoration. In this paper, we propose a novel image restoration method by leveraging the capability of SOMs, and we name …


Blockchain For A Resilient, Efficient, And Effective Supply Chain, Evidence From Cases, Adrian Gheorghe, Farinaz Sabz Ali Pour, Unal Tatar, Omer Faruk Keskin Jan 2021

Engineering Management & Systems Engineering Faculty Publications

In modern acquisition, it is unrealistic to consider single entities as producing and delivering a product independently. Acquisitions usually take place through supply networks. The resiliency, efficiency, and effectiveness of supply networks directly contribute to the acquisition system's resiliency, efficiency, and effectiveness. All the involved firms form a part of a supply network essential to producing the product or service. Decision-makers have to look for new methodologies for supply chain management. Blockchain technology introduces new methods of decentralization and delegation of services, which can transform supply chains and result in a more resilient, efficient, and effective supply chain. This …