Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 11 of 11

Full-Text Articles in Databases and Information Systems

Implementing The CMS+ Sports Rankings Algorithm In A JavaFX Environment, Luke Welch May 2022

Industrial Engineering Undergraduate Honors Theses

Every year, sports teams and athletes get cut from championship opportunities because of their rank. While this reality is easier to swallow if a team or athlete is distant from the cut, it is much harder when they are right on the edge. Many times, it leaves fans and athletes wondering, “Why wasn’t I ranked higher? What factors went into the ranking? Are the rankings based on opinion alone?” These are fair questions that deserve an answer. Many times, sports rankings are derived from opinion polls. Other times, they are derived from a combination of opinion polls and measured performance. …


On Finding Two Posets That Cover Given Linear Orders, Ivy Ordanel, Proceso L. Fernandez Jr, Henry Adorna Oct 2019

Department of Information Systems & Computer Science Faculty Publications

The Poset Cover Problem is an optimization problem where the goal is to determine a minimum set of posets that covers a given set of linear orders. This problem is relevant in the field of data mining, specifically in determining directed networks or models that explain the ordering of objects in a large sequential dataset. It is already known that the decision version of the problem is NP-Hard while its variation where the goal is to determine only a single poset that covers the input is in P. In this study, we investigate the variation, which we call the 2-Poset …
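As a small illustration of the covering notion in this abstract, the sketch below enumerates the linear extensions of a poset by brute force and checks whether a set of posets covers a given set of linear orders. It assumes that "covers" means the union of the posets' linear extensions equals the input set exactly. The function names and the pair-based encoding of a poset are illustrative choices, not taken from the paper.

```python
from itertools import permutations


def linear_extensions(elements, relations):
    """Return every ordering of `elements` consistent with `relations`.

    `relations` is a collection of (a, b) pairs meaning "a precedes b".
    Brute force over all permutations, so only suitable for tiny inputs.
    """
    result = set()
    for perm in permutations(elements):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[a] < position[b] for a, b in relations):
            result.add(perm)
    return result


def covers(elements, posets, linear_orders):
    """Assumed definition: the union of linear extensions equals the input set."""
    generated = set()
    for relations in posets:
        generated |= linear_extensions(elements, relations)
    return generated == set(linear_orders)
```

For example, over the elements (1, 2, 3), the single poset with relations {(1, 3), (2, 3)} has exactly the linear extensions (1, 2, 3) and (2, 1, 3), so it covers that pair of linear orders on its own.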


Distributed Similarity Queries In Metric Spaces, Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao Jun 2019

Research Collection School Of Computing and Information Systems

Similarity queries, including range queries and k nearest neighbor (kNN) queries, in metric spaces have applications in many areas such as multimedia retrieval, computational biology and location-based services. With the growing volumes of data, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In …
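The pivot-mapping idea mentioned in this abstract can be illustrated with a generic single-machine sketch. Each object is mapped to its vector of distances to a few pivots, and the triangle inequality then gives a lower bound on the true distance that lets a range query skip objects without computing their distance. This is not the AMDS implementation: the Euclidean metric, the pivot choice, and all function names are assumptions made for illustration.

```python
import math


def pivot_map(objects, pivots, dist):
    """Map each object to its vector of distances to the pivots."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(query, radius, objects, pivots, mapped, dist):
    """Return (objects within `radius` of `query`, number of distances computed).

    By the triangle inequality, |d(q, p) - d(o, p)| <= d(q, o) for any pivot p,
    so an object whose lower bound exceeds `radius` can be skipped safely.
    """
    q_vec = [dist(query, p) for p in pivots]
    results = []
    computed = 0
    for obj, o_vec in zip(objects, mapped):
        lower_bound = max(abs(qd - od) for qd, od in zip(q_vec, o_vec))
        if lower_bound > radius:
            continue  # pruned without computing the real distance
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


# Example data (illustrative only).
objects = [(0, 0), (1, 1), (5, 5), (10, 10)]
pivots = [(0, 0), (10, 0)]
mapped = pivot_map(objects, pivots, math.dist)
hits, computed = range_query((0.5, 0.5), 1.0, objects, pivots, mapped, math.dist)
```

In this example the two far-away objects are pruned by the pivot filter, so only two real distance computations are needed for four objects.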


Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras Apr 2019

Research Collection School Of Computing and Information Systems

An information dissemination campaign is often multifaceted, involving several facets or pieces of information disseminating from different sources. The question then arises, how should we assign such pieces to eligible sources so as to achieve the best viral dissemination results? Past research has studied the problem of Influence Maximization (IM), which is to select a set of k promoters that maximizes the expected reach of a message over a network. However, in this classical IM problem, each promoter spreads out the same unitary piece of information. In this paper, we propose the Optimal Influential Pieces Assignment (OIPA) problem, which is …
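For readers unfamiliar with the classical IM problem this abstract builds on, the sketch below shows the standard greedy heuristic with Monte Carlo estimation of expected reach. It assumes an independent cascade model with one uniform activation probability, which the abstract does not specify, and it does not attempt the OIPA problem itself. All names and default parameters are illustrative.

```python
import random


def simulate_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade simulation and return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_reach(graph, seeds, prob, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    return sum(simulate_cascade(graph, seeds, prob, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, prob=0.1, runs=200, seed=0):
    """Pick k promoters one at a time, each maximizing the estimated reach."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))  # sorted for deterministic ties
        best = max(
            candidates,
            key=lambda n: expected_reach(graph, chosen + [n], prob, runs, rng),
        )
        chosen.append(best)
    return chosen
```

The graph is a dictionary from each node to the list of nodes it can influence. With `prob=1.0` the cascade is deterministic, which makes the greedy choice easy to check by hand.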


Distributed K-Nearest Neighbor Queries In Metric Spaces, Xin Ding, Yuanliang Zhang, Lu Chen, Yunjun Gao, Baihua Zheng Jul 2018

Research Collection School Of Computing and Information Systems

Metric k nearest neighbor (MkNN) queries have applications in many areas such as multimedia retrieval, computational biology, and location-based services. With the growing volumes of data, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS), which uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop an efficient estimation-based MkNN method using AMDS to improve query efficiency. Extensive experiments …


Metric Similarity Joins Using MapReduce, Yunjun Gao, Keyu Yang, Lu Chen, Baihua Zheng, Gang Chen, Chun Chen Mar 2017

Research Collection School Of Computing and Information Systems

Given two object sets Q and O, a metric similarity join finds similar object pairs according to a certain criterion. This operation has a wide variety of applications, such as data cleaning and data mining, to name but a few. However, the rapidly growing volume of data challenges traditional metric similarity join methods, and thus a distributed method is required. In this paper, we adopt a popular distributed framework, namely MapReduce, to support scalable metric similarity joins. To ensure load balancing, we present two sampling-based partition methods. One utilizes the pivot and the space-filling curve mappings to cluster …


Answering Why-Not And Why Questions On Reverse Top-K Queries, Qing Liu, Yunjun Gao, Gang Chen, Baihua Zheng, Linlin Zhou Dec 2016

Research Collection School Of Computing and Information Systems

Why-not and why questions can be posed by database users to seek clarifications on unexpected query results. Specifically, why-not questions aim to explain why certain expected tuples are absent from the query results, while why questions try to clarify why certain unexpected tuples are present in the query results. This paper systematically explores the why-not and why questions on reverse top-k queries, owing to their importance in multi-criteria decision making. We first formalize why-not questions on reverse top-k queries, which try to include the missing objects in the reverse top-k query results, and then, we propose a unified framework called …
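A brute-force sketch may help make the query in this abstract concrete. It assumes a linear scoring function in which lower scores rank higher, and it treats a reverse top-k query as returning the weighting vectors under which the query point ranks within the top k. The `why_not` helper simply lists the points that outrank the query under a given weighting vector. It is a naive explanation, not the framework the paper proposes, and all names are illustrative.

```python
def score(weights, point):
    """Weighted sum of the point's attributes (assumed: lower is better)."""
    return sum(w * x for w, x in zip(weights, point))


def rank_of(query, points, weights):
    """1-based rank of `query` among `points` under `weights`."""
    q_score = score(weights, query)
    return 1 + sum(1 for p in points if score(weights, p) < q_score)


def reverse_top_k(query, points, weight_vectors, k):
    """Weighting vectors for which `query` ranks within the top k."""
    return [w for w in weight_vectors if rank_of(query, points, w) <= k]


def why_not(query, points, weights):
    """Points that outrank `query` under `weights`, best first."""
    q_score = score(weights, query)
    better = [p for p in points if score(weights, p) < q_score]
    return sorted(better, key=lambda p: score(weights, p))


# Example data (illustrative only).
points = [(1, 9), (5, 5), (9, 1)]
weight_vectors = [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9)]
result = reverse_top_k((4, 4), points, weight_vectors, k=1)
```

Here only the balanced weighting vector (0.5, 0.5) places the query point (4, 4) first. Under (0.9, 0.1) the point (1, 9) scores lower, which is what `why_not` reports.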


Detection Of Cyberbullying In SMS Messaging, Bryan W. Bradley Jul 2016

Computer Science Summer Fellows

Cyberbullying is a type of bullying that uses technology such as cell phones to harass or malign another person. To detect acts of cyberbullying, we are developing an algorithm that will detect cyberbullying in SMS (text) messages. Over 80,000 text messages have been collected by software installed on cell phones carried by participants in our study. This paper describes the development of the algorithm to detect cyberbullying messages, using the cell phone data collected previously. The algorithm works by first separating the messages into conversations in an automated way. The algorithm then analyzes the conversations and scores the severity and …


On Processing Reverse K-Skyband And Ranked Reverse Skyline Queries, Yunjun Gao, Qing Liu, Baihua Zheng, Mou Li, Gang Chen, Qing Li Feb 2015

Research Collection School Of Computing and Information Systems

In this paper, for the first time, we identify and solve the problem of efficient reverse k-skyband (RkSB) query processing. Given a set P of multi-dimensional points and a query point q, an RkSB query returns all the points in P whose dynamic k-skyband contains q. We formalize RkSB retrieval, and then propose five algorithms for computing the RkSB of an arbitrary query point efficiently. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset, and employ pre-computation, reuse and pruning techniques to boost the query efficiency. In addition, we extend our solutions to tackle an interesting variant …
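The RkSB definition in this abstract can be written down directly as a quadratic-time baseline. The sketch assumes the usual notion of dynamic dominance: with respect to a point p, a point a dominates b when a is at least as close to p as b in every dimension and strictly closer in at least one. It also assumes that the dynamic k-skyband of p contains the points dominated by fewer than k others. None of the paper's index-based algorithms or pruning techniques are reproduced here.

```python
def dyn_dominates(a, b, origin):
    """True if `a` dynamically dominates `b` with respect to `origin`."""
    da = [abs(x - o) for x, o in zip(a, origin)]
    db = [abs(x - o) for x, o in zip(b, origin)]
    no_worse = all(x <= y for x, y in zip(da, db))
    strictly_better = any(x < y for x, y in zip(da, db))
    return no_worse and strictly_better


def reverse_k_skyband(points, query, k):
    """Points p whose dynamic k-skyband contains `query` (brute force)."""
    result = []
    for i, p in enumerate(points):
        # Count the other points that dominate the query as seen from p.
        dominators = sum(
            1
            for j, other in enumerate(points)
            if j != i and dyn_dominates(other, query, p)
        )
        if dominators < k:
            result.append(p)
    return result
```

With k = 1 this reduces to a brute-force reverse skyline query under the same assumptions.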


Predictive Adaptive Resonance Theory And Knowledge Discovery In Databases, Ah-Hwee Tan, Hui-Shin Vivien Soon May 2000

Research Collection School Of Computing and Information Systems

This paper investigates the scalability of predictive Adaptive Resonance Theory (ART) networks for knowledge discovery in very large databases. Although predictive ART performs fast and incremental learning, the number of recognition categories or rules that it creates during learning may become substantially large and cause the learning speed to slow down. To tackle this problem, we introduce an on-line algorithm for evaluating and pruning categories during learning. Benchmark experiments on a large scale data set show that on-line pruning has been effective in reducing the number of the recognition categories and the time for convergence. Interestingly, the pruned networks also …


A Greedy Hypercube-Labeling Algorithm, D. Bhagavathi, C. E. Grosch, S. Olariu Jan 1994

Computer Science Faculty Publications

Due to its attractive topological properties, the hypercube multiprocessor has emerged as one of the architectures of choice when it comes to implementing a large number of computational problems. In many such applications, Gray-code labelings of the hypercube are a crucial prerequisite for obtaining efficient algorithms. We propose a greedy algorithm that, given an n-dimensional hypercube H with N = 2^n nodes, returns a Gray-code labeling of H, that is, a labeling of the nodes with binary strings of length n such that two nodes are neighbors in the hypercube if, and only if, their labels differ in exactly …
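To illustrate what a labeling of this kind looks like, the sketch below labels an unlabeled hypercube by breadth-first search and then checks the labeling, assuming the condition is that neighboring labels differ in exactly one bit position (the abstract is cut off before it states this). It is a simple BFS construction under the assumption that the input graph really is an n-dimensional hypercube, and it is not the greedy algorithm proposed in the paper. The adjacency-list encoding and function names are illustrative.

```python
from collections import deque


def label_hypercube(adj):
    """Assign integer labels (read as n-bit strings) to the nodes of a hypercube.

    `adj` maps each node to the list of its neighbors. The root gets label 0,
    each root neighbor gets a distinct single bit, and every later node gets
    the bitwise OR of the labels of its neighbors one level closer to the root.
    """
    root = next(iter(adj))
    label = {root: 0}
    level = {root: 0}
    for bit, v in enumerate(adj[root]):
        label[v] = 1 << bit
        level[v] = 1
    queue = deque(adj[root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                # All nodes on level[u] are already labeled when v is discovered.
                label[v] = 0
                for w in adj[v]:
                    if level.get(w) == level[u]:
                        label[v] |= label[w]
                queue.append(v)
    return label


def is_gray_labeling(adj, label):
    """Check: labels are distinct, and nodes are adjacent iff labels differ in one bit."""
    nodes = list(adj)
    if len(set(label.values())) != len(nodes):
        return False
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            one_bit = bin(label[u] ^ label[v]).count("1") == 1
            if one_bit != (v in adj[u]):
                return False
    return True
```

The checker compares every pair of nodes, so it costs time quadratic in N and is meant only for validating small examples.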