Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 14 of 14

Full-Text Articles in Databases and Information Systems

A Deep Search Architecture For Capturing Product Ontologies, Tejeshwar Sangameswaran Dec 2014

Graduate Theses and Dissertations

This thesis describes a method to populate very large product ontologies quickly. We discuss a deep search architecture to text-mine online e-commerce marketplaces and build a taxonomy of products and their corresponding descriptions and parent categories. The goal is to automatically construct an open database of products, which are aggregated from different online retailers. The database contains extensive metadata on each object, which can be queried and analyzed. Such a public database does not currently exist; instead, the information resides siloed within various organizations. In this thesis, we describe the tools, data structures and software architectures that allowed …


Click-Through-Based Subspace Learning For Image Search, Yingwei Pan, Ting Yao, Xinmei Tian, Houqiang Li, Chong-Wah Ngo Nov 2014

Research Collection School Of Computing and Information Systems

One of the fundamental problems in image search is to rank image documents according to a given textual query. We address two limitations of existing image search engines in this paper. First, there is no straightforward way of comparing textual keywords with visual image content. Image search engines therefore depend heavily on the surrounding texts, which are often noisy or too few to accurately describe the image content. Second, ranking functions are trained on query-image pairs labeled by human labelers, which makes the annotation intellectually expensive and therefore impossible to scale up. We demonstrate that the above two fundamental challenges …


Ultimate Codes: Near-Optimal MDS Array Codes For RAID-6, Zhijie Huang, Hong Jiang, Chong Wang, Ke Zhou, Yuhong Zhao Jul 2014

CSE Technical Reports

As modern storage systems have grown in size and complexity, RAID-6 is poised to replace RAID-5 as the dominant form of RAID architecture due to its ability to protect against double disk failures. Many excellent erasure codes specially designed for RAID-6 have emerged in recent years. However, all of them have limitations. In this paper, we present a class of near-perfect erasure codes for RAID-6, called the Ultimate codes. These codes encode, update and decode either optimally or nearly optimally, regardless of the code length. This implies that, by utilizing these codes, we can build highly efficient and …


Querie: Collaborative Database Exploration, Magdalini Eirinaki, Suju Abraham, Neoklis Polyzotis, Naushin Shaikh Jul 2014

Magdalini Eirinaki

Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user’s querying behavior and finds matching patterns in the system’s query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these “similar” users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework, where …


Influences Of Influential Users: An Empirical Study Of Music Social Network, Jing Ren, Zhiyong Cheng, Jialie Shen, Feida Zhu Jul 2014

Research Collection School Of Computing and Information Systems

Influential users can play a crucial role in online social networks. This paper documents an empirical study that explores the effects of influential users in the context of a music social network. To achieve this goal, a music diffusion graph is developed to model how information propagates over the network. We also propose a heuristic method to measure users' influence. Using real data from Last.fm, our empirical test demonstrates key effects of influential users and reveals limitations of existing influence identification/characterization schemes.


Determination Of Optimal Spatial Databases For The Area Of Poland To The Calculation Of Air Pollutant Dispersion Using The CALMET/CALPUFF Model, Robert Oleniacz, Mateusz Rzeszutek Jun 2014

Robert Oleniacz

The paper presents a methodology for the preparation of three-dimensional spatial data and land use data for the purpose of modeling pollutant dispersion in the ambient air using a group of geophysical preprocessors of the CALMET/CALPUFF modeling system and GIS software. Some spatial data sources available for Poland were specified, and their characteristics and availability were discussed. Particular attention was paid to the SRTM3 and GTOPO30 elevation data as well as the CLC2006 and GLCC land use data, for the preparation of computational grids of different resolutions. Groups of programs which can be used in order to form …


Direct Neighbor Search, Jilian Zhang, Kyriakos Mouratidis, Hwee Hwa Pang Jun 2014

Kyriakos MOURATIDIS

In this paper we study a novel query type, called direct neighbor query. Two objects in a dataset are direct neighbors (DNs) if a window selection may exclusively retrieve these two objects. Given a source object, a DN search computes all of its direct neighbors in the dataset. The DNs define a new type of affinity that differs from existing formulations (e.g., nearest neighbors, nearest surrounders, reverse nearest neighbors, etc.) and finds application in domains where user interests are expressed in the form of windows, i.e., multi-attribute range selections. Drawing on key properties of the DN relationship, we develop an …
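The direct neighbor definition above can be illustrated with a brute-force check. This is only a sketch under stated assumptions, not the algorithm developed in the paper: objects are taken to be distinct points, a window is a closed axis-aligned rectangle, and the function names are hypothetical. Under those assumptions, two points are direct neighbors exactly when their minimum bounding rectangle contains no other point, since any window retrieving both points must cover that rectangle.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def are_direct_neighbors(a: Point, b: Point, data: Sequence[Point]) -> bool:
    """Return True if some window retrieves exactly a and b (assumes distinct points)."""
    # Minimum bounding rectangle of a and b, one (low, high) pair per attribute.
    lows = tuple(min(x, y) for x, y in zip(a, b))
    highs = tuple(max(x, y) for x, y in zip(a, b))
    for p in data:
        if p == a or p == b:
            continue
        # Any third point inside the rectangle would be retrieved by every
        # window that covers both a and b.
        if all(lo <= v <= hi for lo, v, hi in zip(lows, p, highs)):
            return False
    return True


def direct_neighbors(source: Point, data: Sequence[Point]) -> List[Point]:
    """Brute-force DN search: test every other object against the source."""
    return [p for p in data if p != source and are_direct_neighbors(source, p, data)]


points = [(0, 0), (1, 1), (2, 2), (0, 3)]
result = direct_neighbors((0, 0), points)
```

Here `(2, 2)` is excluded for the source `(0, 0)` because `(1, 1)` lies inside their bounding rectangle. The check costs quadratic time per source object, which is the kind of cost a dedicated DN algorithm would aim to avoid.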


Self-Organizing Neural Networks Integrating Domain Knowledge And Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Jacek M. Zurada Jun 2014

Research Collection School Of Computing and Information Systems

The use of domain knowledge in learning systems is expected to improve learning efficiency and reduce model complexity. However, due to its incompatibility with the knowledge structure of the learning systems and the real-time exploratory nature of reinforcement learning (RL), domain knowledge cannot be inserted directly. In this paper, we show how self-organizing neural networks designed for online and incremental adaptation can integrate domain knowledge and RL. Specifically, symbol-based domain knowledge is translated into numeric patterns before being inserted into the self-organizing neural networks. To ensure effective use of domain knowledge, we present an analysis of how the inserted knowledge is used by …


Reliability Guided Resource Allocation For Large-Scale Supercomputing Systems, Shruti Umamaheshwaran Apr 2014

Open Access Theses

In high performance computing systems, parallel applications request a large number of resources for long time periods. In this scenario, if a resource fails during the application runtime, it would cause all applications using this resource to fail. The probability of application failure is tied to the inherent reliability of resources used by the application. Our investigation of high performance computing systems operating in the field has revealed a significant difference in the measured operational reliability of individual computing nodes. By adding awareness of the individual system nodes' reliability to the scheduler along with the predicted reliability needs of parallel …
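The link between node reliability and application failure can be made concrete with a small worked example. This is a sketch under assumptions that are not taken from the thesis: node failures are treated as independent, each node has a known probability of surviving the job's runtime, and the function names and greedy selection rule are hypothetical. Because one failed node fails the whole job, the job's survival probability is the product of its nodes' survival probabilities.

```python
import math
from typing import Dict, List


def job_success_probability(node_reliabilities: List[float]) -> float:
    """Probability that every node survives, assuming independent failures."""
    return math.prod(node_reliabilities)


def pick_most_reliable(nodes: Dict[str, float], count: int) -> List[str]:
    """Hypothetical greedy choice: take the `count` nodes with the highest reliability."""
    ranked = sorted(nodes, key=lambda name: nodes[name], reverse=True)
    return ranked[:count]


cluster = {"n1": 0.99, "n2": 0.90, "n3": 0.95}
chosen = pick_most_reliable(cluster, 2)
p_chosen = job_success_probability([cluster[n] for n in chosen])
p_worst = job_success_probability([cluster["n2"], cluster["n3"]])
```

With these illustrative numbers, choosing `n1` and `n3` gives a survival probability of about 0.94, compared with about 0.86 for `n2` and `n3`, which shows why differences between individual nodes matter to a scheduler.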


A Farm Management Information System With Task-Specific, Collaborative Mobile Apps And Cloud Storage Services, Jonathan Tyler Welte Apr 2014

Open Access Theses

Modern production agriculture is beginning to advance beyond deterministic, scheduled operations between relatively few people to larger-scale, information-driven efficiency in order to respond to the challenges of field variability and meet the needs of a growing population. Since no two farms are the same with respect to information and management structure, a specialized farm management information system (FMIS) that is tailored to the realities on the ground of individual farms is likely to be more effective than the generalized FMIS available today.

This thesis presents the design of an FMIS using proven user-centered design principles. This approach resulted in the …


Assessing The Impact Of Electronic Health Record Systems Implementation On Hospital Patient Perceptions Of Care, Katherine Sofia Palacio Salgar Apr 2014

Engineering Management & Systems Engineering Theses & Dissertations

The delivery of health care services has been impacted by advances in Knowledge Management Information Systems (KMIS) and Information Technology (IT). The literature reveals that Electronic Health Records Systems (EHRs) are a comprehensive KMIS. There is wide recognition in the body of knowledge of the potential of EHRs to transform all aspects of health care services and, in consequence, the performance of Health Care Delivery Organizations (HCDO). Authors of published research also agree that there is a need for more empirical contributions that demonstrate the impact of EHRs upon HCDO. It is argued that in most cases, studies …


Disaster Data Management In Cloud Environments, Katarina Grolinger Jan 2014

Katarina Grolinger

Facilitating decision-making in a vital discipline such as disaster management requires information gathering, sharing, and integration on a global scale and across governments, industries, communities, and academia. A large quantity of immensely heterogeneous disaster-related data is available; however, current data management solutions offer few or no integration capabilities and limited potential for collaboration. Moreover, recent advances in cloud computing, Big Data, and NoSQL have opened the door for new solutions in disaster data management. In this thesis, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM) with the objectives of 1) facilitating information gathering …


Data Management In Cloud Environments: NoSQL And NewSQL Data Stores, Katarina Grolinger, Wilson A. Higashino, Abhinav Tiwari, Miriam Am Capretz Jan 2014

Katarina Grolinger

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the …


Coupling Graphs, Efficient Algorithms And B-Cell Epitope Prediction, Liang Zhao, Steven C. H. Hoi, Zhenhua Li, Limsoon Wong, Hung Nguyen Jan 2014

Research Collection School Of Computing and Information Systems

Coupling graphs are newly introduced in this paper to meet many application needs, particularly in the field of bioinformatics. A coupling graph is a two-layer graph complex, in which each node from one layer of the graph complex has at least one connection with the nodes in the other layer, and vice versa. The coupling graph model is sufficiently powerful to capture strong and inherent associations between subgraph pairs in complicated applications. The focus of this paper is on algorithms for mining frequent coupling subgraphs and on their bioinformatics application. Although existing frequent subgraph mining algorithms are competent to identify frequent subgraphs …
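The cross-layer condition in the coupling graph definition above can be checked with a few lines of code. This is a minimal sketch of that single condition only, using a hypothetical representation (two node sets plus a list of cross-layer edges); edges inside each layer are ignored, and the paper's formal definition may impose further requirements.

```python
from typing import Hashable, Iterable, Set, Tuple


def is_coupling_graph(
    layer_a: Set[Hashable],
    layer_b: Set[Hashable],
    cross_edges: Iterable[Tuple[Hashable, Hashable]],
) -> bool:
    """Check that every node has at least one edge to the other layer.

    Each cross edge is a pair (node_in_layer_a, node_in_layer_b).
    """
    linked_a = set()
    linked_b = set()
    for u, v in cross_edges:
        # Reject edges that do not actually connect the two layers.
        if u not in layer_a or v not in layer_b:
            return False
        linked_a.add(u)
        linked_b.add(v)
    # Both layers must be fully covered by cross-layer connections.
    return linked_a == set(layer_a) and linked_b == set(layer_b)


coupled = is_coupling_graph({"a1", "a2"}, {"b1"}, [("a1", "b1"), ("a2", "b1")])
uncoupled = is_coupling_graph({"a1", "a2"}, {"b1"}, [("a1", "b1")])
```

In the second call, node `a2` has no connection to the other layer, so the structure does not satisfy the condition.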