Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 29 of 29

Full-Text Articles in Physical Sciences and Mathematics

Non-Monotonic Generation Of Knowledge Paths For Context Understanding, Pei-Chi Lo, Ee-Peng Lim Mar 2024

Research Collection School Of Computing and Information Systems

Knowledge graphs can be used to enhance text search and access by augmenting textual content with relevant background knowledge. While many large knowledge graphs are available, using them to make semantic connections between entities mentioned in the textual content remains a difficult task. In this work, we therefore introduce contextual path generation (CPG), the task of generating knowledge paths, called contextual paths, that explain the semantic connections between entities mentioned in textual documents using a given knowledge graph. To perform the CPG task well, one has to address its three challenges, namely path relevance, incomplete knowledge graph, and …


Active Discovering New Slots For Task-Oriented Conversation, Yuxia Wu, Tianhao Dai, Zhedong Zheng, Lizi Liao Jan 2024

Research Collection School Of Computing and Information Systems

Existing task-oriented conversational systems rely heavily on domain ontologies with pre-defined slots and candidate values. In practical settings, these prerequisites are hard to meet, due to emerging new user requirements and ever-changing scenarios. To mitigate these issues for better interaction performance, there are efforts working towards detecting out-of-vocabulary values or discovering new slots under unsupervised or semi-supervised learning paradigms. However, overemphasizing the conversation data patterns alone leads these methods to yield noisy and arbitrary slot results. To facilitate the pragmatic utility, real-world systems tend to provide a stringent amount of human labeling quota, which offers an authoritative way …


Multi-Representation Variational Autoencoder Via Iterative Latent Attention And Implicit Differentiation, Nhu Thuat Tran, Hady Wirawan Lauw Oct 2023

Research Collection School Of Computing and Information Systems

Variational Autoencoder (VAE) offers a non-linear probabilistic modeling of users' preferences. While it has achieved remarkable performance at collaborative filtering, it typically samples a single vector to represent a user's preferences, which may be insufficient to capture the user's diverse interests. Existing solutions extend VAE to model multiple interests of users by resorting to a variant of the self-attentive method, i.e., employing prototypes to group items into clusters, each capturing one topic of the user's interests. Despite showing improvements, the current design could be more effective since prototypes are randomly initialized and shared across users, resulting in uninformative and non-personalized clusters. To fill the gap, …


ArduinoProg: Towards Automating Arduino Programming, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang Sep 2023

Research Collection School Of Computing and Information Systems

Writing code for Arduino poses unique challenges. A developer must 1) have hardware-specific knowledge about the interface configuration between the Arduino controller and the I/O hardware, 2) identify a suitable driver library for the I/O hardware, and 3) follow certain usage patterns of the driver library in order to use it properly. In this work, based on a study of real-world user queries posted in the Arduino forum, we propose ArduinoProg to address such challenges. ArduinoProg consists of three components, i.e., Library Retriever, Configuration Classifier, and Pattern Generator. Given a query, Library Retriever retrieves library names relevant to the I/O hardware identified …


Automating Arduino Programming: From Hardware Setups To Sample Source Code Generation, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang May 2023

Research Collection School Of Computing and Information Systems

An embedded system is a system consisting of software code, controller hardware, and I/O (Input/Output) hardware that performs a specific task. Developing an embedded system presents several challenges. First, the development often involves configuring hardware that requires domain-specific knowledge. Second, the library for the hardware may have API usage patterns that must be followed. To overcome such challenges, we propose a framework called ArduinoProg towards the automatic generation of Arduino applications. ArduinoProg takes a natural language query as input and outputs the configuration and API usage pattern for the hardware described in the query. Motivated by our findings on the …


Contextual Path Retrieval: A Contextual Entity Relation Embedding-Based Approach, Pei-Chi Lo, Ee-Peng Lim Jan 2023

Research Collection School Of Computing and Information Systems

Contextual path retrieval (CPR) refers to the task of finding contextual path(s) between a pair of entities in a knowledge graph that explains the connection between them in a given context. For this novel retrieval task, we propose the Embedding-based Contextual Path Retrieval (ECPR) framework. ECPR is based on a three-component structure that includes a context encoder and path encoder that encode query context and path, respectively, and a path ranker that assigns a ranking score to each candidate path to determine the one that should be the contextual path. For context encoding, we propose two novel context encoding methods, …
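The retrieval task described in this abstract presupposes a set of candidate paths between two entities. A minimal sketch of that enumeration step is given below, under stated assumptions: the knowledge graph is a list of (subject, relation, object) triples, edges are followed in one direction only, and the function name `candidate_paths` and the `max_hops` limit are hypothetical. It does not implement the ECPR context encoder, path encoder, or path ranker; it only produces the paths that such a ranker would score.

```python
from collections import defaultdict


def candidate_paths(triples, head, tail, max_hops=3):
    """Enumerate simple paths from `head` to `tail` with at most max_hops edges.

    triples: iterable of (subject, relation, object); edges are followed
    in the subject -> object direction only (an assumption of this sketch).
    """
    out_edges = defaultdict(list)
    for s, r, o in triples:
        out_edges[s].append((r, o))

    paths = []

    def extend(node, path, visited):
        # A non-empty path that reaches the tail is a complete candidate.
        if node == tail and path:
            paths.append(list(path))
            return
        # Stop expanding once the hop budget is used up.
        if len(path) == max_hops:
            return
        for rel, nxt in out_edges[node]:
            if nxt in visited:
                continue  # keep paths simple (no repeated entities)
            path.append((node, rel, nxt))
            extend(nxt, path, visited | {nxt})
            path.pop()

    extend(head, [], {head})
    return paths


# Toy graph, purely illustrative.
triples = [
    ("Turing", "born_in", "London"),
    ("London", "capital_of", "UK"),
    ("Turing", "citizen_of", "UK"),
    ("UK", "member_of", "NATO"),
]
for p in candidate_paths(triples, "Turing", "UK", max_hops=2):
    print(p)
```

A ranker in the spirit of the abstract would then assign a score to each returned path given the query context and keep the highest-scoring one.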


Legion: Massively Composing Rankers For Improved Bug Localization At Adobe, Darryl Jarman, Jeffrey Berry, Riley Smith, Ferdian Thung, David Lo Aug 2022

Research Collection School Of Computing and Information Systems

Studies have estimated that, in industrial settings, developers spend between 30 and 90 percent of their time fixing bugs. As such, tools that assist in identifying the location of bugs provide value by reducing debugging costs. One such tool is BugLocator. This study initially aimed to determine if developers working on the Adobe Analytics product could use BugLocator. The initial results show that BugLocator achieves a similar accuracy on five of seven Adobe Analytics repositories and on open-source projects. However, these results do not meet the minimum applicability requirement deemed necessary by Adobe Analytics developers prior to possible adoption. Thus, …


Digbug: Pre/Post-Processing Operator Selection For Accurate Bug Localization, Kisub Kim, Sankalp Ghatpande, Kui Liu, Anil Koyuncu, Dongsun Kim, Tegawendé F. Bissyande, Jacques Klein, Yves Le Traon Jul 2022

Research Collection School Of Computing and Information Systems

Bug localization is a recurrent maintenance task in software development. It aims at identifying relevant code locations (e.g., code files) that must be inspected to fix bugs. When such bugs are reported by users, the localization process often becomes overwhelming as it is mostly a manual task due to incomplete and informal information (written in natural languages) available in bug reports. The research community has then invested in automated approaches, notably using Information Retrieval techniques. Unfortunately, reported performance in the literature is still limited for practical usage. Our key observation, after empirically investigating a large dataset of bug reports as …


Structure-Aware Visualization Retrieval, Haotian Li, Yong Wang, Aoyu Wu, Huan Wei, Huamin Qu May 2022

Research Collection School Of Computing and Information Systems

With the wide usage of data visualizations, a huge number of Scalable Vector Graphic (SVG)-based visualizations have been created and shared online. Accordingly, there has been an increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since it can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsically existing in SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationship among visual elements, and characterize visualizations thoroughly from a new perspective. …


CodeMatcher: Searching Code Based On Sequential Semantics Of Important Query Words, Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li Jan 2022

Research Collection School Of Computing and Information Systems

To accelerate software development, developers frequently search and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers proposed many information retrieval (IR)-based models for code search, but they fail to bridge the semantic gap between query and code. An early successful deep learning (DL)-based model DeepCS solved this issue by learning the relationship between pairs of code methods and corresponding natural language descriptions. Two major advantages of DeepCS are the capability of understanding irrelevant/noisy keywords and capturing sequential relationships between words in query and code. In this article, we propose an IR-based model CodeMatcher that …


Self-Supervised Contrastive Learning For Code Retrieval And Summarization Via Semantic-Preserving Transformations, Duy Quoc Nghi Bui, Yijun Yu, Lingxiao Jiang Jul 2021

Research Collection School Of Computing and Information Systems

We propose Corder, a self-supervised contrastive learning framework for source code models. Corder is designed to alleviate the need for labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used in two ways: (1) it can produce vector representations of code which can be applied to code retrieval tasks that do not have labeled data; (2) it can be used in a fine-tuning process for tasks that might still require labeled data such as code summarization. The key innovation is that we train the source code model by asking it to recognize similar …


Are The Code Snippets What We Are Searching For? A Benchmark And An Empirical Study On Code Search With Natural-Language Queries, Shuhan Yan, Hang Yu, Yuting Chen, Beijun Shen Feb 2020

Research Collection School Of Computing and Information Systems

Code search methods, especially those that allow programmers to raise queries in a natural language, play an important role in software development. They help to improve programmers' productivity by returning sample code snippets from the Internet and/or source-code repositories for their natural-language queries. Meanwhile, there are many code search methods in the literature that support natural-language queries. Difficulties exist in recognizing the strengths and weaknesses of each method and choosing the right one for different usage scenarios, because (1) the implementations of those methods and the datasets for evaluating them are usually not publicly available, and (2) some methods leverage …


API Recommendation For Event-Driven Android Application Development, Weizhao Yuan, Huu Hoang Nguyen, Lingxiao Jiang, Yuting Chen, Jianjun Zhao, Haibo Yu Mar 2019

Research Collection School Of Computing and Information Systems

Context: Software development is increasingly dependent on existing libraries. Developers need help to find suitable library APIs. Although many studies have been proposed to recommend relevant functional APIs that can be invoked for implementing a functionality, few studies have paid attention to an orthogonal need associated with event-driven programming frameworks, such as the Android framework. In addition to invoking functional APIs, Android developers need to know where to place functional code according to various events that may be triggered within the framework. Objective: This paper aims to develop an API recommendation engine for Android application development that can recommend both (1) …


Will This Localization Tool Be Effective For This Bug? Mitigating The Impact Of Unreliability Of Information Retrieval Based Bug Localization Tools, Tien-Duy B. Le, Ferdian Thung, David Lo Aug 2017

Research Collection School Of Computing and Information Systems

Information retrieval (IR) based bug localization approaches process a textual bug report and a collection of source code files to find buggy files. They output a ranked list of files sorted by their likelihood to contain the bug. Recently, several IR-based bug localization tools have been proposed. However, there are no perfect tools that can successfully localize faults within a small number of most suspicious program elements for every single input bug report. Therefore, it is difficult for developers to decide which tool would be effective for a given bug report. Furthermore, for some bug reports, no bug localization tools …


Android Repository Mining For Detecting Publicly Accessible Functions Missing Permission Checks, Huu Hoang Nguyen, Lingxiao Jiang, Thanh Tho Quan May 2017

Research Collection School Of Computing and Information Systems

Android has become the most popular mobile operating system. Millions of applications, including much malware, have been developed for it. Even though its overall system architecture and many APIs are documented, many other methods and implementation details are not, not to mention potential bugs and vulnerabilities that may be exploited. Manual documentation may also be easily outdated as Android evolves constantly with changing features and higher complexities. Techniques and tool support are thus needed to automatically extract information from different versions of Android to facilitate whole-system analysis of undocumented code. This paper presents an approach for alleviating the challenges associated …


On The Effectiveness Of Virtualization Based Memory Isolation On Multicore Platforms, Siqi Zhao, Xuhua Ding Apr 2017

Research Collection School Of Computing and Information Systems

Virtualization based memory isolation has been widely used as a security primitive in many security systems. This paper firstly provides an in-depth analysis of its effectiveness in the multicore setting; a first in the literature. Our study reveals that memory isolation by itself is inadequate for security. Due to the fundamental design choices in hardware, it faces several challenging issues including page table maintenance, address mapping validation and thread identification. As demonstrated by our attacks implemented on XMHF and BitVisor, these issues undermine the security of memory isolation. Next, we propose a new isolation approach that is immune to the aforementioned problems. In our design, the hypervisor constructs a fully isolated micro …


A Cooperative Coevolution Framework For Parallel Learning To Rank, Shuaiqiang Wang, Yun Wu, Byron J. Gao, Ke Wang, Hady W. Lauw, Jun Ma Dec 2015

Research Collection School Of Computing and Information Systems

We propose CCRank, the first parallel framework for learning to rank based on evolutionary algorithms (EA), aiming to significantly improve learning efficiency while maintaining accuracy. CCRank is based on cooperative coevolution (CC), a divide-and-conquer framework that has demonstrated high promise in function optimization for problems with large search space and complex structures. Moreover, CC naturally allows parallelization of sub-solutions to the decomposed sub-problems, which can substantially boost learning efficiency. With CCRank, we investigate parallel CC in the context of learning to rank. We implement CCRank with three EA-based learning to rank algorithms for demonstration. Extensive experiments on benchmark datasets in …


Compositional Vector Space Models For Improved Bug Localization, Shaowei Wang, David Lo, Julia Lawall Oct 2014

Research Collection School Of Computing and Information Systems

Software developers and maintainers often need to locate code units responsible for a particular bug. A number of Information Retrieval (IR) techniques have been proposed to map natural language bug descriptions to the associated code units. The vector space model (VSM) with the standard tf-idf weighting scheme (VSMnatural), has been shown to outperform nine other state-of-the-art IR techniques. However, there are multiple VSM variants with different weighting schemes, and their relative performance differs for different software systems. Based on this observation, we propose to compose various VSM variants, modelling their composition as an optimization problem. We propose a genetic algorithm …


Query-Document-Dependent Fusion: A Case Study Of Multimodal Music Retrieval, Zhonghua Li, Bingjun Zhang, Yi Yu, Jialie Shen, Ye Wang Dec 2013

Research Collection School Of Computing and Information Systems

In recent years, multimodal fusion has emerged as a promising technology for effective multimedia retrieval. Developing the optimal fusion strategy for different modalities (e.g., content, metadata) has been the subject of intensive research. Given a query, existing methods derive a unified fusion strategy for all documents with the underlying assumption that the relative significance of a modality remains the same across all documents. However, this assumption is often invalid. We thus propose a general multimodal fusion framework, query-document-dependent fusion (QDDF), which derives the optimal fusion strategy for each query-document pair via intelligent content analysis of both queries and documents. By …


K-Partite Graph Reinforcement And Its Application In Multimedia Information Retrieval, Yue Gao, Meng Wang, Rongrong Ji, Zheng-Jun Zha, Jialie Shen Jul 2012

Research Collection School Of Computing and Information Systems

In many example-based information retrieval tasks, the example query actually contains multiple sub-queries. For example, in 3D object retrieval, the query is an object described by multiple views. In content-based video retrieval, the query is a video clip that contains multiple frames. Without prior knowledge, the most intuitive approach is to treat the sub-queries equally. In this paper, we propose a k-partite graph reinforcement approach to fuse these sub-queries based on the to-be-retrieved database. The approach first collects the top retrieved results. These results are regarded as pseudo-relevant samples and then a k-partite graph reinforcement is performed on these …


Where Should The Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based On Bug Reports, Jian Zhou, Hongyu Zhang, David Lo Jun 2012

Research Collection School Of Computing and Information Systems

For a large and evolving software system, the project team could receive a large number of bug reports. Locating the source code files that need to be changed in order to fix the bugs is a challenging task. Once a bug report is received, it is desirable to automatically point out the files that developers should change in order to fix the bug. In this paper, we propose BugLocator, an information retrieval based method for locating the relevant files for fixing a bug. BugLocator ranks all files based on the textual similarity between the initial bug report and the …
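The idea of ranking files by textual similarity to a bug report can be illustrated with a plain tf-idf vector space model and cosine similarity. The sketch below is an assumption-laden illustration, not BugLocator itself: the abstract is truncated and does not specify the weighting scheme, so the log-scaled term frequency, the smoothed idf, and the names `tokenize` and `rank_files` are all hypothetical choices.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_files(bug_report, files):
    """Return file names sorted by tf-idf cosine similarity to the report.

    files: dict mapping file name -> file text.
    """
    docs = {name: Counter(tokenize(body)) for name, body in files.items()}
    n = len(docs)
    # Document frequency: in how many files each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)

    def idf(term):
        # Smoothed idf, so report terms absent from every file are still defined.
        return math.log((1 + n) / (1 + df[term])) + 1.0

    def vector(counts):
        # Log-scaled term frequency times idf (one common variant, assumed here).
        return {t: (1 + math.log(c)) * idf(t) for t, c in counts.items()}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
            sum(w * w for w in v.values())
        )
        return dot / norm if norm else 0.0

    query = vector(Counter(tokenize(bug_report)))
    scores = {name: cosine(query, vector(counts)) for name, counts in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy corpus, purely illustrative.
files = {
    "parser.py": "parse token stream syntax error",
    "net.py": "socket connect timeout retry",
    "ui.py": "button click render window",
}
print(rank_files("crash with syntax error while parsing token", files))
```

In practice the file texts would be preprocessed (identifier splitting, stop-word removal, stemming) before weighting; those steps are omitted to keep the sketch short.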


Concern Localization Using Information Retrieval: An Empirical Study On Linux Kernel, Shaowei Wang, David Lo, Zhenchang Xing, Lingxiao Jiang Oct 2011

Research Collection School Of Computing and Information Systems

Many software maintenance activities need to find code units (functions, files, etc.) that implement a certain concern (features, bugs, etc.). To facilitate such activities, many approaches have been proposed to automatically link code units with concerns described in natural languages, which are termed as concern localization and often employ Information Retrieval (IR) techniques. There has not been a study that evaluates and compares the effectiveness of latest IR techniques on a large dataset. This study fills this gap by investigating ten IR techniques, some of which are new and have not been used for concern localization, on a Linux kernel …


Parallel Learning To Rank For Information Retrieval, Shuaiqiang Wang, Byron J. Gao, Ke Wang, Hady W. Lauw Jul 2011

Research Collection School Of Computing and Information Systems

Learning to rank represents a category of effective ranking methods for information retrieval. While the primary concern of existing research has been accuracy, learning efficiency is becoming an important issue due to the unprecedented availability of large-scale training data and the need for continuous update of ranking functions. In this paper, we investigate parallel learning to rank, targeting simultaneous improvement in accuracy and efficiency.


A New Hardware-Assisted PIR With O(N) Shuffle Cost, Xuhua Ding, Yanjiang Yang, Robert H. Deng, Shuhong Wang Aug 2010

Research Collection School Of Computing and Information Systems

Since the concept of private information retrieval (PIR) was first formalized by Chor et al., various constructions have been proposed with a common goal of reducing communication complexity. Unfortunately, none of them is suitable for practical settings, mainly due to the prohibitively high cost of either communications or computations. The boom of the Internet and its applications, especially the recent trend in outsourcing databases, fuels the research on practical PIR schemes. In this paper, we propose a hardware-assisted PIR scheme with a novel shuffle algorithm. Our PIR construction entails O(n) offline computation cost, and constant online operations and O(log n) …


Continuous Nearest Neighbor Monitoring In Road Networks, Kyriakos Mouratidis, Man Lung Yiu, Dimitris Papadias, Nikos Mamoulis Sep 2006

Research Collection School Of Computing and Information Systems

Recent research has focused on continuous monitoring of nearest neighbors (NN) in highly dynamic scenarios, where the queries and the data objects move frequently and arbitrarily. All existing methods, however, assume the Euclidean distance metric. In this paper we study k-NN monitoring in road networks, where the distance between a query and a data object is determined by the length of the shortest path connecting them. We propose two methods that can handle arbitrary object and query moving patterns, as well as fluctuations of edge weights. The first one maintains the query results by processing only updates that may invalidate …
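The network distance used in this abstract (length of the shortest path between query and object) can be illustrated with a one-shot k-NN search based on Dijkstra expansion from the query. This is only a sketch under simplifying assumptions: data objects sit on graph nodes rather than along edges, the graph is a static snapshot, and the function name `network_knn` is hypothetical. It does not implement the paper's two monitoring methods, which maintain results incrementally as objects, queries, and edge weights change.

```python
import heapq


def network_knn(graph, query, objects, k):
    """Return up to k (distance, object_node) pairs nearest to `query`.

    graph:   dict node -> list of (neighbor, edge_weight), weights >= 0
    objects: set of nodes that hold a data object (assumption of this sketch)
    """
    dist = {query: 0.0}
    heap = [(0.0, query)]
    settled = set()
    result = []
    # Expand nodes in increasing network distance; stop once k objects are found.
    while heap and len(result) < k:
        d, node = heapq.heappop(heap)
        if node in settled:
            continue  # stale heap entry
        settled.add(node)
        if node in objects:
            result.append((d, node))
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return result


# Toy undirected road network, purely illustrative.
graph = {
    "a": [("b", 1.0), ("d", 5.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 1.0)],
    "d": [("a", 5.0), ("c", 1.0)],
}
print(network_knn(graph, "a", {"c", "d"}, k=2))
```

Because nodes are settled in order of increasing distance, the search can stop as soon as k objects have been settled; re-running it after every update would be the naive baseline that incremental monitoring aims to avoid.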


Learning Distance Metrics With Contextual Constraints For Image Retrieval, Steven C. H. Hoi, Wei Liu, Michael R. Lyu, Wei-Ying Ma Jun 2006

Research Collection School Of Computing and Information Systems

Relevant Component Analysis (RCA) has been proposed for learning distance metrics with contextual constraints for image retrieval. However, RCA has two important disadvantages. One is the lack of exploiting negative constraints which can also be informative, and the other is its incapability of capturing complex nonlinear relationships between data instances with the contextual information. In this paper, we propose two algorithms to overcome these two disadvantages, i.e., Discriminative Component Analysis (DCA) and Kernel DCA. Compared with other complicated methods for distance metric learning, our algorithms are rather simple to understand and very easy to solve. We evaluate the performance of …


FISA: Feature-Based Instance Selection For Imbalanced Text Classification, Aixin Sun, Ee Peng Lim, Boualem Benatallah, Mahbub Hassan Apr 2006

Research Collection School Of Computing and Information Systems

Support Vector Machine (SVM) classifiers are widely used in text classification tasks, and these tasks often involve imbalanced training. In this paper, we specifically address the cases where negative training documents significantly outnumber the positive ones. A generic algorithm known as FISA (Feature-based Instance Selection Algorithm) is proposed to select only a subset of negative training documents for training an SVM classifier. With a smaller, carefully selected training set, an SVM classifier can be more efficiently trained while delivering comparable or better classification accuracy. In our experiments on the 20-Newsgroups dataset, using only 35% negative training examples and 60% learning …


Integrating User Feedback Log Into Relevance Feedback By Coupled SVM For Content-Based Image Retrieval, Steven C. H. Hoi, Michael R. Lyu, Rong Jin Apr 2005

Research Collection School Of Computing and Information Systems

Relevance feedback has been shown to be an important tool to boost the retrieval performance in content-based image retrieval. In the past decade, various algorithms have been proposed to formulate relevance feedback in content-based image retrieval. Traditional relevance feedback techniques mainly carry out the learning tasks by focusing on low-level visual features of image content with little consideration of log information of user feedback. However, from a long-term learning perspective, the user feedback log is one of the most important resources to bridge the semantic gap problem in image retrieval. In this paper we propose a novel technique to integrate the log …


On Integrating Existing Bibliographic Databases And Structured Databases, Ying Lu, Ee Peng Lim Aug 1996

Research Collection School Of Computing and Information Systems

It is widely accepted that future digital library applications have to be built upon different kinds of database servers to draw different forms of data from them. These data include bibliographic data, text data, multimedia data, and structured data. We address the problem of integrating existing bibliographic and structured databases which reside at different locations in the network. To integrate bibliographic data and structured data, we extended the well-known SQL model to represent bibliographic related attributes and queries. In particular, we have added a new data type to model attributes in the bibliographic database. We have also designed specialized predicates …