Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 20 of 20

Full-Text Articles in Physical Sciences and Mathematics

XStamps: A Multiversion Timestamps Concurrency Control Protocol For XML Data, Khin-Myo Win, Wee-Keong Ng, Ee Peng Lim Dec 2003

Research Collection School Of Computing and Information Systems

With the tremendous growth of XML data over the Web, efficient management of such data has become a new challenge for the database community. Several data management solutions proposed in recent years extend the capability of traditional database systems to meet the needs of XML data, while alternative approaches introduce a new generation of databases known as native XML database management systems. Although traditional databases have mature transaction management and concurrency control techniques, there is still a need to tailor techniques for native XML databases in order to deal with the distinct characteristics of XML. In this paper, we propose XStamps, a multiversion timestamps concurrency …


Paper For An Educational Digital Library, Dion Hoe-Lian Goh, Yin-Leng Theng, Ming Yin, Ee Peng Lim Dec 2003

Research Collection School Of Computing and Information Systems

GeogDL is a digital library of geography examination resources designed to assist students in revising for a national geography examination in Singapore. As part of an iterative design process, we carried out participatory design and brainstorming with student and teacher design partners. The first study involved prospective student design partners. In response to the first study, we describe in this paper an implementation of PAPER – Personalised Adaptive Pathways for Exam Resources – a new bundle of personalized, interactive services containing a mock exam and a personal coach. The mock exam provides a simulation of the actual geography examination while …


Apparatus For Discovering Computing Services Architecture And Developing Patterns Of Computing Services And Method Therefor [Sg 107499], Emarson Victoria, Hui Tseng, Hwee Hwa Pang, Tau Chen Cham, Siew Choo Tay Nov 2003

Research Collection School Of Computing and Information Systems

An apparatus for discovering computing services architecture and developing patterns of computing services and method therefor are disclosed. The apparatus, according to an embodiment of the invention, provides a graphical user interface for displaying a deployment plan of deployed computing services. Components in the deployment plan are interconnected by links indicating dependency relationships between the components. Each component and link is assigned a confidence value, which is based on a calculated weight of the properties of each component. The apparatus further provides editing tools for manipulating the components in the deployment plan as well as for creating and managing patterns.


Web Unit Mining: Finding And Classifying Subgraphs Of Web Pages, Aixin Sun, Ee Peng Lim Nov 2003

Research Collection School Of Computing and Information Systems

In web classification, most researchers assume that the objects to classify are individual web pages from one or more web sites. In practice, the assumption is too restrictive since a web page itself may not always correspond to a concept instance of some semantic concept (or category) given to the classification task. In this paper, we want to relax this assumption and allow a concept instance to be represented by a subgraph of web pages or a set of web pages. We identify several new issues to be addressed when the assumption is removed, and formulate the web unit mining …


SSM: Fast Construction Of The Optimized Segment Support Map, Kok-Leong Ong, Wee-Keong Ng, Ee Peng Lim Sep 2003

Research Collection School Of Computing and Information Systems

Computing the frequency of a pattern is one of the key operations in data mining algorithms. Recently, the Optimized Segment Support Map (OSSM) was introduced as a simple but powerful way of speeding up any form of frequency counting satisfying the monotonicity condition. However, the construction cost to obtain the ideal OSSM is high, which makes it less attractive in practice. In this paper, we propose the FSSM, a novel algorithm that constructs the OSSM quickly using an FP-Tree. Given a user-defined segment size, the FSSM is able to construct the OSSM in a fraction of the time required by …


On Mining Group Patterns Of Mobile Users, Yida Wang, Ee Peng Lim, San-Yih Hwang Sep 2003

Research Collection School Of Computing and Information Systems

In this paper, we present a group pattern mining approach to derive the grouping information of mobile device users based on the spatio-temporal distances among them. Group patterns of users are determined by a distance threshold and a minimum duration. To discover group patterns, we propose the AGP and VG-growth algorithms that are derived from the Apriori and FP-growth algorithms respectively. We further evaluate the efficiencies of these two algorithms using synthetically generated user movement data.
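The definition above (a distance threshold plus a minimum duration) can be illustrated with a brute-force check. This is only a minimal sketch under stated assumptions: positions are sampled at the same discrete time points for every user, distance is Euclidean, and the names `close_at`, `valid_segments`, `max_dis` and `min_dur` are hypothetical. It is not the AGP or VG-growth algorithm from the paper, which search for such groups efficiently.

```python
from itertools import combinations
from math import dist

def close_at(positions, group, t, max_dis):
    """True if every pair of users in `group` is within max_dis at time t."""
    return all(dist(positions[u][t], positions[v][t]) <= max_dis
               for u, v in combinations(group, 2))

def valid_segments(positions, group, max_dis, min_dur):
    """Return inclusive (start, end) time ranges spanning at least min_dur
    time points during which the whole group stays together."""
    n = min(len(positions[u]) for u in group)
    segments, start = [], None
    for t in range(n):
        if close_at(positions, group, t, max_dis):
            if start is None:
                start = t          # a new candidate segment begins
        else:
            if start is not None and t - start >= min_dur:
                segments.append((start, t - 1))
            start = None           # the group broke apart at time t
    if start is not None and n - start >= min_dur:
        segments.append((start, n - 1))
    return segments

# Hypothetical movement data: one (x, y) position per user per time point.
positions = {
    "a": [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)],
    "b": [(0, 1), (1, 1), (2, 1), (9, 9), (10, 1)],
}
print(valid_segments(positions, ("a", "b"), max_dis=2.0, min_dur=3))
```

With this data the two users stay within the threshold for time points 0 to 2, separate at time point 3, and meet again only briefly at time point 4, so only the first stretch satisfies a minimum duration of 3.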


Towards A Role-Based Metadata Scheme For Educational Digital Libraries: A Case Study In Singapore, Dian Melati Md Ismail, Ming Yin, Yin-Leng Theng, Dion Hoe-Lian Goh, Ee Peng Lim Aug 2003

Research Collection School Of Computing and Information Systems

In this paper, we describe the development of an appropriate metadata scheme for GeogDL, a Web-based digital library application containing past-year examination resources for students taking a Singapore national examination in geography. The new metadata scheme was developed from established metadata schemes on education and e-learning. Initial evaluation showed that a role-based approach would be more viable, adapting to the different roles of teachers/educators and librarians contributing geography resources to GeogDL. The paper concludes with a concrete implementation of the role-based metadata schema for GeogDL.


Adaptive Filters For Continuous Queries Over Distributed Data Stream, Chris Olston, Jing Jiang, Jennifer Widom Jun 2003

Research Collection School Of Computing and Information Systems

We consider an environment where distributed data sources continuously stream updates to a centralized processor that monitors continuous queries over the distributed data. Significant communication overhead is incurred in the presence of rapid update streams, and we propose a new technique for reducing the overhead. Users register continuous queries with precision requirements at the central stream processor, which installs filters at remote data sources. The filters adapt to changing conditions to minimize stream rates while guaranteeing that all continuous queries still receive the updates necessary to provide answers of adequate precision at all times. Our approach enables applications to trade …
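The idea of a filter installed at a remote source can be sketched as follows. This is a simplified illustration, not the technique from the paper: it assumes a single numeric stream and a fixed bound width, and the names `SourceFilter`, `offer` and `run` are hypothetical. The adaptive part, where filters respond to changing conditions, is deliberately left out.

```python
class SourceFilter:
    """Suppresses updates that stay within a bound around the last sent value."""

    def __init__(self, width):
        self.width = width      # total width of the bound (assumed fixed here)
        self.center = None      # last value reported to the central processor

    def offer(self, value):
        """Return the value if it must be sent upstream, else None."""
        if self.center is None or abs(value - self.center) > self.width / 2:
            self.center = value  # re-center the bound on the new value
            return value
        return None              # still inside the bound: nothing is sent

def run(stream, width):
    """Return the subset of `stream` that the filter would transmit."""
    f = SourceFilter(width)
    return [v for v in stream if f.offer(v) is not None]

print(run([10, 10.4, 10.9, 11.2, 11.3, 9.0], width=2))
```

In this sketch the central processor always knows each source value to within half the bound width, so a wider bound means fewer transmitted updates at the cost of less precise answers.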


Ladar-Based Detection And Tracking Of Moving Objects From A Ground Vehicle At High Speeds, Chieh-Chih Wang, Charles Thorpe, Arne Suppe Jun 2003

Research Collection School Of Computing and Information Systems

Detection and tracking of moving objects (DATMO) in crowded urban areas from a ground vehicle at high speeds is difficult because of a wide variety of targets and uncertain pose estimation from odometry and GPS/DGPS. In this paper we present a solution of the simultaneous localization and mapping (SLAM) with DATMO problem to accomplish this task using ladar sensors and odometry. With a precise pose estimate and a surrounding map from SLAM, moving objects are detected without a priori knowledge of the targets. The interacting multiple model (IMM) estimation algorithm is used for modeling the motion of a moving object …


Using Support Vector Machines For Terrorism Information Extraction, Aixin Sun, Myo-Myo Naing, Ee Peng Lim, Wai Lam Jun 2003

Research Collection School Of Computing and Information Systems

Information extraction (IE) is of great importance in many applications including web intelligence, search engines, text understanding, etc. To extract information from text documents, most IE systems rely on a set of extraction patterns. Each extraction pattern is defined based on the syntactic and/or semantic constraints on the positions of desired entities within natural language sentences. The IE systems also provide a set of pattern templates that determines the kind of syntactic and semantic constraints to be considered. In this paper, we argue that such pattern templates restrict the kind of extraction patterns that can be learned by IE systems. …


On Machine Learning Methods For Chinese Document Classification, Ji He, Ah-Hwee Tan, Chew-Lim Tan May 2003

Research Collection School Of Computing and Information Systems

This paper reports our comparative evaluation of three machine learning methods, namely k Nearest Neighbor (kNN), Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM), for Chinese document categorization. Based on two Chinese corpora, a series of controlled experiments evaluated their learning capabilities and efficiency in mining text classification knowledge. Benchmark experiments showed that their predictive performance was roughly comparable, especially on clean and well organized data sets. While kNN and ARAM yielded better performance than SVM on small and clean data sets, SVM and ARAM significantly outperformed kNN on noisy data. Comparing efficiency, kNN was notably more costly …


Guest Editorial: Text And Web Mining, Ah-Hwee Tan, Philip S. Yu May 2003

Research Collection School Of Computing and Information Systems

Text mining and web mining are two interrelated fields that have received a lot of attention in recent years. Text mining [1, 2] is concerned with the analysis of very large document collections and the extraction of hidden knowledge from text-based data. Web mining [3] refers to the analysis and mining of all web-related data, including web content, hyperlink structure, and web access statistics.


On Querying Geospatial And Georeferenced Metadata Resources In G-Portal, Zehua Liu, Ee Peng Lim, Wee-Keong Ng, Dion Hoe-Lian Goh May 2003

Research Collection School Of Computing and Information Systems

G-Portal is a web portal system providing a range of digital library services to access geospatial and georeferenced resources on the Web. Among them are the storage and query subsystems that provide a central repository of metadata resources organized under different projects. In G-Portal, all metadata resources are represented in XML (Extensible Markup Language) and they are compliant with resource schemas defined by their creators. The resource schemas are extended versions of a basic resource schema, making it easy to accommodate all kinds of metadata resources while maintaining the portability of resource data. To support queries over the geospatial …


Efficient Native XML Storage System (ENAXS), Khin-Myo Win, Wee-Keong Ng, Ee Peng Lim Apr 2003

Research Collection School Of Computing and Information Systems

XML is a self-describing meta-language and is fast emerging as a dominant standard for Web data exchange among various applications. With the tremendous growth of XML documents, an efficient storage system is required to manage them. Conventional databases, which require all data to adhere to an explicitly specified rigid schema, are unable to provide efficient storage for tree-structured XML documents. A new data model that is specifically designed for XML documents is required. In this paper, we propose a new storage system, named Efficient Native XML Storage System (ENAXS), for large and complex XML documents. ENAXS stores all XML …


Hierarchical Text Classification Methods And Their Specification, Ee Peng Lim, Aixin Sun, Wee-Keong Ng Mar 2003

Research Collection School Of Computing and Information Systems

Hierarchical text classification refers to assigning text documents to the categories in a given category tree based on their content. With a large number of categories organized as a tree, hierarchical text classification helps users to find information more quickly and accurately. Nevertheless, hierarchical text classification methods in the past have often been constructed in a proprietary manner. The construction steps often involve human effort and are not completely automated. In this chapter, we therefore propose a specification language known as HCL (Hierarchical Classification Language). HCL is designed to describe a hierarchical classification method including the definition of a category tree …


StegFS: A Steganographic File System, Hwee Hwa Pang, Kian-Lee Tan, Xuan Zhou Mar 2003

Research Collection School Of Computing and Information Systems

While user access control and encryption can protect valuable data from passive observers, those techniques leave visible ciphertexts that are likely to alert an active adversary to the existence of the data, who can then compel an authorized user to disclose it. This paper introduces StegFS, a steganographic file system that aims to overcome that weakness by offering plausible deniability to owners of protected files. StegFS securely hides user-selected files in a file system so that, without the corresponding access keys, an attacker would not be able to deduce their existence, even if the attacker is thoroughly familiar with the …


Instance Based Attribute Identification In Database Integration, Ee Peng Lim, Cecil Chua, Roger Hsiang-Li Chiang Jan 2003

Research Collection School Of Computing and Information Systems

Most research on attribute identification in database integration has focused on integrating attributes using schema and summary information derived from the attribute values. No research has attempted to fully explore the use of attribute values to perform attribute identification. We propose an attribute identification method that employs schema and summary instance information as well as properties of attributes derived from their instances. Unlike other attribute identification methods that match only single attributes, our method matches attribute groups for integration. Because our attribute identification method fully explores data instances, it can identify corresponding attributes to be integrated even when schema information …


On Quantitative Evaluation Of Clustering Systems, Ji He, Ah-Hwee Tan, Chew-Lim Tan, Sam-Yuan Sung Jan 2003

Research Collection School Of Computing and Information Systems

Clustering refers to the task of partitioning unlabelled data into meaningful groups (clusters). It is a useful approach in data mining processes for identifying hidden patterns and revealing underlying knowledge from large data collections. The application areas of clustering include, to name a few, image segmentation, information retrieval, document classification, association rule mining, web usage tracking, and transaction analysis.


Performance Measurement Framework For Hierarchical Text Classification, Ee Peng Lim, Aixin Sun, Wee-Keong Ng Jan 2003

Research Collection School Of Computing and Information Systems

Hierarchical text classification or simply hierarchical classification refers to assigning a document to one or more suitable categories from a hierarchical category space. In our literature survey, we have found that the existing hierarchical classification experiments used a variety of measures to evaluate performance. These performance measures often assume independence between categories and do not consider documents misclassified into categories that are similar or not far from the correct categories in the category tree. In this paper, we therefore propose new performance measures for hierarchical classification. The proposed performance measures consist of category similarity measures and distance-based measures that consider …


Advances In Mobile Commerce Technologies, Ee Peng Lim, Keng Siau Jan 2003

Research Collection School Of Computing and Information Systems

No abstract provided.