Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 47

Full-Text Articles in Physical Sciences and Mathematics

Authenticating Query Results In Data Publishing, Di Ma, Robert H. Deng, Hwee Hwa Pang, Jianying Zhou Dec 2005

Research Collection School Of Computing and Information Systems

We propose a communication-efficient authentication scheme to authenticate query results disseminated by untrusted data publishing servers. In our scheme, signatures of multiple tuples in the result set are aggregated into one, so the communication overhead incurred by the signature remains constant. Next, attr-MHTs (tuple-based Merkle Hash Trees) are built to further reduce the communication overhead incurred by auxiliary authentication information (AAI). Besides communication efficiency, our scheme also supports dynamic SET operations (UNION, INTERSECTION) and dynamic JOIN with immunity to reordering attacks.
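
As a rough illustration of the Merkle Hash Tree idea mentioned in this abstract, the sketch below builds a generic binary hash tree over byte strings and checks a membership proof against the root. It is not the paper's attr-MHT construction or its signature aggregation; the function names, the use of SHA-256, and the rule of duplicating the last node on odd levels are all assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption, not taken from the paper)."""
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Pair up adjacent nodes and hash each pair into a parent node."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Return the root hash of a binary Merkle tree over the given leaves."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Return the sibling hashes needed to recompute the root from one leaf.

    Each entry is (sibling_hash, sibling_is_left).
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

In this sketch the proof plays the role of auxiliary authentication information: a client holding only a trusted root can check one tuple using a number of hashes logarithmic in the number of leaves.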


WEBARC: Website Archival Using A Structured Approach, Ee Peng Lim, Maria Marissa Dec 2005

Research Collection School Of Computing and Information Systems

Website archival refers to the task of monitoring and storing snapshots of website(s) for future retrieval and analysis. This task is particularly important for websites whose content changes over time, with older information constantly overwritten by newer information. In this paper, we propose WEBARC as a set of software tools to allow users to construct a logical structure for a website to be archived. Classifiers are trained to determine relevant web pages and their categories, and subsequently used in website downloading. The archival schedule can be specified and executed by a scheduler. A website viewer is also developed to …


A Framework To Learn Bayesian Network From Changing, Multiple-Source Biomedical Data, Li G., Tze-Yun Leong Dec 2005

Research Collection School Of Computing and Information Systems

Structure learning in Bayesian networks is a major challenge. Many efforts have been made to solve this problem and quite a few algorithms have been proposed. However, when we attempt to apply the existing methods to microarray data, there are three main challenges: 1) there are many variables in the data set, 2) the sample size is small, and 3) microarray data change from experiment to experiment and new data become available quickly. To address these three problems, we assume that the major functions of a given kind of cell do not change much across experiments, and propose a …


A Threshold-Based Algorithm For Continuous Monitoring Of K Nearest Neighbors, Kyriakos Mouratidis, Dimitris Papadias, Spiridon Bakiras, Yufei Tao Nov 2005

Research Collection School Of Computing and Information Systems

Assume a set of moving objects and a central server that monitors their positions over time, while processing continuous nearest neighbor queries from geographically distributed clients. In order to always report up-to-date results, the server could constantly obtain the most recent position of all objects. However, this naïve solution requires the transmission of a large number of rapid data streams corresponding to location updates. Intuitively, current information is necessary only for objects that may influence some query result (i.e., they may be included in the nearest neighbor set of some client). Motivated by this observation, we present a threshold-based algorithm …


Mining Ontological Knowledge From Domain-Specific Text Documents, Xing Jiang, Ah-Hwee Tan Nov 2005

Research Collection School Of Computing and Information Systems

Traditional text mining systems employ shallow parsing techniques and focus on concept extraction and taxonomic relation extraction. This paper presents a novel system called CRCTOL for mining rich semantic knowledge in the form of ontology from domain-specific text documents. By using a full text parsing technique and incorporating both statistical and lexico-syntactic methods, the knowledge extracted by our system is more concise and contains richer semantics than that of alternative systems. We conduct a case study wherein CRCTOL extracts ontological knowledge, specifically key concepts and semantic relations, from a terrorism domain text collection. Quantitative evaluation, by comparing with a state-of-the-art …


Query Processing In Spatial Databases Containing Obstacles, Jun Zhang, Dimitris Papadias, Kyriakos Mouratidis, Manli Zhu Nov 2005

Research Collection School Of Computing and Information Systems

Despite the existence of obstacles in many database applications, traditional spatial query processing assumes that points in space are directly reachable and utilizes the Euclidean distance metric. In this paper, we study spatial queries in the presence of obstacles, where the obstructed distance between two points is defined as the length of the shortest path that connects them without crossing any obstacles. We propose efficient algorithms for the most important query types, namely, range search, nearest neighbours, e-distance joins, closest pairs and distance semi-joins, assuming that both data objects and obstacles are indexed by R-trees. The effectiveness of the proposed …


Accurately Extracting Coherent Relevant Passages Using Hidden Markov Models, Jing Jiang, Chengxiang Zhai Nov 2005

Research Collection School Of Computing and Information Systems

In this paper, we present a principled method for accurately extracting coherent relevant passages of variable lengths using HMMs. We show that with appropriate parameter estimation, the HMM method outperforms a number of strong baseline methods on two data sets.


DSIM: A Distance-Based Indexing Method For Genomic Sequences, Xia Cao, Beng-Chin Ooi, Hwee Hwa Pang, Kian-Lee Tan, Anthony K. H. Tung Oct 2005

Research Collection School Of Computing and Information Systems

In this paper, we propose a Distance-based Sequence Indexing Method (DSIM) for indexing and searching genome databases. Borrowing the idea of video compression, we compress the genomic sequence database around a set of automatically selected reference words, formed from high-frequency data substrings and substrings in past queries. The compression captures the distance of each non-reference word in the database to some reference word. At runtime, a query is processed by comparing its substrings with the compressed data strings, through their distances to the reference words. We also propose an efficient scheme to incrementally update the reference words and the compressed …


Automatic 3D Face Modeling Using 2D Active Appearance Models, Jianke Zhu, Steven Hoi, Michael R. Lyu Oct 2005

Research Collection School Of Computing and Information Systems

Although many promising research findings on 3D face modeling have been reported in the past years, it is still a challenge to generate realistic 3D human face models and facial animations. This paper presents a novel approach to model 3D faces automatically from still images or video sequences without manual interaction. Our proposed scheme comprises three steps. First, we construct 3D shape models offline using Active Appearance Models (AAMs), which saves large computation costs for online modeling. Second, based on the computed 3D shape models, we propose an efficient algorithm to estimate the parameters of 3D pose and …


NIL Is Not Nothing: Recognition Of Chinese Network Informal Language Expressions, Yunqing Xia, Kam-Fai Wong, Wei Gao Oct 2005

Research Collection School Of Computing and Information Systems

Informal language is actively used in network-mediated communication, e.g. chat rooms, BBS, email and text messages. We refer to the anomalous terms used in such contexts as network informal language (NIL) expressions. For example, “偶(ou3)” is used to replace “我(wo3)” in Chinese ICQ. Without unconventional resources, knowledge and techniques, existing natural language processing approaches are less effective in dealing with NIL text. We propose to study NIL expressions with a NIL corpus and investigate techniques for processing NIL expressions. Two methods for Chinese NIL expression recognition are designed in the NILER system. The experimental results show that the pattern matching method produces …


On Organizing And Accessing Geospatial And Georeferenced Web Resources Using The G-Portal System, Zehua Liu, Ee Peng Lim, Yin-Leng Theng, Dion Hoe-Lian Goh, Wee-Keong Ng Sep 2005

Research Collection School Of Computing and Information Systems

In order to organise and manage geospatial and georeferenced information on the Web making them convenient for searching and browsing, a digital portal known as G-Portal has been designed and implemented. Compared to other digital libraries, G-Portal is unique for several of its features. It maintains metadata resources in XML with flexible resource schemas. Logical groupings of metadata resources as projects and layers are possible to allow the entire metadata collection to be partitioned differently for users with different information needs. These metadata resources can be displayed in both the classification-based and map-based interfaces provided by G-Portal. G-Portal further incorporates …


Managing Geography Learning Objects Using Personalized Project Spaces In G-Portal, Dion Hoe-Lian Goh, Aixin Sun, Wenbo Zong, Dan Wu, Ee Peng Lim, Yin-Leng Theng, John Hedberg, Chew-Hung Chang Sep 2005

Research Collection School Of Computing and Information Systems

The personalized project space is an important feature in G-Portal that supports individual and group learning activities. Within such a space, its owner can create, delete, and organize metadata referencing learning objects on the Web. Browsing and querying are among the functions provided to access the metadata. In addition, new schemas can be added to accommodate metadata of diverse attribute sets. Users can also easily share metadata across different projects using a “copy-and-paste” approach. Finally, a viewer to support offline viewing of personalized project content is also provided.


CUHK At ImageCLEF 2005: Cross-Language And Cross-Media Image Retrieval, Steven C. H. Hoi, J. Zhu, M. Lyu Sep 2005

Research Collection School Of Computing and Information Systems

In this paper, we describe our studies of cross-language and cross-media image retrieval at the ImageCLEF 2005. This is the first participation of our CUHK (The Chinese University of Hong Kong) group at ImageCLEF. The task in which we participated is the “bilingual ad hoc retrieval” task. There are three major focuses and contributions in our participation. The first is the empirical evaluation of language models and smoothing strategies for cross-language image retrieval. The second is the evaluation of cross-media image retrieval, i.e., combining text and visual contents for image retrieval. The last is the evaluation of bilingual image retrieval …


WmXML: A System For Watermarking XML Data, Xuan Zhou, Hwee Hwa Pang, Kian-Lee Tan, Dhruv Mangla Aug 2005

Research Collection School Of Computing and Information Systems

As an increasing amount of data is published in the form of XML, copyright protection of XML data is becoming an important requirement for many applications. While digital watermarking is a widely used measure to protect digital data from copyright offences, the complex and flexible construction of XML data poses a number of challenges to digital watermarking, such as re-organization and alteration attacks. To overcome these challenges, the watermarking scheme has to be based on the usability of data and the underlying semantics like key attributes and functional dependencies. In this paper, we describe WmXML, a system for watermarking XML documents. …


Medoid Queries In Large Spatial Databases, Kyriakos Mouratidis, Dimitris Papadias, Spiros Papadimitriou Aug 2005

Research Collection School Of Computing and Information Systems

Assume that a franchise plans to open k branches in a city, so that the average distance from each residential block to the closest branch is minimized. This is an instance of the k-medoids problem, where residential blocks constitute the input dataset and the k branch locations correspond to the medoids. Since the problem is NP-hard, research has focused on approximate solutions. Despite an avalanche of methods for small and moderate size datasets, currently there exists no technique applicable to very large databases. In this paper, we provide efficient algorithms that utilize an existing data-partition index to achieve low CPU …
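
To make the objective in this abstract concrete, the sketch below evaluates the average distance from each point to its closest medoid and finds the best k medoids by exhaustive search. This is only a definition-level illustration on tiny inputs, not the index-based algorithms the paper proposes; the function names and the use of Euclidean distance on coordinate tuples are assumptions.

```python
from itertools import combinations
from math import dist


def average_cost(points, medoids):
    """Average distance from each point to its closest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points) / len(points)


def brute_force_k_medoids(points, k):
    """Try every k-subset of the input as medoids and keep the cheapest.

    The number of subsets grows combinatorially, which is why practical
    methods rely on approximation; use this only on very small inputs.
    """
    return min(combinations(points, k), key=lambda ms: average_cost(points, ms))
```

For example, with residential blocks at `(0, 0)`, `(0, 1)`, `(10, 0)`, `(10, 1)` and `k = 2`, any choice of one block from each side of the city gives an average cost of 0.5.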


Constrained Shortest Path Computation, Manolis Terrovitis, Spiridon Bakiras, Dimitris Papadias, Kyriakos Mouratidis Aug 2005

Research Collection School Of Computing and Information Systems

This paper proposes and solves a-autonomy and k-stops shortest path problems in large spatial databases. Given a source s and a destination d, an a-autonomy query retrieves a sequence of data points connecting s and d, such that the distance between any two consecutive points in the path is not greater than a. A k-stops query retrieves a sequence that contains exactly k intermediate data points. In both cases our aim is to compute the shortest path subject to these constraints. Assuming that the dataset is indexed by a data-partitioning method, the proposed techniques initially compute a sub-optimal path by …


Web Mining - The Ontology Approach, Ee Peng Lim, Aixin Sun Aug 2005

Research Collection School Of Computing and Information Systems

The World Wide Web today provides users access to an extremely large number of Web sites, many of which contain information of educational and commercial value. Due to the unstructured and semi-structured nature of Web pages and the design idiosyncrasy of Web sites, it is a challenging task to develop digital libraries for organizing and managing digital content from the Web. Web mining research, in its last 10 years, has on the other hand made significant progress in categorizing and extracting content from the Web. In this paper, we represent ontology as a set of concepts and their inter-relationships relevant to …


GeogDL: A Web-Based Approach To Geography Examination, Ee Peng Lim, Dion Hoe-Lian Goh, Yin-Leng Theng Aug 2005

Research Collection School Of Computing and Information Systems

The traditional educational approach with students as passive recipients has been the subject of criticism. A constructivist learner-centered approach towards education has been argued to produce greater internalization and application of knowledge compared to the traditional teacher-centered, transmission-oriented approach. Nevertheless, contemporary instructional design models argue for the use and integration of both approaches especially in complex learning tasks. This paper describes GeogDL, a Web-based application developed above a digital library of geographical resources for Singapore students preparing to take a national examination in geography. GeogDL is unique in that it not only provides an environment for active learning, it also …


Hot Event Detection And Summarization By Graph Modeling And Matching, Yuxin Peng, Chong-Wah Ngo Jul 2005

Research Collection School Of Computing and Information Systems

This paper proposes a new approach for hot event detection and summarization of news videos. The approach is mainly based on two graph algorithms: optimal matching (OM) and normalized cut (NC). Initially, OM is employed to measure the visual similarity between all pairs of events under the one-to-one mapping constraint among video shots. Then, news events are represented as a complete weighted graph and NC is carried out to globally and optimally partition the graph into event clusters. Finally, based on the cluster size and globality of events, hot events can be automatically detected and selected as the summaries of …


An Informatization Of Society Approach To E-Government: Analyzing Singapore’s E-Government Efforts, Calvin M. L. Chan, Yi Meng Lau Jul 2005

Research Collection School Of Computing and Information Systems

Despite the much publicized benefits of e-government, many countries are experiencing difficulty in yielding success in their e-government initiatives. Studies which adopt the national e-government initiatives as the unit of analysis remain largely rare. This paper aspires to provide an analysis of Singapore’s widely acclaimed success in the e-government effort at the national level to allow other countries to learn and gain from its experience. Using the ‘Conceptual Framework for the Informatization of Society’ in facilitating the data analysis, implications are drawn to offer insights for the considerations of e-government practitioners. Theoretical implications are also derived through positing that the …


Co-Clustering Of Time-Evolving News Story With Transcript And Keyframe, Xiao Wu, Chong-Wah Ngo, Qing Li Jul 2005

Research Collection School Of Computing and Information Systems

This paper presents techniques for clustering same-topic news stories according to event themes. We model the relationship of stories with textual and visual concepts as a bipartite graph. The textual and visual concepts are extracted from speech transcripts and keyframes, respectively. A co-clustering algorithm is employed to exploit the duality of stories and textual-visual concepts based on spectral graph partitioning. Experimental results on the TRECVID-2004 corpus show that the co-clustering of news stories with textual-visual concepts is significantly better than the co-clustering with either textual or visual concepts alone.


Multibiometrics Based On Palmprint And Handgeometry, Xiao-Yong Wei, Dan Xu, Chong-Wah Ngo Jul 2005

Research Collection School Of Computing and Information Systems

This paper describes our approach to multibiometrics in a single image. First, a new method for capturing the key points of hand geometry is proposed. Then, we describe our new method of palmprint feature extraction. By using the projection transform and the wavelet transform, this method considers both the global features and local details of a palmprint texture and yields a new kind of palmprint feature. We also propose a twice segmentation method for hand geometry feature extraction. In feature matching, we analyze the weakness of the traditional Euclidean Square Norm method and introduce an improved method. The experimental results …


Social Network Discovery By Mining Spatio-Temporal Events, Hady Lauw, Ee Peng Lim, Hwee Hwa Pang, Teck-Tim Tan Jul 2005

Research Collection School Of Computing and Information Systems

Knowing patterns of relationship in a social network is very useful for law enforcement agencies to investigate collaborations among criminals, for businesses to exploit relationships to sell products, or for individuals who wish to network with others. After all, it is not just what you know, but also whom you know, that matters. However, finding out who is related to whom on a large scale is a complex problem. Asking every single individual would be impractical, given the huge number of individuals and the changing dynamics of relationships. Recent advancement in technology has allowed more data about activities of individuals …


Aggregate Nearest Neighbor Queries In Spatial Databases, Dimitris Papadias, Yufei Tao, Kyriakos Mouratidis, Chun Kit Hui Jun 2005

Research Collection School Of Computing and Information Systems

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations $q_1, \ldots, q_n$, an ANN query outputs the facility $p \in P$ that minimizes the sum of distances $|pq_i|$ for $1 \le i \le n$ that the users have to travel in order to meet there. Similarly, another ANN query may report the point $p \in P$ that minimizes the maximum distance that …
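
The two query variants described in this abstract can be written down directly as a linear scan. The sketch below is a brute-force statement of the definition, assuming points are coordinate tuples and distance is Euclidean; it does not reflect the paper's spatial-index algorithms, and the name `aggregate_nn` is invented for the example.

```python
from math import dist


def aggregate_nn(P, Q, aggregate=sum):
    """Return the point of P with the smallest aggregate distance to Q.

    `aggregate` combines the distances from a candidate point to every
    query point: `sum` gives the total travel distance, `max` gives the
    distance travelled by the farthest user.
    """
    return min(P, key=lambda p: aggregate(dist(p, q) for q in Q))
```

With facilities `P = [(0, 0), (5, 0), (10, 0)]` and users `Q = [(0, 0), (1, 0), (10, 0)]`, the sum variant picks `(0, 0)` (total 11) while the max variant picks `(5, 0)` (farthest user travels 5), which shows that the choice of aggregate function changes the answer.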


DSI: A Fully Distributed Spatial Index For Wireless Data Broadcast, Wang-Chien Lee, Baihua Zheng Jun 2005

Research Collection School Of Computing and Information Systems

The recent announcement of the MSN Direct Service has demonstrated the feasibility of, and industrial interest in, utilizing wireless broadcast for pervasive information services. To support location-based services in wireless data broadcast systems, a distributed spatial index (called DSI) is proposed in this paper. DSI is highly efficient because it has a linear yet fully distributed structure that allows multiple search paths to be naturally mixed together by sharing links. Moreover, DSI is very resilient in error-prone wireless communication environments. Search algorithms for two classical location-based queries, window queries and kNN queries, based on DSI are presented. Performance evaluation of DSI shows …


Conceptual Partitioning: An Efficient Method For Continuous Nearest Neighbor Monitoring, Kyriakos Mouratidis, Marios Hadjieleftheriou, Dimitris Papadias Jun 2005

Research Collection School Of Computing and Information Systems

Given a set of objects P and a query point q, a k nearest neighbor (k-NN) query retrieves the k objects in P that lie closest to q. Even though the problem is well-studied for static datasets, the traditional methods do not extend to highly dynamic environments where multiple continuous queries require real-time results, and both objects and queries receive frequent location updates. In this paper we propose conceptual partitioning (CPM), a comprehensive technique for the efficient monitoring of continuous NN queries. CPM achieves low running time by handling location updates only from objects that fall in the vicinity of …
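
For reference, the static k-NN query defined at the start of this abstract can be sketched as a one-shot scan. This is the baseline definition only, assuming coordinate tuples and Euclidean distance; it is not CPM, and it does none of the incremental monitoring the paper is about. The function name `knn` is hypothetical.

```python
import heapq
from math import dist


def knn(P, q, k):
    """Return the k objects of P closest to query point q, nearest first."""
    return heapq.nsmallest(k, P, key=lambda p: dist(p, q))
```

Re-running such a scan for every query after every location update is the cost that a continuous monitoring method aims to avoid.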


A Semi-Supervised Active Learning Framework For Image Retrieval, Steven Hoi, Michael R. Lyu Jun 2005

Research Collection School Of Computing and Information Systems

Although recent studies have shown that unlabeled data are beneficial to boosting the image retrieval performance, very few approaches for image retrieval can learn with labeled and unlabeled data effectively. This paper proposes a novel semi-supervised active learning framework comprising a fusion of semi-supervised learning and support vector machines. We provide theoretical analysis of the active learning framework and present a simple yet effective active learning algorithm for image retrieval. Experiments are conducted on real-world color images to compare with traditional methods. The promising experimental results show that our proposed scheme significantly outperforms the previous approaches.


On Assigning Place Names To Geography Related Web Pages, Wenbo Zong, Dan Wu, Aixin Sun, Ee Peng Lim, Dion Hoe-Lian Goh Jun 2005

Research Collection School Of Computing and Information Systems

In this paper, we attempt to give spatial semantics to web pages by assigning them place names. The entire assignment task is divided into three sub-problems, namely place name extraction, place name disambiguation and place name assignment. We propose our approaches to address these sub-problems. In particular, we have modified GATE, a well-known named entity extraction software, to perform place name extraction using a US Census gazetteer. A rule-based place name disambiguation method and a place name assignment method capable of assigning place names to web page segments have also been proposed. We have evaluated our proposed disambiguation and assignment …


Evaluating G-Portal For Geography Learning And Teaching, Chew-Hung Chang, John G. Hedberg, Yin-Leng Theng, Ee Peng Lim, Tiong-Sa Teh, Dion Hoe-Lian Goh Jun 2005

Research Collection School Of Computing and Information Systems

This paper describes G-Portal, a geospatial digital library of geographical assets, providing an interactive platform to engage students in active manipulation and analysis of information resources and collaborative learning activities. Using a G-Portal application in which students conducted a field study of an environmental problem of beach erosion and sea level rise, we describe a pilot study to evaluate usefulness and usability issues to support the learning of geographical concepts, and in turn teaching.


Verifying Completeness Of Relational Query Results In Data Publishing, Hwee Hwa Pang, Arpit Jain, Krithi Ramamritham, Kian-Lee Tan Jun 2005

Research Collection School Of Computing and Information Systems

In data publishing, the owner delegates the role of satisfying user queries to a third-party publisher. As the publisher may be untrusted or susceptible to attacks, it could produce incorrect query results. In this paper, we introduce a scheme for users to verify that their query results are complete (i.e., no qualifying tuples are omitted) and authentic (i.e., all the result values originated from the owner). The scheme supports range selection on key and non-key attributes, project as well as join queries on relational databases. Moreover, the proposed scheme complies with access control policies, is computationally secure, and can be …