Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

University of Wollongong

Data

Faculty of Informatics - Papers (Archive)

Articles 1 - 23 of 23

Full-Text Articles in Physical Sciences and Mathematics

A Novel Approach To Data Deduplication Over The Engineering-Oriented Cloud Systems, Zhe Sun, Jun Shen, Jianming Young Jan 2013

Faculty of Informatics - Papers (Archive)

This paper presents a duplication-less storage system for engineering-oriented cloud computing platforms. Our deduplication storage system, which manages data and duplication over the cloud system, consists of two major components: a front-end deduplication application and a back-end mass storage system. The Hadoop Distributed File System (HDFS) is a common distributed file system on the cloud and is used with the Hadoop database (HBase). We use HDFS to build a mass storage system and employ HBase to build a fast indexing system. With a deduplication application, a scalable and parallel deduplicated cloud storage system can be effectively built. …
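As an illustration of the split between a fast index and a bulk store described in this abstract, here is a minimal sketch of content-addressed deduplication. It is not the authors' implementation: the `DedupStore` class, the use of SHA-256, and the in-memory dictionaries standing in for HBase and HDFS are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        # Hypothetical stand-in for the HBase index: file name -> content digest.
        self.index = {}
        # Hypothetical stand-in for HDFS storage: content digest -> bytes.
        self.blobs = {}

    def put(self, name: str, data: bytes) -> bool:
        """Register data under name; return True only if new bytes were stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # The index always records the name, even when the content is a duplicate.
        self.index[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        """Look up the digest in the index, then fetch the bytes from the store."""
        return self.blobs[self.index[name]]


store = DedupStore()
first = store.put("design_v1.dwg", b"engineering drawing bytes")
second = store.put("design_copy.dwg", b"engineering drawing bytes")
```

Both names resolve to the same stored bytes, so the second `put` adds an index entry but no new blob.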


Privacy Enhanced Data Outsourcing In The Cloud, Miao Zhou, Yi Mu, Willy Susilo, Jun Yan, Liju Dong Jan 2012

Faculty of Informatics - Papers (Archive)

Securing outsourced data in cloud computing is a challenging problem, since a cloud environment cannot be considered trusted. The situation becomes even more challenging when outsourced data sources in a cloud environment are managed by multiple outsourcers who hold different access rights. In this paper, we introduce an efficient and novel tree-based key management scheme that allows a data source to be accessed by multiple parties who hold different rights. We ensure that the database remains secure, while some selected data sources can be securely shared with other authorized parties.
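The abstract does not give the details of the scheme, so the following is only a generic sketch of how a key tree can work: each child key is derived from its parent with HMAC-SHA256, so a party holding a subtree key can derive everything below it but nothing above it. The function names, the labels, and the placeholder root key are assumptions for illustration, not the paper's construction.

```python
import hashlib
import hmac
from typing import Sequence


def derive_child_key(parent_key: bytes, label: str) -> bytes:
    """Derive a child key from a parent key and a node label (one-way)."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


def derive_path_key(root_key: bytes, path: Sequence[str]) -> bytes:
    """Walk down the tree from the root, deriving one key per level."""
    key = root_key
    for label in path:
        key = derive_child_key(key, label)
    return key


# Placeholder root key for the example only; a real key must be random and secret.
root = b"\x00" * 32
records_key = derive_path_key(root, ["db", "records"])
row_key = derive_path_key(root, ["db", "records", "row-17"])

# A party given only records_key can still reach the row below it.
row_key_from_subtree = derive_child_key(records_key, "row-17")
```

Sharing `records_key` therefore grants access to that subtree only, which is one way different parties can hold different rights over the same data source.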


Power Line Enhancement For Data Monitoring Of Neural Electrical Activity In The Human Body, Ahmed M. Haidar, Sridhathan C, Abdulsalam Hazza, Ahmed Saleh Jan 2011

Faculty of Informatics - Papers (Archive)

Distance and real-time data monitoring are necessary conditions for keeping any system in good working order. Recent advancements in micro-electronics and wireless technology enable the application of wireless sensors in both industrial and wild environments. However, long-distance wireless communication has several drawbacks, such as limited bandwidth, considerable cost and unstable connection quality. Therefore, Power Line Communication (PLC) using pre-established Power Lines (PL) becomes more attractive as a high data transmission technology. This paper reviews the existing distance data monitoring systems and presents a case study of data transfer for temperature and heartbeat measurement. The simulations were carried out on the …


Fixed Rank Filtering For Spatio-Temporal Data, Noel Cressie, Tao Shi, Emily L. Kang Jan 2010

Faculty of Informatics - Papers (Archive)

Datasets from remote-sensing platforms and sensor networks are often spatial, temporal, and very large. Processing massive amounts of data to provide current estimates of the (hidden) state from current and past data is challenging, even for the Kalman filter. A large number of spatial locations observed through time can quickly lead to an overwhelmingly high-dimensional statistical model. Dimension reduction without sacrificing complexity is our goal in this article. We demonstrate how a Spatio-Temporal Random Effects (STRE) component of a statistical model reduces the problem to one of fixed dimension with a very fast statistical solution, a methodology we call Fixed …


Data Security And Information Privacy For PDA Accessible Clinical-Log For Medical Education In Problem-Based Learning (PBL) Approach, Rattiporn Luanrattana, Khin Than Win, John A. Fulcher Jan 2010

Faculty of Informatics - Papers (Archive)

Data security and information privacy are important aspects to consider when mobile technology is used to record clinical experiences and encounters in medical education. Objective: This study aims to address the qualitative findings on appropriate data security and information privacy for a PDA accessible clinical-log in the problem-based learning (PBL) approach in medical education. Method: Semi-structured interviews were conducted with medical faculty members, honorary clinical academics and medical education technology specialists. Results: A data security and information access plan was determined for managing clinical-log data. The results directed the guideline for the future development and implementation of clinical-log …


Estimating Shared Copy Number Aberrations For Array CGH Data: The Linear-Median Method, Yan-Xia Lin, Veera Baladandayuthapani, V Bonato, K.-A. Do Jan 2010

Faculty of Informatics - Papers (Archive)

Motivation: Existing methods for estimating copy number variations in array comparative genomic hybridization (aCGH) data are limited to estimations of the gain/loss of chromosome regions for single sample analysis. We propose the linear-median method for estimating shared copy numbers in DNA sequences across multiple samples, demonstrate its operating characteristics through simulations and applications to real cancer data, and compare it to two existing methods.

Results: Our proposed linear-median method has the power to estimate common changes that appear at isolated single probe positions or very short regions. Such changes are hard to detect by current methods. This new …


Explicit Connections Between Longitudinal Data Analysis And Kernel Machines, N D. Pearce, M. P. Wand Jan 2009

Faculty of Informatics - Papers (Archive)

Two areas of research - longitudinal data analysis and kernel machines - have large, but mostly distinct, literatures. This article shows explicitly that both fields have much in common with each other. In particular, many popular longitudinal data fitting procedures are special types of kernel machines. These connections have the potential to provide fruitful cross-fertilization between longitudinal data analytic and kernel machine methodology.


Clustering, Classification And Explanatory Rules From Harmonic Monitoring Data, Ali Asheibi, David A. Stirling, Danny Sutanto, D A. Robinson Jan 2009

Faculty of Informatics - Papers (Archive)

A method based on the successful AutoClass (Cheeseman & Stutz, 1996) and Snob (Wallace & Dowe, 1994; Baxter & Wallace, 1996) research programs has been chosen for our research work on harmonic classification. The method utilizes mixture models (McLachlan, 1992) as a representation of the formulated clusters. This research is principally based on the formation of such mixture models (typically based on Gaussian distributions) through a Minimum Message Length (MML) encoding scheme (Wallace & Boulton, 1968). During the formation of such mixture models the various derivative tools (algorithms) allow for the automated selection of the number of clusters and …


Learning Pattern Classification Tasks With Imbalanced Data Sets, Son Lam Phung, Abdesselam Bouzerdoum, Giang Hoang Nguyen Jan 2009

Faculty of Informatics - Papers (Archive)

This chapter is concerned with the class imbalance problem, which has been recognised as a crucial problem in machine learning and data mining. The problem occurs when there are significantly fewer training instances of one class compared to another class.
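To make the imbalance concrete, the sketch below computes inverse-frequency class weights, one common remedy in which errors on the rarer class are weighted more heavily during training. The abstract does not say which techniques the chapter uses, so this is an illustrative assumption rather than the authors' method; the function name and the 90/10 label split are made up for the example.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class so that each class has equal total weight.

    weight(c) = n_samples / (n_classes * count(c))
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical training labels: 90 majority-class and 10 minority-class instances.
labels = ["neg"] * 90 + ["pos"] * 10
weights = inverse_frequency_weights(labels)
total_weight = sum(weights[label] for label in labels)
```

With these weights each class contributes the same total weight, and the sum over all instances still equals the number of samples.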


Data Mining Of MISR Aerosol Product Using Spatial Statistics, Tao Shi, Noel A. Cressie Jan 2007

Faculty of Informatics - Papers (Archive)

In climate models, aerosol forcing is the major source of uncertainty in climate forcing over the industrial period. To reduce this uncertainty, instruments on satellites have been put in place to collect global data. However, missing and noisy observations impose considerable difficulties for scientists researching global aerosol distribution, aerosol transportation, and comparisons between satellite observations and global-climate-model outputs. In this paper, we propose a Spatial Mixed Effects (SME) statistical model to predict the missing values, denoise the observed values, and quantify the spatial-prediction uncertainties. The computations associated with the SME model are linearly scalable to the number of data points, …


The Benefits & Concerns Of Public Data Availability In Australia: A Survey Of Security Experts, Roba Abbas Jan 2007

Faculty of Informatics - Papers (Archive)

This paper gauges the attitudes of security experts in Australia with regard to public data availability on critical infrastructure protection (CIP). A qualitative survey was distributed to individuals considered experts in CIP-related research in Australia, in order to address the censorship versus open access debate concerning public data. The intention of the study was to gain an insight into the perceived benefits and threats of public data availability by security experts, and to provide the basis for a security solution to be utilised by the Australian Government sector (at all levels). The findings, however, can also be applied to …


A Discussion About The Importance Of Laws And Policies For Data Sharing For Public Health In The People's Republic Of China, Xiue Fan, Ping Yu Jan 2007

Faculty of Informatics - Papers (Archive)

This paper introduces the current status of data sharing in the People's Republic of China. It discusses barriers to data sharing and proposes three key solutions to overcome these barriers in China. The establishment of national laws and policies for data sharing is considered the key prerequisite to ensuring the successful implementation of resource sharing activities in public health. Driven by established laws and policies, the relevant operational models should be developed. It is also important to have strategies in place to ensure the established laws and policies are implemented by various organizations in different jurisdictions. These discussions are supported …


Application Of Semistructured Data Model To The Implementation Of Semantic Content-Based Video Retrieval System, Lilac A. E. Al-Safadi, Janusz R. Getta Jan 2007

Faculty of Informatics - Papers (Archive)

Semantic indexing of a video document is a process that identifies elementary and complex semantic units in the indexed document in order to create a semantic index, defined as a mapping of semantic units into sequences of video frames. A semantic content-based video retrieval system is a software system that uses a semantic index built over a collection of video documents to retrieve the sequences of video frames that satisfy the given conditions. This work introduces a new multilevel view of data for semantic content-based video retrieval systems. At the topmost level, we define an abstract …


Secure Data Transmission Using Qubits, Iman Marvian, Saied Hosseini-Khayat Jan 2006

Faculty of Informatics - Papers (Archive)

A quantum protocol for secure transmission of data using qubits is presented. This protocol sends one qubit in a round-trip to transmit one bit of data. The protocol offers an improvement over the BB84 QKD protocol. BB84, in conjunction with one-time pad encryption, has been shown to be unconditionally secure. However, its security relies on the assumption that the qubit source device does not emit multiple replicas of the same qubit for each transmitted bit. If this happens, a multi-qubit emission attack can be launched. In addition, BB84 cannot be used to send predetermined bit strings as it generates a …


WFMS-Based Data Integration For E-Learning, Jianming Yong, Jun Yan, Xiaodi Huang Jan 2006

Faculty of Informatics - Papers (Archive)

As more and more organisations and institutions move towards an e-learning strategy, more and more disparate data are distributed across different e-learning systems. How to use this vast amount of distributed data effectively becomes a big challenge. This paper addresses this challenge and works out a new mechanism to implement data integration for e-learning. A workflow management system based (WFMS-based) data integration model is proposed for e-learning.


The Risk Of Public Data Availability On Critical Infrastructure Protection, Roba Abbas Jan 2006

Faculty of Informatics - Papers (Archive)

This paper examines the threat of freely available information on critical infrastructure protection (CIP) efforts. Critical infrastructure are the services required to maintain the stability and security of a country, and comprise both physical and cyber infrastructures. These interdependent entities must be protected from natural disasters, accidental errors, and deliberate attacks. The CIP process typically includes vulnerability assessment, risk assessment and risk management, and has been a global concern for many years; the concern is now amplified in Australia due to a number of recent events such as the 9/11 attacks and the Bali bombings. The events have called into question the …


Analyzing Harmonic Monitoring Data Using Data Mining, Ali Asheibi, David A. Stirling, Danny Sutanto Jan 2006

Faculty of Informatics - Papers (Archive)

Harmonic monitoring has become an important tool for harmonic management in distribution systems. A comprehensive harmonic monitoring program has been designed and implemented on a typical electrical MV distribution system in Australia. The monitoring program involved measurements of the three-phase harmonic currents and voltages from the residential, commercial and industrial load sectors. Data over a three-year period has been downloaded and is available for analysis. The large amount of acquired data makes it difficult to identify operational events that impact significantly on the harmonics generated on the system. More sophisticated analysis methods are required to automatically determine which part of …


On The Design Of Early Generation Variety Trials With Correlated Data, Brian R. Cullis, A B. Smith, N. E. Coombes Jan 2006

Faculty of Informatics - Papers (Archive)

This article considers the design of early generation variety trials with a prespecified spatial correlation structure and introduces a new class of partially replicated designs called p-rep designs in which the plots of standard varieties are replaced by additional plots of test lines. We show how efficient p-rep designs can be readily generated using the modified Reactive TABU search algorithm. The expected and realized genetic gain of p-rep and grid plot designs is compared in a simulation study.


A Data-Fitting Approach For Displacements And Vibration Measurement Using Self-Mixing Interferometers, Yi Zhang, Jiangtao Xi, Joe F. Chicharo, Yanguang Yu Jan 2005

Faculty of Informatics - Papers (Archive)

This paper presents a signal processing approach for vibration measurement using a self-mixing interferometer (SMI). Compared to existing approaches, the proposed approach is able to achieve an accuracy of λ/40, which significantly exceeds the accuracy limit associated with conventional simple SMI systems (λ/4).


Power Quality Data Analysis Using Unsupervised Data Mining, Ali Asheibi, David A. Stirling, Sarath Perera, D A. Robinson Jan 2004

Faculty of Informatics - Papers (Archive)

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Data mining is a process that uses a variety of data analysis tools to identify hidden patterns and relationships within large samples of data. This paper presents several data mining tools and techniques that are applicable to power quality data analysis to enable efficient reporting of disturbance indices and identify network problems through pattern recognition. This paper also presents results of data mining techniques applied …


Data Management For Large Scale Power Quality Surveys, Murray-Luke Peard, Sean T. Elphick, Victor W. Smith, Victor J. Gosbell, D A. Robinson Jan 2004

Faculty of Informatics - Papers (Archive)

For large scale power quality surveys, the management of the large amount of data generated is a major issue. This paper presents solutions to three main areas of data management, viz. a data interchange format, database design and data processing. Consideration of these issues has come about as a result of the Long Term National Power Quality Survey currently being conducted by the University of Wollongong, and reference is made to that specific application for illustrative purposes.


Fast, Resolution-Consistent Spatial Prediction Of Global Processes From Satellite Data, Hsin-Cheng Huang, Noel A. Cressie, John Gabrosek Jan 2002

Faculty of Informatics - Papers (Archive)

Polar orbiting satellites remotely sense the earth and its atmosphere, producing datasets that give daily global coverage. For any given day, the data are many and measured at spatially irregular locations. Our goal in this article is to predict values that are spatially regular at different resolutions; such values are often used as input to general circulation models (GCMs) and the like. Not only do we wish to predict optimally, but because data acquisition is relentless, our algorithm must also process the data very rapidly. This article applies a multiresolution autoregressive tree-structured model, and presents a new statistical prediction methodology …


Bayesian Hierarchical Analysis Of Minefield Data, Noel A. Cressie, Andrew B. Lawson Jan 1998

Faculty of Informatics - Papers (Archive)

Based on remote sensing of a potential minefield, point locations are identified, some of which may not be mines. The mines and mine-like objects are to be distinguished based on their point patterns, although it must be emphasized that all we see is the superposition of their locations. In this paper, we construct a hierarchical spatial point-process model that accounts for the different patterns of mines and mine-like objects and uses posterior analysis to distinguish between them. Our Bayesian approach is applied to COBRA image data obtained from the NSWC Coastal Systems Station, Dahlgren Division, Panama City, Florida. …