Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Research outputs pre 2011

Series

Data Mining

Articles 1 - 3 of 3

Full-Text Articles in Physical Sciences and Mathematics

A Novel Subspace Outlier Detection Approach In High Dimensional Data Sets, Jinsong Leng Jan 2010

Research outputs pre 2011

Many real applications need to detect outliers in high dimensional data sets. The major difficulty of mining outliers lies in the fact that outliers are often embedded in subspaces. In general, no efficient methods are available for subspace-based outlier detection. Most existing subspace-based outlier detection methods identify outliers by searching for abnormal sparse density units in subspaces. In this paper, we present a novel approach for finding outliers in the ‘interesting’ subspaces. The interesting subspaces are strongly correlated with ‘good’ clusters. This approach aims to group the meaningful subspaces and then identify outliers in the projected subspaces. In doing …


A Wrapper-Based Feature Selection For Analysis Of Large Data Sets, Jinsong Leng, Craig Valli, Leisa Armstrong Jan 2010

Research outputs pre 2011

Knowledge discovery from large data sets using classic data mining techniques has proved difficult because of their large size in both dimensions and samples. In real applications, data sets often consist of many noisy, redundant, and irrelevant features, which degrade classification accuracy and increase complexity exponentially. Because of their inherent nature, analyzing the quality of data sets is difficult, and very few approaches to this issue can be found in the literature. This paper presents a novel method to investigate the quality and structure of data sets, i.e., how to analyze whether there …


Application Of A Data Mining Framework For The Identification Of Agricultural Production Areas In WA, Yunous Vagh, Leisa Armstrong, Dean Diepeveen Jan 2010

Research outputs pre 2011

This paper proposes a data mining framework for the identification of agricultural production areas in WA. The data mining (DM) framework was developed with the aim of enhancing the analysis of agricultural datasets compared to currently used statistical methods. The DM framework is a synthesis of different technologies brought together for the purpose of enhancing the interrogation of these datasets. The DM framework is based on the data, information, knowledge and wisdom continuum as a horizontal axis, with DM and online analytical processing (OLAP) forming the vertical axis. In addition, the DM framework incorporates aspects of data warehousing phases, …