Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 4 of 4

Full-Text Articles in Computer Sciences

Deep Air Learning: Interpolation, Prediction, And Feature Analysis Of Fine-Grained Air Quality, Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li, Zhongfei Mark Zhang Dec 2018

Research Collection School Of Computing and Information Systems

The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing. The solutions to these topics can provide extremely useful information to support air pollution control, and consequently generate great societal and technical impacts. Most of the existing work solves the three problems separately with different models. In this paper, we propose a general and effective approach to solve the three problems in one model called Deep Air Learning (DAL). The main idea of DAL lies in embedding feature selection and semi-supervised learning in different layers of the deep …


AutoSpearman: Automatically Mitigating Correlated Software Metrics For Interpreting Defect Models, Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Christoph Treude Nov 2018

Research Collection School Of Computing and Information Systems

The interpretation of defect models relies heavily on the software metrics that are used to construct them. However, such software metrics are often correlated in defect models. Prior work often uses feature selection techniques to remove correlated metrics in order to improve the performance of defect models. Yet, the interpretation of defect models may be misleading if feature selection techniques produce subsets of inconsistent and correlated metrics. In this paper, we investigate the consistency and correlation of the subsets of metrics that are produced by nine commonly used feature selection techniques. Through a case study of 13 publicly available defect datasets, we find …


Artefact: An R Implementation Of The AutoSpearman Function, Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, Christoph Treude Nov 2018

Research Collection School Of Computing and Information Systems

This artefact is the implementation of AutoSpearman, an automated metric selection approach based on correlation analyses. The goal of AutoSpearman is to automatically mitigate correlated metrics prior to constructing analytical models. This artefact is implemented as an R package and is available in the GitHub repository. We provide descriptions and R code snippets for the installation of AutoSpearman and usage examples.
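
To illustrate the general idea of correlation-based metric selection, here is a minimal Python sketch. It is not the AutoSpearman algorithm or the R package described above: the function names, the 0.7 threshold, and the rule of dropping the later metric of each correlated pair are all assumptions made for illustration only.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def drop_correlated(metrics, threshold=0.7):
    """Greedily drop one metric from each pair with |rho| >= threshold.

    `metrics` maps a metric name to its list of values. The threshold and
    the tie-breaking rule (drop the later name of the pair) are
    hypothetical choices, not taken from the artefact.
    """
    kept = list(metrics)
    for a, b in combinations(list(metrics), 2):
        if a in kept and b in kept:
            if abs(spearman(metrics[a], metrics[b])) >= threshold:
                kept.remove(b)
    return kept


# Hypothetical toy data: two size metrics that rise together, one unrelated.
example = {
    "loc": [10, 20, 30, 40, 50],
    "statements": [12, 19, 33, 41, 58],
    "authors": [2, 5, 1, 4, 3],
}
selected = drop_correlated(example)
```

In this toy data, `loc` and `statements` are perfectly rank-correlated, so only one of them survives the filter, while `authors` is kept.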


Sparse Modeling-Based Sequential Ensemble Learning For Effective Outlier Detection In High-Dimensional Numeric Data, Guansong Pang, Longbing Cao, Ling Chen, Defu Lian, Huan Liu Feb 2018

Research Collection School Of Computing and Information Systems

The large proportion of irrelevant or noisy features in real-life high-dimensional data presents a significant challenge to subspace/feature selection-based high-dimensional outlier detection (a.k.a. outlier scoring) methods. These methods often perform two dependent tasks, relevant feature subset search and outlier scoring, independently, consequently retaining features/subspaces irrelevant to the scoring method and degrading the detection performance. This paper introduces a novel sequential ensemble-based framework SEMSE and its instance CINFO to address this issue. SEMSE learns the sequential ensembles to mutually refine feature selection and outlier scoring by iterative sparse modeling with outlier scores as the pseudo target feature. CINFO instantiates SEMSE …