Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 4 of 4

Full-Text Articles in Databases and Information Systems

Salience-Aware Adaptive Resonance Theory For Large-Scale Sparse Data Clustering, Lei Meng, Ah-Hwee Tan, Chunyan Miao Dec 2019

Research Collection School Of Computing and Information Systems

Sparse data is known to pose challenges to cluster analysis, as the similarity between data points tends to be ill-posed in the high-dimensional Hilbert space. Solutions in the literature typically extend either k-means or spectral clustering with additional steps for representation learning and/or feature weighting. However, these additions usually introduce new parameters and increase computational cost, thus inevitably lowering the robustness of these algorithms when handling massive ill-represented data. To alleviate these issues, this paper presents a class of self-organizing neural networks, called the salience-aware adaptive resonance theory (SA-ART) model. SA-ART extends Fuzzy ART with measures for cluster-wise salient feature modeling. …


TopicSummary: A Tool For Analyzing Class Discussion Forums Using Topic Based Summarizations, Swapna Gottipati, Venky Shankararaman, Renjini Ramesh Oct 2019

Research Collection School Of Computing and Information Systems

This Innovative Practice full paper describes the application of text mining techniques for extracting insights from a course-based online discussion forum through the generation of topic-based summaries. Discussions, whether in the classroom or online, provide an opportunity for collaborative learning through an exchange of ideas that enhances learning through active participation. Online discussions offer a number of benefits, namely providing additional time to reflect and synthesize information before writing, providing a natural platform for students to voice their ideas without any one student dominating the conversation, and providing a record of each student’s thoughts. An online discussion forum provides a …


REDPC: A Residual Error-Based Density Peak Clustering Algorithm, Milan Parmar, Di Wang, Xiaofeng Zhang, Ah-Hwee Tan, Chunyan Miao, You Zhou Jul 2019

Research Collection School Of Computing and Information Systems

The density peak clustering (DPC) algorithm was designed to identify arbitrary-shaped clusters by finding density peaks in the underlying dataset. Owing to its relatively low computational complexity and small number of control parameters, DPC soon became widely adopted. However, DPC takes the entire data space into consideration when computing local density, which is then used to generate a decision graph for identifying cluster centroids. As a result, DPC may have difficulty differentiating overlapping clusters and dealing with low-density data points. In this paper, we propose a residual error-based density peak clustering …
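
As a rough illustration of the baseline DPC procedure this abstract refers to (not the residual error-based variant the paper proposes, whose details are cut off above), the sketch below computes a cutoff-based local density, the distance from each point to its nearest denser point, and picks centroids from the resulting decision-graph scores. The function name `density_peaks` and the parameters `d_c` (cutoff distance) and `n_centers` are assumptions made for this example only.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Baseline DPC sketch; names and parameters are illustrative assumptions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in that order.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j

    # Decision graph: centroids combine a large rho with a large delta.
    centers = sorted(range(n), key=lambda i: -(rho[i] * delta[i]))[:n_centers]

    # Every remaining point inherits the label of its nearest denser point.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, centers, labels
```

Because `rho` counts neighbours across the whole dataset with a single cutoff, a sparse cluster sitting next to a dense one yields low scores everywhere, which is one way to picture the difficulty with overlapping clusters and low-density points that the abstract mentions.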


CURE: Flexible Categorical Data Representation By Hierarchical Coupling Learning, Songlei Jian, Guansong Pang, Longbing Cao, Kai Lu, Hang Gao May 2019

Research Collection School Of Computing and Information Systems

The representation of categorical data with hierarchical value coupling relationships (i.e., various value-to-value cluster interactions) is critical yet challenging for capturing complex data characteristics in learning tasks. This paper proposes a novel and flexible coupled unsupervised categorical data representation (CURE) framework, which not only captures the hierarchical couplings but is also flexible enough to be instantiated for contrastive learning tasks. CURE first learns value clusters of different granularities based on multiple value coupling functions and then learns the value representation from the couplings between the obtained value clusters. With two complementary value coupling functions, CURE is instantiated into …