Sparse subspace representation for spectral document clustering
Date
2012
Remarks
Copyright © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract
We present a novel method for document clustering using sparse representation of documents in conjunction with spectral clustering. An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities. This document affinity is encoded through a graph on which spectral clustering is performed. The decomposition into multiple subspaces allows documents to be part of a sub-group that shares a smaller set of similar vocabulary, thus allowing for cleaner clusters. Extensive experimental evaluations on two real-world datasets from the Reuters-21578 and 20Newsgroup corpora show that our proposed method consistently outperforms state-of-the-art algorithms. Significantly, the performance improvement over other methods is prominent for these datasets.
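The pipeline summarized in the abstract (sparse coding of each document over the others, an affinity graph built from the codes, then spectral clustering) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the Lasso-style objective `0.5*||x_i - X c_i||^2 + lam*||c_i||_1` with `c_ii = 0`, the ISTA solver, the affinity `|C| + |C|^T`, the small k-means routine, and all function and parameter names (`sparse_codes`, `spectral_labels`, `cluster_documents`, `lam`, `n_iter`) are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def sparse_codes(X, lam=0.1, n_iter=300):
    """Code every column of X (one document per column) over the other columns.

    Assumed objective: min_C 0.5*||X - XC||_F^2 + lam*||C||_1, diag(C) = 0,
    solved with plain ISTA (proximal gradient descent).
    """
    n = X.shape[1]
    G = X.T @ X                              # Gram matrix of the documents
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        Z = C - step * (G @ C - G)           # gradient step on the quadratic term
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft-threshold
        np.fill_diagonal(C, 0.0)             # a document may not represent itself
    return C

def spectral_labels(W, k, n_iter=50):
    """Normalized spectral clustering on a symmetric affinity matrix W."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    U = vecs[:, :k]                          # k smallest eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialisation, then Lloyd's k-means.
    centers = [U[0]]
    for _ in range(1, k):
        d2 = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def cluster_documents(X, k, lam=0.1):
    """X is a term-by-document matrix; returns one cluster label per document."""
    X = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    C = sparse_codes(X, lam=lam)
    W = np.abs(C) + np.abs(C).T              # symmetric affinity graph
    return spectral_labels(W, k)

# Toy term-by-document counts: documents 0-2 and 3-5 use disjoint vocabularies.
X_toy = np.array([
    [3, 2, 1, 0, 0, 0],
    [1, 2, 3, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 3],
    [0, 0, 0, 1, 3, 1],
    [0, 0, 0, 2, 2, 1],
], dtype=float)
print(cluster_documents(X_toy, k=2))
```

On this toy input the two vocabulary groups share no terms, so the sparse codes are block-diagonal and the first three documents should land in a different cluster from the last three. Real corpora would need a proper term weighting (for example tf-idf) and a scalable ℓ1 solver, neither of which is shown here.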
Related items
Showing items related by title, author, creator and subject.
- Zhang, X.; Pham, DucSon; Phung, D.; Liu, Wan-Quan; Saha, B.; Venkatesh, S. (2015) Many vision problems deal with high-dimensional data, such as motion segmentation and face clustering. However, these high-dimensional data usually lie in a low-dimensional structure. Sparse representation is a powerful ...
- Li, Q.; Liu, Wan-Quan; Li, Ling (2018) Subspace clustering refers to the problem of finding low-dimensional subspaces (clusters) for high-dimensional data. Current state-of-the-art subspace clustering methods are usually based on spectral clustering, where an ...
- Budhaditya, S.; Pham, DucSon; Phung, D.; Venkatesh, S. (2013) We propose in this paper a novel sparse subspace clustering method that regularizes sparse subspace representation by exploiting the structural sharing between tasks and data points via group sparse coding. We derive ...