Graph-based clustering with DRepStream
Date
2017
Abstract
© 2017 ACM. Finding and setting input parameters for clustering algorithms is challenging because clustering is unsupervised. The accuracy of a clustering algorithm can depend greatly on whether its parameters suit the dataset; however, without ground truth labels or external validation it can be impossible to know when the parameters are set well. In this paper we propose the DRepStream algorithm, which extends the RepStream algorithm. DRepStream uses a graph-based approach and, unlike its predecessor, does not require the primary K parameter used in K-nearest neighbour graphs. Instead, our algorithm automatically computes the number of outgoing edges for each vertex in the graph using a computed metric known as the anomalous edge score. We evaluate the performance of our algorithm against previous stream clustering algorithms on real-world benchmark datasets.
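The abstract does not define the anomalous edge score, so the following Python sketch is only an illustration of the general idea of letting each vertex choose its own number of outgoing edges rather than using a fixed K. The function name `adaptive_out_edges`, the `ratio_threshold` parameter, and the stopping rule (stop when the next edge is much longer than the mean of the edges already kept) are all assumptions made for this example and are not taken from the paper.

```python
import math


def adaptive_out_edges(points, ratio_threshold=2.0):
    """Build a directed neighbour graph with a per-vertex edge count.

    Hypothetical stand-in for an anomalous edge score: a candidate edge is
    treated as anomalous when its length exceeds `ratio_threshold` times the
    mean length of the edges already accepted for that vertex.
    """
    graph = {}
    for i, p in enumerate(points):
        # Candidate neighbours of p, nearest first (ties broken by index).
        candidates = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        accepted = []
        for dist, j in candidates:
            if accepted:
                mean_len = sum(d for d, _ in accepted) / len(accepted)
                # Stop at the first edge that looks anomalously long.
                if mean_len > 0 and dist / mean_len > ratio_threshold:
                    break
            accepted.append((dist, j))
        graph[i] = [j for _, j in accepted]
    return graph


# Two well-separated groups: vertices end up with different out-degrees.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
graph = adaptive_out_edges(points)
for vertex, neighbours in graph.items():
    print(vertex, neighbours)
```

In this toy input, vertices in the three-point group keep two outgoing edges and vertices in the two-point group keep one, which is the kind of behaviour a fixed K cannot provide. Note that this stand-in still has a tunable threshold; how DRepStream computes its score is described in the paper itself.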
Related items
Showing items related by title, author, creator and subject.
- Li, Yanrong (2009) Clustering and association rules mining are two core data mining tasks that have been actively studied by data mining community for nearly two decades. Though many clustering and association rules mining algorithms have ...
- Callister, Ross (2020) Data streams present a number of challenges, caused by change in stream concepts over time. In this thesis we present a novel method for detection of concept drift within data streams by analysing geometric features of ...
- Lam, Bee K. (1999) A network is a system that involves movement or flow of some commodities such as goods and services. In fact any structure that is in the form of a system of components some of which interact can be considered as a network. ...