The adaptable buffer algorithm for high quantile estimation in non-stationary data streams
MetadataShow full item record
The need to estimate a particular quantile of a distribution is an important problem which frequently arises in many computer vision and signal processing applications. For example, our work was motivated by the requirements of many semi-automatic surveillance analytics systems which detect abnormalities in close-circuit television (CCTV) footage using statistical models of low-level motion features. In this paper we specifically address the problem of estimating the running quantile of a data stream with non-stationary stochasticity when the memory for storing observations is limited. We make several major contributions: (i) we derive an important theoretical result which shows that the change in the quantile of a stream is constrained regardless of the stochastic properties of data, (ii) we describe a set of high-level design goals for an effective estimation algorithm that emerge as a consequence of our theoretical findings, (iii) we introduce a novel algorithm which implements the aforementioned design goals by retaining a sample of data values in a manner adaptive to changes in the distribution of data and progressively narrowing down its focus in the periods of quasi-stationary stochasticity, and (iv) we present a comprehensive evaluation of the proposed algorithm and compare it with the existing methods in the literature on both synthetic data sets and three large 'real-world' streams acquired in the course of operation of an existing commercial surveillance system. Our findings convincingly demonstrate that the proposed method is highly successful and vastly outperforms the existing alternatives, especially when the target quantile is high valued and the available buffer capacity severely limited.
Showing items related by title, author, creator and subject.
Arandjelovic, O.; Pham, DucSon; Venkatesh, S. (2015)The need to estimate a particular quantile of a distribution is an important problem that frequently arises in many computer vision and signal processing applications. For example, our work was motivated by the requirements ...
Techniques for improving clustering and association rules mining from very large transactional databasesLi, Yanrong (2009)Clustering and association rules mining are two core data mining tasks that have been actively studied by data mining community for nearly two decades. Though many clustering and association rules mining algorithms have ...
Arandjelovic, O.; Pham, DucSon; Venkatesh, S. (2014)We address the problem of estimating the running quantile of a data stream when the memory for storing observations is limited. We (i) highlight the limitations of approaches previously described in the literature which ...