Two Maximum Entropy-Based Algorithms for Running Quantile Estimation in Nonstationary Data Streams
Date
2015
Remarks
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract
The need to estimate a particular quantile of a distribution is an important problem that frequently arises in many computer vision and signal processing applications. For example, our work was motivated by the requirements of many semiautomatic surveillance analytics systems that detect abnormalities in closed-circuit television footage using statistical models of low-level motion features. In this paper, we specifically address the problem of estimating the running quantile of a data stream when the memory for storing observations is limited. We make several major contributions: 1) we highlight the limitations of approaches previously described in the literature that make them unsuitable for nonstationary streams; 2) we describe a novel principle for the utilization of the available storage space; 3) we introduce two novel algorithms that exploit the proposed principle in different ways; and 4) we present a comprehensive evaluation and analysis of the proposed algorithms and the existing methods in the literature on both synthetic data sets and three large real-world streams acquired in the course of operation of an existing commercial surveillance system. Our findings convincingly demonstrate that both of the proposed methods are highly successful and vastly outperform the existing alternatives. We show that the better of the two algorithms (data-aligned histogram) exhibits far superior performance in comparison with the previously described methods, achieving more than 10 times lower estimate errors on real-world data, even when its available working memory is an order of magnitude smaller.
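To make the problem setting concrete, the following is a minimal sketch of a running quantile estimator that stores only a fixed number of bin counts rather than the observations themselves. It is a generic equal-width histogram baseline and is not either of the two algorithms proposed in the paper. The class name `FixedHistogramQuantile`, its parameters, and the assumption that values lie in a known range `[lo, hi)` are all hypothetical choices made for illustration.

```python
class FixedHistogramQuantile:
    """Running quantile estimate from a fixed number of equal-width bins.

    Assumption (not from the paper): values fall in [lo, hi); anything
    outside that range is clamped into the first or last bin.
    """

    def __init__(self, lo, hi, n_bins, q):
        if not (hi > lo and n_bins > 0 and 0.0 < q < 1.0):
            raise ValueError("need hi > lo, n_bins > 0 and 0 < q < 1")
        self.lo = lo
        self.width = (hi - lo) / n_bins
        self.counts = [0] * n_bins  # the only per-stream storage
        self.total = 0
        self.q = q

    def update(self, x):
        """Record one observation in O(1) time and no extra memory."""
        idx = int((x - self.lo) / self.width)
        idx = min(max(idx, 0), len(self.counts) - 1)
        self.counts[idx] += 1
        self.total += 1

    def estimate(self):
        """Return the current quantile estimate, interpolating in-bin."""
        if self.total == 0:
            raise ValueError("no observations yet")
        target = self.q * self.total
        cumulative = 0
        for i, c in enumerate(self.counts):
            if c > 0 and cumulative + c >= target:
                # Assume observations are spread uniformly inside the bin.
                frac = (target - cumulative) / c
                return self.lo + (i + frac) * self.width
            cumulative += c
        return self.lo + len(self.counts) * self.width


est = FixedHistogramQuantile(lo=0.0, hi=1000.0, n_bins=10, q=0.5)
for v in range(1000):
    est.update(float(v))
print(est.estimate())
```

The estimate is only as fine as the bin width, and the bin layout is fixed before any data is seen. If the range or shape of the stream drifts over time, much of the fixed storage may end up covering regions far from the target quantile, which is one plausible illustration of why nonstationary streams are difficult under a tight memory budget.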
Related items
Showing items related by title, author, creator and subject.
- Li, Yanrong (2009) Clustering and association rules mining are two core data mining tasks that have been actively studied by data mining community for nearly two decades. Though many clustering and association rules mining algorithms have ...
- Arandjelovic, O.; Pham, DucSon; Venkatesh, S. (2015) The need to estimate a particular quantile of a distribution is an important problem which frequently arises in many computer vision and signal processing applications. For example, our work was motivated by the requirements ...
- Arandjelovic, O.; Pham, DucSon; Venkatesh, S. (2015) The need to estimate a particular quantile of a distribution is an important problem that frequently arises in many computer vision and signal processing applications. For example, our work was motivated by the requirements ...