A robust framework for 2D human pose tracking with spatial and temporal constraints
We address the task of 2D articulated human pose tracking in monocular image sequences, an extremely challenging task due to background clutter, variation in body appearance, occlusion, and imaging conditions. Most current approaches model only simple appearance and adjacent body part dependencies, in particular through the Gaussian tree-structured priors assumed over body part connections. Such a prior makes the part connections independent of image evidence, which in turn severely limits accuracy. Building on the successful pictorial structures model, we propose a novel framework that incorporates an image-conditioned model capturing higher-order dependencies among multiple body parts. To establish the conditioning variables, we employ poselet features. In addition, we introduce a full-body detector as the first step of our framework to reduce the search space for pose tracking. We evaluate our framework on two challenging image sequences and conduct a series of experiments comparing its performance with that of two other approaches. The results show that the proposed framework outperforms state-of-the-art 2D pose tracking systems.
Showing items related by title, author, creator and subject.
Tian, J.; Li, L.; Liu, Wan-Quan (2016) 2D articulated human pose tracking in monocular image sequences remains an extremely challenging task due to background cluttering, variation in body appearance, occlusion and imaging conditions. Most of the current ...
Tian, J.; Li, Ling; Liu, Wan-Quan (2014) In this paper we address the problem of tracking human poses in multiple perspective scales in 2D monocular images/videos. In most state-of-the-art 2D tracking approaches, the issue of scale variation is rarely discussed. ...
Zhang, Li (2009) This research aims to address one of the most challenging problems in the field of computer vision and computer graphics, that is, the reconstruction of smooth 3D human motions from monocular video containing unrestricted ...