Human animation from analysis and reconstruction of human motion in video sequences

Zhang, Li

Access Status

Fulltext not available

Authors

Zhang, Li

Date

2009

Supervisor

Assoc. Prof. Ling Li

Type

Thesis

Award

PhD

School

Department of Computing

URI

http://hdl.handle.net/20.500.11937/684

Collection

Curtin Theses

Abstract

This research addresses one of the most challenging problems in computer vision and computer graphics: the reconstruction of smooth 3D human motion from monocular video containing unrestricted human movement. The objective is to propose novel methods, distinct from traditional kinematics/dynamics formulations and image-based reconstruction methods, that provide an alternative, highly automated way to produce human animation from the most widely available source of recorded human movement and activity. Such methods should be relatively low-cost while avoiding many of the limitations of current motion tracking equipment.

Monocular images and video sequences are chosen as the input source because they are widely available, for example from film making and even simple home videos. Most such images and video sequences are uncalibrated, i.e., no information is available on the camera with which they were taken. In addition, although 2D joint locations and body silhouettes can be extracted from the images, the accuracy achievable with current image processing techniques may not be satisfactory. A number of techniques and algorithms are proposed in this research for this kind of monocular input.

A 3D skeletal human model based on human anatomy is constructed, with angular constraints encoded in the joints according to biomechanical and physiological knowledge. The model is simple yet sufficient to simulate realistic human motion, and the computational expense of working with a skeletal model is much lower than with any other human model. The relative lengths of the body parts of the model are adjusted to be consistent with the human subject in the source images before reconstruction.
This is achieved using previously acquired geometric information on the human subject of interest.

A Motion Trend Analysis (MTA) method is proposed to automatically reconstruct the 3D postures of the human subject directly from the extracted 2D joint locations at each image frame, with some tolerance to noise in those locations. The method uses information on previously reconstructed postures to help position the joints of the human model at their proper 3D locations in the current frame. To ensure a reliable starting point, manual adjustment may be required to improve the accuracy of the first three recovered postures. The 3D positional coherence of every joint between adjacent recovered postures can then be obtained and maintained at a satisfactory level. An Objective Function (OF) is defined to represent the 2D residuals between the extracted feature points and the corresponding features obtained by projecting the reconstructed human model onto the projection plane. The OF also accounts for the 3D positional discrepancies at every joint between adjacent postures. By translating the pelvis joint and rotating each joint of the human model, a posture that resembles the one in the monocular image is created by searching for the minimum value of the OF. To balance the 2D residuals against the 3D discrepancies, a weighting parameter (WP) determination routine was developed that dynamically adjusts the WP values for the 2D and 3D factors in the OF.

The acquisition of smooth and reasonable 3D human motion is the main focus of this research. Such motion is in high demand for applications involving virtual human movement. A Motion Level Control (MLC) algorithm is proposed for integration with the MTA system to further ensure the rotational coherence of the reconstruction results in 3D and to improve the efficiency of the search process.
The application of MLC is divided into two modules: relocation of the pelvis joint and recovery of the rotations of the skeleton segments. With MLC, the computational cost of the search for the pelvis relocation and the skeleton adjustment during posture recovery is significantly reduced. At the same time, the rotational consistency of each body segment in the reconstructed motion reaches a satisfactory level. Experimental results from the proposed algorithm are highly satisfactory.

This research also attempts to acquire smooth and reasonable 3D human motion from monocular images with body occlusion, or from extracted silhouettes. Existing techniques for recovering human postures usually require as input a motion sequence in which every body segment is visible at all times. This requirement cannot always be satisfied. The motion to be reconstructed may contain occlusions, where part of the body is obscured by another object in the scene or by another part of the body itself. The input video may also be too unclear to provide acceptable 2D joint extractions. This research extends the MTA and MLC methods and proposes novel methods to acquire smooth and reasonable human postures under such circumstances. Experiments have produced very promising results.

Many challenges remain, especially in the preprocessing of monocular images, such as the accurate extraction of joint locations, and in fully automatic posture recovery. In addition, the Objective Function and the biomechanical constraints should be studied further to improve the general performance of the motion reconstruction system. Human motion reconstruction algorithms that do not require accurate 2D inputs, in terms of joint features or silhouettes, should also be explored. The motion reconstruction system developed in this research can currently handle 2D input with minor extraction errors.
A more widely applicable system that can tolerate larger input errors would be highly desirable and deserves further research attention. Technical issues to be explored and addressed in the future are identified and discussed in this thesis.
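
The Objective Function described in the abstract combines 2D residuals with 3D positional discrepancies between adjacent postures, balanced by weighting parameters. The following is a minimal illustrative sketch of such a function and not the thesis's actual formulation, which the abstract does not give. The orthographic projection, the squared-distance error terms, and the names `project_orthographic`, `objective`, `w_2d` and `w_3d` are all assumptions made for this example.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def project_orthographic(joint: Point3D) -> Point2D:
    """Drop the depth coordinate. This is an assumed stand-in for the
    unknown (uncalibrated) camera model, not the thesis's projection."""
    x, y, _ = joint
    return (x, y)


def objective(
    joints_3d: List[Point3D],
    prev_joints_3d: List[Point3D],
    observed_2d: List[Point2D],
    w_2d: float = 1.0,
    w_3d: float = 1.0,
) -> float:
    """Weighted sum of 2D residuals and 3D frame-to-frame discrepancies.

    joints_3d      -- candidate joint positions for the current frame
    prev_joints_3d -- joint positions recovered for the previous frame
    observed_2d    -- 2D joint locations extracted from the current image
    w_2d, w_3d     -- hypothetical weighting parameters for the two terms
    """
    residual_2d = 0.0
    discrepancy_3d = 0.0
    for joint, prev, obs in zip(joints_3d, prev_joints_3d, observed_2d):
        px, py = project_orthographic(joint)
        # Squared distance between projected joint and extracted 2D feature.
        residual_2d += (px - obs[0]) ** 2 + (py - obs[1]) ** 2
        # Squared distance between this joint and its previous-frame position.
        discrepancy_3d += sum((a - b) ** 2 for a, b in zip(joint, prev))
    return w_2d * residual_2d + w_3d * discrepancy_3d
```

A posture search would evaluate `objective` for candidate poses and keep the one with the smallest value. Raising `w_3d` relative to `w_2d` favours smoothness between frames over agreement with the extracted 2D joints.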