A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities

Wong, Y.W.; Ch’ng, S.I.; Seng, K.P.; Ang, L.; Chin, S.W.; Chew, W.J.; Lim, Hann

doi:10.1016/j.patrec.2011.06.011

dc.contributor.author	Wong, Y.W.
dc.contributor.author	Ch’ng, S.I.
dc.contributor.author	Seng, K.P.
dc.contributor.author	Ang, L.
dc.contributor.author	Chin, S.W.
dc.contributor.author	Chew, W.J.
dc.contributor.author	Lim, Hann
dc.date.accessioned	2017-01-30T11:38:43Z
dc.date.available	2017-01-30T11:38:43Z
dc.date.created	2014-11-19T01:13:25Z
dc.date.issued	2011
dc.identifier.citation	Wong, Y.W. and Ch’ng, S.I. and Seng, K.P. and Ang, L. and Chin, S.W. and Chew, W.J. and Lim, H. 2011. A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities. Pattern Recognition Letters. 32 (13): pp. 1503-1510.
dc.identifier.uri	http://hdl.handle.net/20.500.11937/13681
dc.identifier.doi	10.1016/j.patrec.2011.06.011
dc.description.abstract	Audio-visual recognition systems are becoming popular because they overcome certain problems of traditional audio-only recognition systems. However, difficulties due to visual variations in video sequences can significantly degrade the recognition performance of such systems. This problem can be further complicated when more than one visual variation happens at the same time. Although several databases have been created in this area, none of them includes realistic visual variations in video sequences. With the aim of facilitating the development of robust audio-visual recognition systems, the new audio-visual UNMC-VIER database has been created. This database contains various visual variations, including illumination, facial expression, head pose, and image resolution variations. Its most distinctive aspect is that it includes more than one visual variation in the same video recording. For the audio part, the utterances are spoken at slow and normal speech paces to improve the learning process of audio-visual speech recognition systems. Hence, this database is useful for the development of robust audio-visual person recognition, speech recognition, and face recognition systems.
dc.publisher	Elsevier BV, North-Holland
dc.subject	Audio-visual database
dc.subject	Speech recognition
dc.subject	Face recognition
dc.subject	Visual variation
dc.title	A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities
dc.type	Journal Article
dcterms.source.volume	32
dcterms.source.number	13
dcterms.source.startPage	1503
dcterms.source.endPage	1510
dcterms.source.issn	0167-8655
dcterms.source.title	Pattern Recognition Letters
curtin.accessStatus	Fulltext not available

Files in this item

Name: 203990_51081_A_new_multi-purpo ...
Size: 596.4Kb
Format: PDF

This item appears in the following Collection(s)

Curtin Research Publications
