    espace - Curtin’s institutional repository


    A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities

    Access Status
    Fulltext not available
    Authors
    Wong, Y.W.
    Ch’ng, S.I.
    Seng, K.P.
    Ang, L.
    Chin, S.W.
    Chew, W.J.
    Lim, Hann
    Date
    2011
    Type
    Journal Article
    
    Metadata
    Show full item record
    Citation
    Wong, Y.W. and Ch’ng, S.I. and Seng, K.P. and Ang, L. and Chin, S.W. and Chew, W.J. and Lim, H. 2011. A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities. Pattern Recognition Letters. 32 (13): pp. 1503-1510.
    Source Title
    Pattern Recognition Letters
    DOI
    10.1016/j.patrec.2011.06.011
    ISSN
    0167-8655
    URI
    http://hdl.handle.net/20.500.11937/13681
    Collection
    • Curtin Research Publications
    Abstract

    Audio-visual recognition systems are becoming popular because they overcome certain problems of traditional audio-only recognition systems. However, visual variations in video sequences can significantly degrade a system's recognition performance, and the problem is further complicated when more than one visual variation occurs at the same time. Although several databases have been created in this area, none of them includes realistic visual variations in video sequences. To facilitate the development of robust audio-visual recognition systems, the new audio-visual UNMC-VIER database was created. This database contains various visual variations, including illumination, facial expression, head pose, and image resolution variations. Its most distinctive aspect is that it includes more than one visual variation within the same video recording. For the audio part, the utterances are spoken at both slow and normal speech paces to improve the learning process of audio-visual speech recognition systems. Hence, this database is useful for developing robust audio-visual person recognition, speech recognition, and face recognition systems.

    Related items

    Showing items related by title, author, creator and subject.

    • Message vs. messenger effects on cross-modal matching for spoken phrases
      Best, C.; Kroos, Christian; Mulak, K.; Halovic, S.; Fort, M.; Kitamura, C. (2015)
      A core issue in speech perception and word recognition research is the nature of information perceivers use to identify spoken utterances across indexical variations in their phonetic details, such as talker and accent ...
    • Audio networks for speech enhancement and indexing
      Kühnapfel, Thorsten (2009)
      For humans, hearing is the second most important sense, after sight. Therefore, acoustic information greatly contributes to observing and analysing an area of interest. For this reason combining audio and video cues for ...
    • Dynamic Hybrid Learning for Improving Facial Expression Classifier Reliability
      Vice, Jordan; Khan, Masood ; Tan, Tele; Yanushkevich, Svetlana (2022)
      Independent, discrete models like Paul Ekman’s six basic emotions model are widely used in affective state assessment (ASA) and facial expression classification. However, the continuous and dynamic nature of human expressions ...
