On the integration of time-frequency masking speech separation and recognition in underdetermined environments

Jafari, I.; Haque, S.; Togneri, R.; Nordholm, Sven

doi:10.1109/ACSSC.2012.6489303

Access Status

Fulltext not available

Authors

Jafari, I.

Haque, S.

Togneri, R.

Nordholm, Sven

Date

2012

Type

Conference Paper

Citation

Jafari, I. and Haque, S. and Togneri, R. and Nordholm, S. 2012. On the integration of time-frequency masking speech separation and recognition in underdetermined environments, pp. 1613-1617.

Source Title

Conference Record - Asilomar Conference on Signals, Systems and Computers

ISBN

9781467350518

School

Department of Electrical and Computer Engineering

URI

http://hdl.handle.net/20.500.11937/5370

Collection

Curtin Research Publications

Abstract

The successful application of automatic speech recognition systems in the real world is conditional on their ability to handle realistic environments with unfavorable conditions such as reverberation and multiple sources of interference. Previous research has identified time-frequency masking approaches to blind source separation as viable for multisource reverberant source separation. It is proposed that using such separation techniques as a front-end to speech recognition will improve recognition accuracy. Experimental evaluations confirmed the hypothesis, with an improvement in recognition accuracy of over 20% at a reverberation time of RT60 = 300 ms, indicating the potential for future research in this field. © 2012 IEEE.