On the integration of time-frequency masking speech separation and recognition in underdetermined environments
Date
2012
Abstract
The successful application of automatic speech recognition systems in the real world depends on their ability to handle realistic environments with unfavorable conditions such as reverberation and multiple sources of interference. Previous research has identified time-frequency masking approaches to blind source separation as viable for separating multiple sources in reverberant conditions. It is proposed that using such separation techniques as a front-end to speech recognition will improve recognition accuracy. Experimental evaluations confirmed the hypothesis, with an improvement in recognition accuracy of over 20% at a reverberation time of RT60 = 300 ms, which indicates the potential for future research in this field. © 2012 IEEE.
Related items
Showing items related by title, author, creator and subject.
- Kuhne, M.; Togneri, R.; Nordholm, Sven (2011) Conventional hidden Markov model (HMM) decoders often experience severe performance degradations in practice due to their inability to cope with uncertain data in time-varying environments. In order to address this issue, ...
- Li, B.; Mian, A.; Liu, Wan-Quan; Krishna, Aneesh (2015) In this paper, we present a new algorithm that utilizes low-quality red, green, blue and depth (RGB-D) data from the Kinect sensor for face recognition under challenging conditions. This algorithm extracts multiple features ...
- Li, Billy; Mian, A.; Liu, Wan-Quan; Krishna, Aneesh (2013) We present an algorithm that uses a low resolution 3D sensor for robust face recognition under challenging conditions. A preprocessing algorithm is proposed which exploits the facial symmetry at the 3D point cloud level ...