Comparison of Various Neural Network Language Models in Speech Recognition
Date
2016
Abstract
© 2016 IEEE. In recent years, research on language modeling for speech recognition has increasingly focused on the application of neural networks. However, the performance of neural network language models depends strongly on their architecture. Three competing concepts have been developed: first, feed-forward neural networks, which represent an n-gram approach; second, recurrent neural networks, which may learn context dependencies spanning more than a fixed number of predecessor words; and third, long short-term memory (LSTM) neural networks, which can fully exploit the correlations in a telephone conversation corpus. In this paper, we compare count models to feed-forward, recurrent, and LSTM neural network language models in conversational telephone speech recognition tasks. Furthermore, we propose a language model estimation method that incorporates information from preceding sentences. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation between the two quantities, which we find to hold regardless of the underlying type of language model. The experimental results show that the LSTM neural network language model performs best in n-best list rescoring. Compared to first-pass decoding, rescoring the ten best candidates yields a relative reduction of 4.3% in average word error rate in conversational telephone speech recognition tasks.
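The quantities named in the abstract (perplexity, word error rate, n-best rescoring, and relative word error rate reduction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `lm_weight` interpolation parameter, and the toy language model in the usage note are assumptions made purely for illustration, and the abstract does not state how first-pass and language model scores were combined.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-word natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def rescore_nbest(nbest, lm_log_prob, lm_weight=0.5):
    """Pick the best hypothesis from (text, first_pass_log_score) pairs.

    Assumption: the scores are combined by a simple weighted sum; the
    weight value here is hypothetical.
    """
    best = max(nbest, key=lambda item: item[1] + lm_weight * lm_log_prob(item[0]))
    return best[0]


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction with respect to the first-pass baseline."""
    return (baseline_wer - new_wer) / baseline_wer
```

As a usage example with made-up numbers, `rescore_nbest([("the cat sat", -10.0), ("the cat sad", -9.5)], toy_lm)` selects "the cat sat" when a hypothetical `toy_lm` assigns that sentence a much higher log probability, even though its first-pass score is slightly lower. Likewise, a drop from a hypothetical 20.0% to 19.14% word error rate corresponds to a relative reduction of 4.3%.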
Related items
Showing items related by title, author, creator and subject.
- Chan, Kit Yan; Nordholm, Sven; Yiu, Ka Fai; Togneri, R. (2013) Industrial automation with speech control functions is generally installed with a speech recognition sensor which is used as an interface for users to articulate speech commands. However, recognition errors are likely to ...
- Huang, L.; Li, Jun; Hao, Hong; Li, X. (2018) Recent years have witnessed a clear trend to develop deeper and longer tunnels to meet the growing needs of mining. Micro-seismic events location is vital for predicting and avoiding the traditional mine disasters induced ...
- Mostafa, Fahed. (2011) Market risk refers to the potential loss that can be incurred as a result of movements in market factors. Capturing and measuring these factors are crucial in understanding and evaluating the risk exposure associated with ...