Paper

Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models

We describe a simple but effective way of using multi-frame targets to improve the accuracy of Artificial Neural Network-Hidden Markov Model (ANN-HMM) hybrid systems. In this approach, a Deep Neural Network (DNN) is trained to predict the forced-alignment state of multiple frames, using a separate softmax unit for each of the frames. This is in contrast to the usual method of training a DNN to predict only the state of the central frame. By itself, this is not sufficient to improve the accuracy of the system significantly. However, if we average the predictions for each frame from the different contexts it is associated with, we achieve state-of-the-art results on TIMIT using a fully connected Deep Neural Network without convolutional architectures or dropout training. On a 14-hour subset of Wall Street Journal (WSJ), using a context-dependent DNN-HMM system, it leads to a relative improvement of 6.4% on the dev set (testdev93) and 9.3% on the test set (test-eval92).
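The averaging step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `combine_multiframe_predictions`, the array layout, and the choice to average in the log domain (a geometric mean, suggested by the word "product" in the title) followed by renormalization are all assumptions made for illustration. The input is assumed to hold, for every window centre `t`, one log-softmax output per frame offset in the window.

```python
import numpy as np


def combine_multiframe_predictions(log_probs):
    """Combine per-offset state predictions into one distribution per frame.

    log_probs: array of shape (T, K, S). log_probs[t, j] is the log-softmax
    output that the window centred on frame t produces for frame
    t + (j - K // 2). This layout is an assumption of the sketch.

    Returns an array of shape (T, S) with normalized log-probabilities.
    """
    T, K, S = log_probs.shape
    centre = K // 2
    total = np.zeros((T, S))
    count = np.zeros((T, 1))

    for j in range(K):
        offset = j - centre
        # Window centres t whose target frame t + offset lies inside [0, T).
        t_start = max(0, -offset)
        t_end = min(T, T - offset)
        if t_start >= t_end:
            continue
        total[t_start + offset:t_end + offset] += log_probs[t_start:t_end, j]
        count[t_start + offset:t_end + offset] += 1

    # Mean of log-probabilities (geometric mean of the probabilities).
    # Frames near the edges are covered by fewer contexts.
    averaged = total / count
    # Renormalize so each frame's distribution sums to one.
    averaged -= np.logaddexp.reduce(averaged, axis=1, keepdims=True)
    return averaged
```

With `K = 1` this reduces to the usual central-frame prediction. The combined log-probabilities would then typically be passed to the HMM decoder in place of the single-frame outputs.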

Interspeech 2014 · Published 2014-09-14

Authors: Navdeep Jaitly · Vincent Vanhoucke · Geoffrey Hinton
