Diffusion of Credit in Markovian Models
This paper studies the problem of diffusion in Markovian models (such as hidden Markov models) and how it makes learning long-term dependencies in sequences very difficult.

1 Introduction

This paper is part of our research on the problem of learning long-term dependencies in sequences. In our previous work (Bengio, Simard & Frasconi, 1994) we found theoretical reasons for the difficulty in training recurrent networks (or, more generally, parametric dynamical systems) to learn long-term dependencies. The main result stated that either long-term storage or gradient propagation would be harmed, depending on whether the norm of the Jacobian of the state-to-state function was greater or less than 1. In this paper we consider a special case in which the norm of the Jacobian of the state-to-state function is constrained to be exactly 1 because this matrix is a stochastic matrix. This paper thus deals with learning long-term dependencies in systems that have this property, i.e. M...
Authors: Yoshua Bengio · Paolo Frasconi
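The property stated in the introduction, that a stochastic state-to-state matrix has norm exactly 1, can be checked numerically. The following is a minimal sketch under stated assumptions: the 3-state row-stochastic matrix `A` is a made-up example, the helper names are hypothetical, and the norm used is the induced infinity norm (largest absolute row sum), since this excerpt does not say which norm the paper uses. The sketch also tracks how far apart the rows of the t-step transition matrix are. For this particular matrix the rows become nearly identical as t grows, which is one way to observe diffusion: the influence of the initial state fades even though the norm stays at 1.

```python
# Illustrative sketch only: the matrix A and all helper names are
# assumptions made for this example, not taken from the paper.

def mat_mul(x, y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inf_norm(x):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in x)

def row_spread(x):
    """Largest gap between two entries of the same column.

    A value of zero means all rows are identical, so the distribution
    after t steps no longer depends on the starting state.
    """
    n = len(x)
    return max(max(row[j] for row in x) - min(row[j] for row in x)
               for j in range(n))

# Hypothetical row-stochastic matrix: A[i][j] = P(next state j | state i).
A = [[0.90, 0.05, 0.05],
     [0.10, 0.80, 0.10],
     [0.05, 0.15, 0.80]]

def diffusion_profile(a, steps):
    """Return a list of (t, norm, spread) for the matrix powers a^t."""
    profile = []
    power = a
    for t in range(1, steps + 1):
        profile.append((t, inf_norm(power), row_spread(power)))
        power = mat_mul(power, a)
    return profile

if __name__ == "__main__":
    for t, norm, spread in diffusion_profile(A, 50):
        if t in (1, 5, 10, 25, 50):
            print(f"t={t:2d}  norm={norm:.6f}  row spread={spread:.2e}")
```

Running the script prints the norm and the row spread at a few values of t. Replacing `A` with a matrix that has entries equal to 0 or 1 (a deterministic transition structure) is a simple way to see a case where the rows need not merge.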