Gori, M., Frasconi, P., & Sperduti, A. (2000). Learning efficiently with neural networks: a theoretical comparison between structured and flat representations. In ECAI 2000: 14th European Conference on Artificial Intelligence (pp. 300–305). Amsterdam: IOS Press.
Learning efficiently with neural networks: a theoretical comparison between structured and flat representations
GORI, MARCO
2000-01-01
Abstract
We are interested in the relationship between learning efficiency and representation in the case of supervised neural networks for pattern classification trained by continuous error minimization techniques, such as gradient descent. In particular, we focus our attention on a recently introduced architecture called the recursive neural network (RNN), which is able to learn class membership of patterns represented as labeled directed ordered acyclic graphs (DOAGs). RNNs offer several benefits compared to feedforward networks and recurrent networks for sequences. However, how RNNs compare to these models in terms of learning efficiency still needs investigation. In this paper we give a theoretical answer by presenting a set of results concerning the shape of the error surface and critically discussing the implications of these results for the relative difficulty of learning with different data representations. The message of this paper is that, whenever structured representations are available, they should be preferred to "flat" (array-based) representations, because they are likely to simplify learning in terms of time complexity.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
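To make the abstract's notion of a recursive neural network over labeled DOAGs concrete, here is a minimal sketch of the forward pass: each node's state is computed from its label and the states of its ordered children (one weight matrix per child position), and a class score is read out at the root. All names, dimensions, and weight choices below are illustrative assumptions, not the paper's actual model or results.

```python
import numpy as np

# Illustrative sketch: forward pass of a recursive neural network over a
# labeled directed ordered acyclic graph (DOAG). Dimensions and parameter
# names are assumptions for the sake of the example.

rng = np.random.default_rng(0)
label_dim, state_dim, max_children = 3, 4, 2

W = rng.normal(size=(state_dim, label_dim))                # label -> state
A = rng.normal(size=(max_children, state_dim, state_dim))  # one matrix per child position
b = np.zeros(state_dim)
w_out = rng.normal(size=state_dim)                         # output weights for class score

def node_state(label, child_states):
    """State of a node: squashed combination of its label and the states
    of its ordered children (the child position selects the matrix A[k])."""
    s = W @ label + b
    for k, h in enumerate(child_states):
        s += A[k] @ h
    return np.tanh(s)

def classify(graph, labels, root):
    """Compute states bottom-up (children before parents, memoized so a
    shared node is processed once) and read the class score at the root."""
    memo = {}
    def h(v):
        if v not in memo:
            memo[v] = node_state(labels[v], [h(c) for c in graph[v]])
        return memo[v]
    return float(w_out @ h(root))  # e.g. threshold at 0 for binary membership

# A tiny DOAG: node 0 is the supersource; node 2 is shared by nodes 0 and 1.
graph = {0: [1, 2], 1: [2], 2: []}
labels = {v: rng.normal(size=label_dim) for v in graph}
score = classify(graph, labels, root=0)
```

In a "flat" representation, the same graph would first be serialized into a fixed-size array and fed to a feedforward network; the structured version instead lets the weights be applied compositionally along the graph topology, which is the setting whose error surface the paper analyzes.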
https://hdl.handle.net/11365/36654