Marra, G., Tiezzi, M., Melacci, S., Betti, A., Maggini, M., Gori, M. (2020). Local Propagation in Constraint-based Neural Networks. In 2020 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). New York: Institute of Electrical and Electronics Engineers Inc. [10.1109/IJCNN48605.2020.9207043].
Local Propagation in Constraint-based Neural Networks
Tiezzi, M.; Melacci, S.; Betti, A.; Maggini, M.; Gori, M.
2020-01-01
Abstract
In this paper we study a constraint-based representation of neural network architectures. We cast the learning problem in the Lagrangian framework and we investigate a simple optimization procedure that is well suited to fulfil the so-called architectural constraints, learning from the available supervisions. The computational structure of the proposed Local Propagation (LP) algorithm is based on the search for saddle points in the adjoint space composed of weights, neural outputs, and Lagrange multipliers. All the updates of the model variables are locally performed, so that LP is fully parallelizable over the neural units, circumventing the classic problem of gradient vanishing in deep networks. The implementation of popular neural models is described in the context of LP, together with those conditions that trace a natural connection with Backpropagation. We also investigate the setting in which we tolerate bounded violations of the architectural constraints, and we provide experimental evidence that LP is a feasible approach to train shallow and deep networks, opening the road to further investigations on more complex architectures, easily describable by constraints.

File | Type | License | Size | Format
---|---|---|---|---
melacci_IJCNN2020a.pdf (not available) | Publisher's PDF | Not public (private/restricted access) | 947.64 kB | Adobe PDF
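
As a rough illustration of the saddle-point search described in the abstract, the following minimal Python sketch treats the hidden output of a two-unit scalar chain as a free variable tied to its activation by a constraint. It is not the authors' implementation: the toy model, the names `local_propagation_step` and `train`, the learning rate `lr`, and the quadratic penalty weight `rho` (an augmented-Lagrangian term added here for numerical stability) are all assumptions made for this example.

```python
import math


def local_propagation_step(data, w1, w2, x1, lam, lr=0.05, rho=1.0):
    """One simultaneous descent/ascent step on a toy Lagrangian.

    Hypothetical toy model (not the paper's code): the chain
    x0 -> tanh(w1 * x0) -> w2 * x1, where x1[i] is a free variable per
    example, tied to tanh(w1 * x0) by the constraint x1[i] - tanh(...) = 0.
    """
    g_w1 = 0.0
    g_w2 = 0.0
    new_x1, new_lam = [], []
    for i, (x0, y) in enumerate(data):
        h = math.tanh(w1 * x0)
        viol = x1[i] - h              # constraint violation, ideally 0
        err = w2 * x1[i] - y          # output error on the supervision
        coeff = lam[i] + rho * viol   # multiplier plus assumed penalty term
        # Gradients of 0.5*err^2 + lam*viol + 0.5*rho*viol^2.
        g_w1 += -coeff * (1.0 - h * h) * x0
        g_w2 += err * x1[i]
        g_x1 = err * w2 + coeff
        new_x1.append(x1[i] - lr * g_x1)    # descent on the neural output
        new_lam.append(lam[i] + lr * viol)  # ascent on the multiplier
    n = len(data)
    # Descent on the weights, averaged over the examples.
    return w1 - lr * g_w1 / n, w2 - lr * g_w2 / n, new_x1, new_lam


def train(data, steps=500):
    """Run repeated steps from a start that satisfies the constraint."""
    w1, w2 = 0.5, 0.5
    x1 = [math.tanh(w1 * x0) for x0, _ in data]
    lam = [0.0] * len(data)
    for _ in range(steps):
        w1, w2, x1, lam = local_propagation_step(data, w1, w2, x1, lam)
    return w1, w2, x1, lam


if __name__ == "__main__":
    toy = [(-1.0, -0.5), (0.5, 0.3), (1.0, 0.5)]  # made-up (input, target) pairs
    w1, w2, x1, lam = train(toy)
    worst = max(abs(x - math.tanh(w1 * x0)) for x, (x0, _) in zip(x1, toy))
    print("largest constraint violation:", worst)
```

In this sketch each update reads only quantities attached to its own unit (its input, its output variable, and its multiplier), which is the locality property the abstract refers to. When the constraints hold exactly and the multipliers are zero, `w1` receives no update at all.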
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1122683