
Continual Neural Computation

Becattini, Federico; Melacci, Stefano
2024-01-01

Abstract

Continuously processing a stream of non-i.i.d. data with neural models, with the goal of progressively learning new skills, is well known to introduce significant challenges, frequently leading to catastrophic forgetting. In this paper we tackle this problem by focusing on the low-level aspects of the neural computation model, unlike most existing approaches. We propose a novel neuron model, referred to as the Continual Neural Unit (CNU), which not only computes a response to an input pattern, but also diversifies its computations to preserve what was previously learned, while remaining plastic enough to adapt to new knowledge. The values attached to the weights are the outcome of a computational process that depends on the neuron input, implemented by a key-value map that selects and blends multiple sets of learnable memory units. This computational mechanism implements a natural, learnable form of soft parameter isolation, virtually defining multiple computational paths within each neural unit. We show that such a computational scheme is related to those of popular models that perform computations over a set of samples stored in a memory buffer, including Kernel Machines and Transformers. Experiments on class- and domain-incremental streams, processed in an online, single-pass manner, show how CNUs can mitigate forgetting without any replay or more informed learning criteria, while keeping competitive or better performance with respect to continual learning methods that explicitly store and replay data.
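The abstract describes a unit whose weights are computed at inference time by a key-value map that selects and softly blends multiple learnable weight memories. The following minimal PyTorch sketch illustrates one way such an input-conditioned blending layer could look; it is not the authors' implementation, and the class name ContinualUnitLayer, the num_memories and temperature parameters, and the cosine-similarity key matching are assumptions made purely for illustration.

# Illustrative sketch (not the paper's code) of a CNU-like layer: each layer
# keeps num_memories candidate weight sets; a key-value map over the input
# produces blending coefficients that softly select among them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinualUnitLayer(nn.Module):
    def __init__(self, in_features, out_features, num_memories=4, temperature=1.0):
        super().__init__()
        # one learnable key per memory slot, matched against the (normalized) input
        self.keys = nn.Parameter(torch.randn(num_memories, in_features))
        # num_memories candidate weight sets and biases for this layer
        self.weights = nn.Parameter(torch.randn(num_memories, out_features, in_features) * 0.01)
        self.biases = nn.Parameter(torch.zeros(num_memories, out_features))
        self.temperature = temperature

    def forward(self, x):
        # x: (batch, in_features)
        # similarity between input and keys yields a soft selection of memories
        scores = F.normalize(x, dim=-1) @ F.normalize(self.keys, dim=-1).t()   # (batch, M)
        alpha = F.softmax(scores / self.temperature, dim=-1)                   # (batch, M)
        # blend the candidate weight sets per sample (soft parameter isolation)
        w = torch.einsum('bm,moi->boi', alpha, self.weights)                   # (batch, out, in)
        b = alpha @ self.biases                                                # (batch, out)
        # input-dependent weights: each sample follows its own computational path
        return torch.einsum('boi,bi->bo', w, x) + b

As a usage example, ContinualUnitLayer(784, 10, num_memories=4) could replace a standard linear layer in a small classifier; the coefficients alpha decide, per input, how strongly each stored weight set contributes, so different inputs can exercise different blends of the stored memories.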
Year: 2024
ISBN: 9783031703430 (print); 9783031703447 (online)
Tiezzi, M., Marullo, S., Becattini, F., Melacci, S. (2024). Continual Neural Computation. In Machine Learning and Knowledge Discovery in Databases. Research Track European Conference, ECML PKDD 2024, Proceedings, Part II (pp.340-356). Cham : Springer [10.1007/978-3-031-70344-7_20].
Files in this record:
File: 978-3-031-70344-7_20.pdf
Type: Publisher's PDF
License: NOT PUBLIC - Private/restricted access
Size: 1.94 MB
Format: Adobe PDF
Access: not available (a copy can be requested)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1277218