
Chagnon, J., Hagenbuchner, M., Tsoi, A.C., Scarselli, F. (2023). Going Deeper with Recursive Convolutional Layers. In 2023 International Joint Conference on Neural Networks (IJCNN). New York : IEEE [10.1109/IJCNN54540.2023.10191350].

Going Deeper with Recursive Convolutional Layers

Scarselli F.
2023-01-01

Abstract

The development of Convolutional Neural Networks (CNNs) trends towards models with an ever-growing number of Convolutional Layers (CLs), which significantly increases the number of trainable parameters. Such models are sensitive to these structural parameters, so large models have to be carefully tuned through hyperparameter optimisation, a process that can be very time consuming. In this paper, we study the use of Recursive Convolutional Layers (RCLs), a module relying on an algebraic feedback loop wrapped around a CL, which can replace any CL in a CNN. Using three publicly available datasets, CIFAR10, CIFAR100 and SVHN, and a simple model composed of 4 RCLs, we compare its performance with that of its feedforward counterpart, and exhibit some core properties and use cases of RCLs. In particular, we show that RCLs can lead to better-performing models, and that reducing the number of modules from four to one leads to an average decrease in accuracy of 3.5% for models using RCLs, compared to 23% using CLs. Hence, the resulting architecture is much more robust to the addition or removal of layers. We conclude by relating the effects obtained using additional CLs to those obtained using additional recursions on RCLs, which suggests that the latter can simulate an increase in depth at no extra cost in parameters. These results point to the potential benefits of replacing some or all CLs by RCLs in most recently introduced CNNs.
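The abstract describes an RCL as a feedback loop wrapped around a single convolutional layer, so that repeated recursion adds effective depth while the parameter count stays that of one CL. The paper's exact formulation is not given here, so the following is only a minimal illustrative sketch in NumPy: a single-channel "same" convolution whose kernel is reused for several recursion steps, with the layer input added back at each step as a hypothetical feedback term (the `steps` count and the additive feedback are assumptions for illustration, not the authors' definition).

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive single-channel 2-D convolution with zero 'same' padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def recursive_conv_layer(x, kernel, steps=3):
    """Sketch of a Recursive Convolutional Layer (assumed form):
    the SAME kernel is applied `steps` times, with the layer input x
    fed back additively at every step, followed by a ReLU. Increasing
    `steps` deepens the computation without adding any parameters."""
    h = x
    for _ in range(steps):
        h = np.maximum(conv2d_same(h, kernel) + x, 0.0)  # ReLU
    return h
```

With this sketch, varying `steps` plays the role that stacking extra CLs would play in a feedforward network, which mirrors the abstract's point that recursion can simulate additional depth at no extra parameter cost.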
Year: 2023
ISBN: 978-1-6654-8867-9
File: Going_Deeper_with_Recursive_Convolutional_Layers.pdf (publisher PDF, Adobe PDF, 1.21 MB; restricted access, copy available on request)
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1244094