Rossi, A., Hagenbuchner, M., Scarselli, F., & Tsoi, A.C. (2021). A Study on the effects of recursive convolutional layers in convolutional neural networks. Neurocomputing, 460, 59-70. https://doi.org/10.1016/j.neucom.2021.07.021
A Study on the effects of recursive convolutional layers in convolutional neural networks
Rossi A.; Hagenbuchner M.; Scarselli F.; Tsoi A.C.
2021-01-01
Abstract
To overcome problems in the design of large networks, particularly with respect to network depth, this paper presents a new model of convolutional neural network (CNN) that features fully recursive convolutional layers (RCLs). An RCL generalizes the classic one-stage feedforward convolutional layer (CL) by adding full direct feedback connections from the outputs of the CL to its inputs. A traditional deep CNN consisting of many CLs can then be generalized to include some CLs and some RCLs in its intermediate stages. We call the corresponding network a Convolutional Neural Network with Fully Recursive Perceptron Network (C-FRPN). Through an analysis of results obtained by applying the C-FRPN to three benchmark image classification datasets (CIFAR-10, SVHN, and ISIC), it is found that: (i) in general, a C-FRPN, even with only one RCL, outperforms the corresponding deep CNN built entirely from CLs, under the constraint of having the same number of unknown parameters; (ii) the performance of the C-FRPN varies with (a) where the RCLs are located and (b) the number of RCLs in the C-FRPN; and (iii) the effectiveness of the RCLs depends on the size of the training dataset. The results suggest that: (a) it is advisable to use RCLs, particularly when training on very large datasets; (b) it is best to place RCLs close to the input layer of the C-FRPN; and (c) it is advisable to increase the number of RCLs as long as the training dataset can sustain this without overfitting.
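The abstract describes the RCL only informally. As a rough illustration, the following is a minimal sketch, assuming a PyTorch formulation in which the layer's output is fed back to its input through a second convolution and the recursion is unrolled for a fixed number of steps. The class name RecursiveConvLayer, the num_steps parameter, the additive feedback, and the sharing of weights across steps are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a recursive convolutional layer (RCL); the paper's
# exact formulation is not given in this abstract.
import torch
import torch.nn as nn


class RecursiveConvLayer(nn.Module):
    """A convolutional layer with direct feedback from its outputs to its inputs."""

    def __init__(self, in_channels: int, out_channels: int, num_steps: int = 3):
        super().__init__()
        # Feedforward path: the classic one-stage convolutional layer (CL).
        self.feedforward = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Feedback path: maps the layer's output back to its input space
        # (assumption: a 3x3 convolution; the paper may use a different mapping).
        self.feedback = nn.Conv2d(out_channels, in_channels, kernel_size=3, padding=1)
        self.num_steps = num_steps
        self.activation = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # First step: a plain feedforward pass, as in a classic CL.
        h = self.activation(self.feedforward(x))
        # Remaining steps: feed the current output back to the input and
        # recompute. Weights are shared across steps, so the unrolling adds
        # iterations, not parameters.
        for _ in range(self.num_steps - 1):
            h = self.activation(self.feedforward(x + self.feedback(h)))
        return h
```

Under these assumptions, substituting such a layer for an early convolutional layer in a standard CNN would correspond to the placement the abstract recommends, namely RCLs close to the input layer.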
https://hdl.handle.net/11365/1153503