Trentin, E. (2001). Networks with trainable amplitude of activation functions. Neural Networks, 14(4-5), 471-493. doi:10.1016/S0893-6080(01)00028-4
Networks with trainable amplitude of activation functions
Edmondo Trentin
2001
Abstract
Network training algorithms have heavily concentrated on the learning of connection weights. Little effort has been made to learn the amplitude of activation functions, which defines the range of values that the function can take. This paper introduces novel algorithms to learn the amplitudes of nonlinear activations in layered networks, without any assumption on their analytical form. Three instances of the algorithms are developed: (i) a common amplitude is shared among all nonlinear units; (ii) each layer has its own amplitude; and (iii) neuron-specific amplitudes are allowed. The algorithms can also be seen as a particular double-step gradient-descent procedure, as gradient-driven adaptive learning rate schemes, or as weight-grouping techniques that are consistent with known scaling laws for regularization with weight decay. As a side effect, a self-pruning mechanism of redundant neurons may emerge. Experimental results on function approximation, classification, and regression tasks, with synthetic and real-world data, validate the approach and show that the algorithms speed up convergence and modify the search path in the weight space, possibly reaching deeper minima that may also improve generalization.
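The abstract describes learning the amplitudes of nonlinear activations jointly with the connection weights by gradient descent. The sketch below is a minimal, hedged illustration of variant (iii), where each sigmoid neuron has its own trainable amplitude updated by the same gradient step as the weights; it is not the paper's reference implementation, the class name `AmplitudeSigmoidLayer` and the toy sine-regression target are assumptions, and the paper's specific double-step update and learning-rate choices are not reproduced.

```python
# Illustrative sketch (assumed names, not from the paper): a fully connected
# layer computing y_j = lambda_j * sigmoid(w_j . x + b_j), where the
# per-neuron amplitudes lambda_j are learned jointly with the weights.
import torch
import torch.nn as nn


class AmplitudeSigmoidLayer(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # One trainable amplitude per neuron, initialised to 1 (plain sigmoid).
        self.amplitude = nn.Parameter(torch.ones(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Amplitude scales the output range of each sigmoid unit.
        return self.amplitude * torch.sigmoid(self.linear(x))


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy regression: the optimizer updates weights and amplitudes together.
    net = nn.Sequential(AmplitudeSigmoidLayer(1, 8), nn.Linear(8, 1))
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    x = torch.linspace(-3, 3, 64).unsqueeze(1)
    y = torch.sin(x)  # assumed toy target, for illustration only
    for _ in range(2000):
        opt.zero_grad()
        loss = ((net(x) - y) ** 2).mean()  # squared-error loss
        loss.backward()  # gradients w.r.t. weights and amplitudes alike
        opt.step()
    print("final loss:", loss.item())
```

Variants (i) and (ii) of the abstract would correspond to sharing a single amplitude parameter across the whole network or across one layer, respectively, instead of allocating one per neuron.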
File | Type | License | Size | Format | Access
---|---|---|---|---|---
05-lambda.pdf | Post-print | Public, with copyright | 394.32 kB | Adobe PDF | Not available (request a copy)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/22245