Betti, A., Ciravegna, G., Gori, M., Melacci, S., Mottin, K., Precioso, F. (2023). Toward Novel Optimizers: A Moreau-Yosida View of Gradient-Based Learning. In AIxIA 2023 – Advances in Artificial Intelligence: XXIInd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023 (pp. 218–230). Cham: Springer [10.1007/978-3-031-47546-7_15].
Toward Novel Optimizers: A Moreau-Yosida View of Gradient-Based Learning
Gori, M.; Melacci, S.
2023-01-01
Abstract
Machine Learning (ML) strongly relies on optimization procedures based on gradient descent. Several gradient-based update schemes, especially in the context of neural networks, have been proposed in the scientific literature and have become common optimizers in ML software libraries. In this paper, we re-frame gradient-based update strategies under the unifying lens of a Moreau-Yosida (MY) approximation of the loss function. By means of a first-order Taylor expansion, we make the MY approximation concretely exploitable to generalize the model update. In turn, this makes it easy to evaluate and compare the regularization properties that underlie the most common optimizers, such as gradient descent with momentum, ADAGRAD, RMSprop, and ADAM. The MY-based unifying view opens up the possibility of designing novel update schemes with customizable regularization properties. As a case study, we propose to use the network outputs to deform the notion of closeness in the parameter space.
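As a rough illustration of the construction the abstract refers to, the following is a minimal sketch of the standard Moreau-Yosida (proximal-point) view of a loss and of its first-order linearization, which recovers plain gradient descent and, with a non-Euclidean proximity term, preconditioned (adaptive) updates. The symbols (loss $L$, parameters $w_t$, step size $\gamma$, metric $M$) are illustrative assumptions and not necessarily the notation used in the paper.

```latex
% Moreau-Yosida / proximal-point update: minimize the loss plus a
% proximity term that keeps the new parameters close to the current w_t.
\[
  w_{t+1} \;=\; \arg\min_{z}\; L(z) \;+\; \frac{1}{2\gamma}\,\lVert z - w_t \rVert^2 .
\]
% Linearizing L around w_t with a first-order Taylor expansion,
%   L(z) \approx L(w_t) + \nabla L(w_t)^\top (z - w_t),
% makes the minimization explicit and yields plain gradient descent:
\[
  w_{t+1} \;=\; w_t \;-\; \gamma\,\nabla L(w_t).
\]
% Replacing the Euclidean norm with a metric M, i.e.
%   \lVert z - w_t \rVert_M^2 = (z - w_t)^\top M (z - w_t),
% gives a preconditioned update of the kind used by adaptive optimizers:
\[
  w_{t+1} \;=\; w_t \;-\; \gamma\, M^{-1} \nabla L(w_t).
\]
```

In this reading, different choices of the proximity term (momentum-like terms, diagonal metrics built from past gradients, or output-dependent metrics as in the paper's case study) correspond to different regularization properties of the resulting optimizer.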
| File | Type | License | Size | Format |
|---|---|---|---|---|
| melacci_AIXIA2023.pdf | Publisher's PDF (not available for download) | NON-PUBLIC - Private/restricted access (request a copy) | 286.62 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1252641