Gultekin, S., Globo, A., Zugarini, A., Ernandes, M., Rigutini, L. (2024). An energy-based comparative analysis of common approaches to text classification in the Legal domain. In 4th International Conference on AI, Machine Learning and Applications (AIMLA 2024), January 27-28, 2024, Copenhagen, Denmark (pp. 31-41). Tamil Nadu: AIRCC Publishing Corporation. doi:10.5121/csit.2024.140203
An energy-based comparative analysis of common approaches to text classification in the Legal domain
Globo, Achille (Validation); Zugarini, Andrea (Investigation); Ernandes, Marco (Collaboration Group member); Rigutini, Leonardo (Conceptualization)
2024-01-01
Abstract
Most Machine Learning research evaluates the best solutions in terms of performance. However, in the race for the best-performing model, many important aspects are often overlooked when, on the contrary, they should be carefully considered. In fact, the gaps in performance between different approaches are sometimes negligible, whereas factors such as production costs, energy consumption, and carbon footprint must be taken into consideration. Large Language Models (LLMs) are extensively adopted to address NLP problems in both academia and industry. In this work, we present a detailed quantitative comparison of LLMs and traditional approaches (e.g., SVM) on the LexGLUE benchmark, which takes into account both performance (standard indices) and alternative metrics such as timing, power consumption, and cost, in a word: the carbon footprint. In our analysis, we considered the prototyping phase (model selection through training-validation-test iterations) and the in-production phase separately, since they follow different implementation procedures and require different resources. The results indicate that the simplest algorithms very often achieve performance very close to that of large LLMs, but with much lower power consumption and resource demands. These results suggest that companies should include such additional evaluations when choosing Machine Learning (ML) solutions.
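As an illustration of the kind of measurement the abstract describes (this is not code from the paper), the sketch below pairs a traditional TF-IDF + linear SVM baseline on one LexGLUE task with an energy tracker, yielding both classification quality and an emissions estimate from a single run. The task name ("scotus"), the feature-size cap, and the use of the codecarbon library are illustrative assumptions, not the authors' setup.

```python
# Minimal sketch (assumptions, not the paper's code): measure the carbon
# footprint of a TF-IDF + linear SVM baseline on one LexGLUE task.
from codecarbon import EmissionsTracker
from datasets import load_dataset
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.svm import LinearSVC

# SCOTUS is one of the single-label LexGLUE classification tasks.
data = load_dataset("lex_glue", "scotus")

tracker = EmissionsTracker()  # logs energy use and estimated CO2eq
tracker.start()

# Traditional pipeline: sparse TF-IDF features + linear SVM.
vectorizer = TfidfVectorizer(max_features=50_000)
X_train = vectorizer.fit_transform(data["train"]["text"])
X_test = vectorizer.transform(data["test"]["text"])

clf = LinearSVC()
clf.fit(X_train, data["train"]["label"])
preds = clf.predict(X_test)

emissions_kg = tracker.stop()  # kg CO2eq for the whole train+predict run

print(f"micro-F1: {f1_score(data['test']['label'], preds, average='micro'):.3f}")
print(f"macro-F1: {f1_score(data['test']['label'], preds, average='macro'):.3f}")
print(f"estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

Wrapping an LLM fine-tuning or inference loop in the same tracker would produce directly comparable energy and CO2eq figures for the two families of approaches, separately for the prototyping and in-production phases.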
File | Size | Format | |
---|---|---|---|
csit140203.pdf (editorial PDF; license: NOT PUBLIC, private/restricted access; not available for download) | 758.74 kB | Adobe PDF | Request a copy |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1255437