The common practice in Machine Learning research is to evaluate the top-performing models based on their performance. However, this often leads to overlooking other crucial aspects that should be given careful consideration. In some cases, the performance differences between various approaches may be insignificant, whereas factors like production costs, energy consumption, and carbon footprint should be taken into account. Large Language Models (LLMs) are widely used in academia and industry to address NLP problems. In this study, we present a comprehensive quantitative comparison between traditional approaches (SVM-based) and more recent approaches such as LLM (BERT family models) and generative models (GPT-2 and LLAMA2), using the LexGLUE benchmark. Our evaluation takes into account not only performance parameters (standard indices), but also alternative measures such as timing, energy consumption and costs, which collectively contribute to the carbon footprint. To ensure a complete analysis, we separately considered the prototyping phase (which involves model selection through training-validation-test iterations) and the in-production phases. These phases follow distinct implementation procedures and require different resources. The results indicate that simpler algorithms often achieve performance levels similar to those of complex models (LLM and generative models), consuming much less energy and requiring fewer resources. These findings suggest that companies should consider additional considerations when choosing machine learning (ML) solutions. The analysis also demonstrates that it is increasingly necessary for the scientific world to also begin to consider aspects of energy consumption in model evaluations, in order to be able to give real meaning to the results obtained using standard metrics (Precision, Recall, F1 and so on).
Rigutini, L., Globo, A., Stefanelli, M., Zugarini, A., Gultekin, S., Ernandes, M. (2024). Performance, Energy Consumption and Costs: A Comparative Analysis of Automatic Text Classification Approaches in the Legal Domain. INTERNATIONAL JOURNAL ON NATURAL LANGUAGE COMPUTING, 13(1), 19-35 [10.5121/ijnlc.2024.13102].
Performance, Energy Consumption and Costs: A Comparative Analysis of Automatic Text Classification Approaches in the Legal Domain
Rigutini, Leonardo
Conceptualization
;Stefanelli, MarcoInvestigation
;
2024-01-01
Abstract
The common practice in Machine Learning research is to evaluate the top-performing models based on their performance. However, this often leads to overlooking other crucial aspects that should be given careful consideration. In some cases, the performance differences between various approaches may be insignificant, whereas factors like production costs, energy consumption, and carbon footprint should be taken into account. Large Language Models (LLMs) are widely used in academia and industry to address NLP problems. In this study, we present a comprehensive quantitative comparison between traditional approaches (SVM-based) and more recent approaches such as LLM (BERT family models) and generative models (GPT-2 and LLAMA2), using the LexGLUE benchmark. Our evaluation takes into account not only performance parameters (standard indices), but also alternative measures such as timing, energy consumption and costs, which collectively contribute to the carbon footprint. To ensure a complete analysis, we separately considered the prototyping phase (which involves model selection through training-validation-test iterations) and the in-production phases. These phases follow distinct implementation procedures and require different resources. The results indicate that simpler algorithms often achieve performance levels similar to those of complex models (LLM and generative models), consuming much less energy and requiring fewer resources. These findings suggest that companies should consider additional considerations when choosing machine learning (ML) solutions. The analysis also demonstrates that it is increasingly necessary for the scientific world to also begin to consider aspects of energy consumption in model evaluations, in order to be able to give real meaning to the results obtained using standard metrics (Precision, Recall, F1 and so on).File | Dimensione | Formato | |
---|---|---|---|
13124ijnlc02.pdf
accesso aperto
Tipologia:
PDF editoriale
Licenza:
PUBBLICO - Pubblico con Copyright
Dimensione
230.06 kB
Formato
Adobe PDF
|
230.06 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11365/1260905