Casoni, M., Guidi, T., Tiezzi, M., Betti, A., Gori, M., Melacci, S. (2024). Pitfalls in Processing Infinite-Length Sequences with Popular Approaches for Sequential Data. In Artificial Neural Networks in Pattern Recognition (pp. 37-48). Cham: Springer. https://doi.org/10.1007/978-3-031-71602-7_4
Pitfalls in Processing Infinite-Length Sequences with Popular Approaches for Sequential Data
Casoni Michele; Gori Marco; Melacci Stefano
2024-01-01
Abstract
One of the enduring challenges for the Machine Learning community is developing models that can process and learn from very long data sequences. Transformer-based models and Recurrent Neural Networks (RNNs) have excelled in processing long sequences, yet they face challenges in transitioning to the online processing of infinite-length sequences, a crucial step in mimicking human learning over continuous data streams. While Transformer models can handle large context windows, they suffer from quadratic computational costs, motivating research into alternative attention mechanisms. Conversely, RNNs, particularly Deep State-Space Models (SSMs), have shown promise in long-sequence tasks, outperforming Transformers on certain benchmarks. However, current approaches are limited to finite-length sequences, which are pre-buffered and randomly shuffled to suit the requirements of stochastic gradient descent. This paper addresses the fundamental gap in transitioning from the offline processing of a dataset of sequences to the online processing of possibly infinite-length sequences, a scenario often neglected in existing research. Empirical evidence is presented, demonstrating the performance and limits of existing models. We highlight the challenges and opportunities in learning from a continuous data stream, paving the way for future research in this area.
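To make the contrast described in the abstract concrete, the following is a minimal, hypothetical sketch (not the method evaluated in the paper): a linear, SSM-style recurrence h_t = A h_{t-1} + B x_t applied first in the offline regime (finite sequences, pre-buffered and shuffled) and then in the online regime (a single, potentially infinite stream processed with constant memory). All names, dimensions, and the choice of a fixed, untrained recurrence are illustrative assumptions.

```python
# Illustrative sketch only: contrasts offline processing of buffered, shuffled
# sequences with online processing of a potentially infinite stream using a
# simple linear state-space-style recurrence. Not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
state_dim, input_dim = 8, 4

A = 0.9 * np.eye(state_dim)                      # stable state transition (assumed)
B = 0.1 * rng.normal(size=(state_dim, input_dim))

def step(h, x):
    """One recurrent update: constant memory, no buffering of past inputs."""
    return A @ h + B @ x

# --- Offline regime: finite sequences, pre-buffered and randomly shuffled ---
dataset = [rng.normal(size=(100, input_dim)) for _ in range(32)]  # 32 sequences
rng.shuffle(dataset)          # shuffling presupposes the whole dataset exists
for seq in dataset:
    h = np.zeros(state_dim)   # state is reset between sequences
    for x in seq:
        h = step(h, x)

# --- Online regime: one (possibly infinite) stream, processed as it arrives ---
def stream():
    while True:               # no end, no buffering, no shuffling
        yield rng.normal(size=input_dim)

h = np.zeros(state_dim)
for t, x in enumerate(stream()):
    h = step(h, x)            # a single, never-reset state
    if t >= 1000:             # stop the demo; a real stream would not stop
        break
print("final state norm:", float(np.linalg.norm(h)))
```

The point of the sketch is the control flow, not the model: in the online regime there is no dataset to buffer or shuffle and the state is never reset, which is precisely the scenario the abstract describes as neglected in existing research.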
| File | Type | License | Size | Format |
|---|---|---|---|---|
| melacci_ANNPR2024.pdf (not available: request a copy) | Publisher's PDF | Non-public, private/restricted access | 397.7 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1274934