

PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks

Faggi, Lapo; Tiezzi, Matteo; Gori, Marco; Melacci, Stefano
2022-01-01

Abstract

In this paper, we present PARTIME, a software library written in Python and based on PyTorch, designed specifically to speed up neural networks whenever data is continuously streamed over time, for both learning and inference. Existing libraries are designed to exploit data-level parallelism, assuming that samples are batched, a condition that is not naturally met in applications based on streamed data. In contrast, PARTIME starts processing each data sample as soon as it becomes available from the stream. PARTIME wraps the code that implements a feed-forward multi-layer network and distributes the layer-wise processing across multiple devices, such as Graphics Processing Units (GPUs). Thanks to its pipeline-based computational scheme, PARTIME allows the devices to perform computations in parallel. At inference time, this results in scaling capabilities that are theoretically linear in the number of devices. During the learning stage, PARTIME can exploit the non-i.i.d. nature of the streamed data, whose samples evolve smoothly over time, for efficient gradient computations. Experiments empirically compare PARTIME with classic non-parallel neural computations in online learning, distributing operations over up to 8 NVIDIA GPUs and showing significant speedups that are almost linear in the number of devices while mitigating the impact of data transfer overhead.
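To make the pipeline-based scheme described in the abstract more concrete, the following is a minimal, illustrative PyTorch sketch, not the actual PARTIME API: one feed-forward stage is placed on each device and, at every stream tick, each stage processes the activation handed over by the previous stage at the previous tick, so that all devices can work concurrently on different, time-shifted samples. The stage sizes, the pipeline_step helper, and the buffering scheme are hypothetical.

import torch
import torch.nn as nn

# Illustrative only: this is NOT the PARTIME API. It sketches the idea of
# placing one feed-forward stage per device and pipelining a data stream,
# sample by sample, across the stages. Names and sizes are hypothetical.

devices = [torch.device(f"cuda:{i}") for i in range(torch.cuda.device_count())] or [torch.device("cpu")]

# One stage per device: a small feed-forward block (hypothetical sizes).
stages = [nn.Sequential(nn.Linear(128, 128), nn.ReLU()).to(d) for d in devices]

# buffers[k] holds the activation currently waiting to be processed by stage k.
buffers = [None] * len(stages)

def pipeline_step(x_t):
    """Advance the pipeline by one stream tick with the new sample x_t."""
    buffers[0] = x_t.to(devices[0], non_blocking=True)
    # Each stage processes the activation handed over by the previous stage
    # at the previous tick. On real hardware the stages run concurrently on
    # their own GPUs; this sequential loop only illustrates the data flow.
    outputs = [stages[k](buffers[k]) if buffers[k] is not None else None
               for k in range(len(stages))]
    # Hand activations one stage (and one device) forward for the next tick.
    # detach() keeps this an inference-style sketch: gradients are not
    # propagated across stages here.
    for k in range(len(stages) - 1, 0, -1):
        prev = outputs[k - 1]
        buffers[k] = prev.detach().to(devices[k], non_blocking=True) if prev is not None else None
    # The network output for x_t emerges len(stages) - 1 ticks later
    # (None until the pipeline is full).
    return outputs[-1]

# Usage: feed the stream one sample at a time, as soon as each sample arrives.
# stream = ...  # any iterable of tensors of shape (1, 128)
# for x_t in stream:
#     y = pipeline_step(x_t)

Each call advances every stage by one tick, so with K devices all K stages can run concurrently on different time-shifted samples; this is where the theoretically linear inference-time scaling mentioned in the abstract comes from, at the price of a latency of K - 1 ticks on the output and of the device-to-device transfers whose overhead the experiments measure.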
Year: 2022
ISBN: 978-1-6654-6283-9
Meloni, E., Faggi, L., Marullo, S., Betti, A., Tiezzi, M., Gori, M., et al. (2022). PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 665-670). New York: IEEE. doi: 10.1109/ICMLA55696.2022.00110
File available for this record:
melacci_ICMLA2022.pdf: publisher PDF (Adobe PDF, 638.09 kB); access restricted (not publicly available; a copy can be requested).

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1231239