In this paper we describe a video surveillance system able to detect traffic events in videos acquired by fixed videocameras on highways. The events of interest consist in a specific sequence of situations that occur in the video, as for instance a vehicle stopping on the emergency lane. Hence, the detection of these events requires to analyze a temporal sequence in the video stream. We compare different approaches that exploit architectures based on Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). A first approach extracts vectors of features, mostly related to motion, from each video frame and exploits a RNN fed with the resulting sequence of vectors. The other approaches are based directly on the sequence of frames, that are eventually enriched with pixel-wise motion information. The obtained stream is processed by an architecture that stacks a CNN and a RNN, and we also investigate a transfer-learning-based model. The results are very promising and the best architecture will be tested online in real operative conditions.
Tiezzi, M., Melacci, S., Maggini, M., Frosini, A. (2018). Video surveillance of highway traffic events by deep learning architectures. In Artificial Neural Networks and Machine Learning – ICANN 2018 (pp.584-593). Berlin : Springer Verlag [10.1007/978-3-030-01424-7_57].
Video surveillance of highway traffic events by deep learning architectures
TIEZZI, MATTEO;Melacci, Stefano;Maggini, Marco;
2018-01-01
Abstract
In this paper we describe a video surveillance system able to detect traffic events in videos acquired by fixed videocameras on highways. The events of interest consist in a specific sequence of situations that occur in the video, as for instance a vehicle stopping on the emergency lane. Hence, the detection of these events requires to analyze a temporal sequence in the video stream. We compare different approaches that exploit architectures based on Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). A first approach extracts vectors of features, mostly related to motion, from each video frame and exploits a RNN fed with the resulting sequence of vectors. The other approaches are based directly on the sequence of frames, that are eventually enriched with pixel-wise motion information. The obtained stream is processed by an architecture that stacks a CNN and a RNN, and we also investigate a transfer-learning-based model. The results are very promising and the best architecture will be tested online in real operative conditions.File | Dimensione | Formato | |
---|---|---|---|
melacci_ICANN2018a.pdf
non disponibili
Tipologia:
PDF editoriale
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
1.25 MB
Formato
Adobe PDF
|
1.25 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11365/1065979