Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams

IRIS

Devising intelligent agents able to live in an environment and learn by observing the surroundings is a longstanding goal of Artificial Intelligence. From a bare Machine Learning perspective, challenges arise when the agent cannot leverage large fully annotated datasets and its interactions with supervisory signals are instead sparsely distributed over space and time. This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream. The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations. Spatio-temporal stochastic coherence along the attention trajectory, paired with a contrastive term, leads to an unsupervised learning criterion that naturally copes with the considered setting. Unlike most existing works, the learned representations are used in open-set class-incremental classification of each frame pixel, relying on few supervisions. Our experiments leverage 3D virtual environments and show that the proposed agents can learn to distinguish objects just by observing the video stream. Inheriting features from state-of-the-art models is not as powerful as one might expect.

Tiezzi, M., Marullo, S., Faggi, L., Meloni, E., Betti, A., Melacci, S. (2022). Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams. In IJCAI International Joint Conference on Artificial Intelligence (pp. 3480-3486). International Joint Conferences on Artificial Intelligence [10.24963/ijcai.2022/483].

Tiezzi M.; Marullo S.; Faggi L.; Meloni E.; Betti A.; Melacci S.

2022-01-01

Year: 2022
ISBN: 9781956792003
Appears in types: 4.1 Conference proceedings contribution

Files in this record:

File	Access	Type	License	Size	Format
melacci_IJCAI2022.pdf	Open access	Publisher's PDF	PUBBLICO - Public with copyright	1.3 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1218735