Visual attention refers to the human brain’s ability to select relevant sensory information for preferential processing, improving performance in visual and cognitive tasks. It proceeds in two phases: in the first, visual feature maps are acquired and processed in parallel; in the second, the information from these maps is merged to select a single location to be attended for further, more complex computations and reasoning. Its computational description is challenging, especially if the temporal dynamics of the process are taken into account. Numerous methods to estimate saliency have been proposed in the last three decades. They achieve almost perfect performance in estimating saliency at the pixel level, but the way they generate shifts in visual attention depends entirely on winner-take-all (WTA) circuitry. WTA is implemented by the biological hardware to select the location with maximum saliency, towards which overt attention is directed. In this paper we propose a gravitational model to describe the attentional shifts. Every feature acts as an attractor, and the shifts result from the joint effects of all attractors. In this framework, the assumption of a single, centralized saliency map is no longer necessary, though still plausible. Quantitative results on two large image datasets show that this model predicts shifts more accurately than winner-take-all.
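The contrast drawn in the abstract can be illustrated with a toy sketch. It is not the paper's formulation: the softened inverse-square pull, the damping term, the explicit Euler integration, and every name used (`wta_fixation`, `net_attraction`, `simulate_shifts`, `softening`, `damping`) are assumptions chosen only for illustration. WTA picks the single most salient location, whereas the gravitational view moves the focus of attention under the summed pull of all feature "masses".

```python
import math


def wta_fixation(saliency):
    """Winner-take-all: return (row, col) of the single most salient cell."""
    cells = ((r, c) for r, row in enumerate(saliency) for c in range(len(row)))
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])


def net_attraction(position, masses, softening=1.0):
    """Sum the pulls that all cells exert on `position`.

    Assumed law (illustrative only): softened inverse-square attraction,
    with the feature value of each cell acting as its mass.
    """
    force_r, force_c = 0.0, 0.0
    for r, row in enumerate(masses):
        for c, mass in enumerate(row):
            if mass == 0.0:
                continue
            dr, dc = r - position[0], c - position[1]
            dist_sq = dr * dr + dc * dc + softening ** 2
            scale = mass / dist_sq ** 1.5
            force_r += scale * dr
            force_c += scale * dc
    return force_r, force_c


def simulate_shifts(masses, start, steps=200, dt=0.1, damping=0.5):
    """Integrate damped motion of the focus of attention (explicit Euler)."""
    pos = [float(start[0]), float(start[1])]
    vel = [0.0, 0.0]
    limits = (len(masses) - 1, len(masses[0]) - 1)
    path = [tuple(pos)]
    for _ in range(steps):
        force = net_attraction(pos, masses)
        for axis in range(2):
            accel = force[axis] - damping * vel[axis]
            vel[axis] += dt * accel
            pos[axis] += dt * vel[axis]
            # Keep the focus inside the image.
            pos[axis] = min(max(pos[axis], 0.0), float(limits[axis]))
        path.append(tuple(pos))
    return path


# Usage: one feature mass at the centre of an 11x11 map, focus starting at the left edge.
feature_map = [[0.0] * 11 for _ in range(11)]
feature_map[5][5] = 1.0
trajectory = simulate_shifts(feature_map, start=(5, 0), steps=50)
```

In this sketch WTA returns a location directly, while the gravitational variant yields a trajectory; with several masses the focus is drawn by all of them at once rather than by the maximum alone.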

Zanca, D., Gori, M., Melacci, S., Rufa, A. (2020). Gravitational models explain shifts on human visual attention. Scientific Reports, 10(1) [10.1038/s41598-020-73494-2].

Gravitational models explain shifts on human visual attention

Zanca, D.; Gori, M.; Melacci, S.; Rufa, A.
2020-01-01

Files in this item:
File: melacci_SCIENTIFICREPORTS2020.pdf
Access: open access
Type: Publisher's PDF
License: Creative Commons
Size: 1.15 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1122705