Bonechi, S., Andreini, P., Corradini, B.T., Scarselli, F. (2024). Diff-Props: is Semantics Preserved within a Diffusion Model?. PROCEDIA COMPUTER SCIENCE, 246, 5244-5253 [10.1016/j.procs.2024.09.628].

Diff-Props: is Semantics Preserved within a Diffusion Model?

Bonechi, Simone; Andreini, Paolo; Scarselli, Franco
2024-01-01

Abstract

The ambition to create increasingly realistic images has driven researchers to develop ever more powerful models, capable of generalizing and generating high-resolution images, even in a multimodal setup (e.g., from textual input). Among the most recent generative networks, Stable Diffusion Models (SDMs) have achieved state-of-the-art results, showing great generative capabilities but also a high degree of complexity, both in terms of training and interpretability. The impressive generalization capability of pre-trained SDMs has pushed researchers to exploit their internal representations to perform downstream tasks (e.g., classification and segmentation). Understanding how well the model preserves semantic information is fundamental to improving its performance. Our approach, named Diff-Props, analyses the features extracted from the U-Net within a Stable Diffusion Model to unveil how Stable Diffusion retains the semantic information of an image in a pre-trained setup. Using a set of different distance metrics, Diff-Props aims to analyse how features at different depths contribute to preserving the meaning of the objects in the image.
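The kind of analysis the abstract describes can be illustrated with a small sketch. It is not the authors' implementation: the abstract names neither the distance metrics nor the extraction procedure, so cosine and Euclidean distance are assumed here, random arrays stand in for U-Net activations, and every function and layer name (`separation`, `synthetic_features`, `layer_a`, `layer_b`) is hypothetical. For each feature map, the sketch compares the distance between class prototypes with the distance of pixels to their own class prototype.

```python
import numpy as np

rng = np.random.default_rng(0)


def cosine_distance(a, b):
    """1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b):
    """L2 distance between two 1-D vectors."""
    return float(np.linalg.norm(a - b))


def separation(features, mask, metric):
    """Return (between, within) distances for one feature map.

    features: array of shape (C, H, W), standing in for one U-Net layer.
    mask: integer label map whose sides are multiples of H and W.
    """
    c, h, w = features.shape
    # Nearest-neighbour downsampling of the labels to the feature resolution.
    labels = mask[:: mask.shape[0] // h, :: mask.shape[1] // w].reshape(-1)
    flat = features.reshape(c, -1).T  # one C-dimensional vector per pixel
    classes = [int(k) for k in np.unique(labels)]
    # A class prototype is the mean feature vector of that class's pixels.
    protos = {k: flat[labels == k].mean(axis=0) for k in classes}
    within = np.mean([metric(x, protos[int(k)]) for x, k in zip(flat, labels)])
    between = np.mean(
        [metric(protos[a], protos[b])
         for i, a in enumerate(classes) for b in classes[i + 1:]]
    )
    return float(between), float(within)


# Synthetic two-class label map: left half is class 0, right half is class 1.
mask = np.zeros((16, 16), dtype=int)
mask[:, 8:] = 1


def synthetic_features(channels, size, signal):
    """Assumed stand-in for extracted activations: class centre plus noise."""
    stride = mask.shape[0] // size
    labels = mask[::stride, ::stride]
    centers = rng.normal(size=(2, channels))
    feats = signal * centers[labels]            # (size, size, channels)
    feats = feats + rng.normal(size=feats.shape)
    return feats.transpose(2, 0, 1)             # (channels, size, size)


feature_maps = {
    "layer_a": synthetic_features(channels=8, size=16, signal=0.0),  # noise only
    "layer_b": synthetic_features(channels=32, size=8, signal=3.0),  # class signal
}
metrics = {"cosine": cosine_distance, "euclidean": euclidean_distance}

results = {
    layer: {name: separation(feats, mask, fn) for name, fn in metrics.items()}
    for layer, feats in feature_maps.items()
}

for layer, per_metric in results.items():
    for name, (between, within) in per_metric.items():
        print(f"{layer} {name}: between={between:.3f} within={within:.3f}")
```

A between-to-within ratio well above 1 suggests that a layer's features separate the labelled classes. With real data, the synthetic arrays would be replaced by activations captured from the pre-trained U-Net and by a ground-truth segmentation mask.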
Files in this item:
File: Diff-Props: is Semantics Preserved within a Diffusion Model?.pdf
Access: open access
Type: Publisher's PDF
License: Creative Commons
Size: 779.04 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1279577