Bonechi, S., Andreini, P., Corradini, B.T., Scarselli, F. (2024). Diff-Props: is Semantics Preserved within a Diffusion Model? Procedia Computer Science, 246, 5244-5253. https://doi.org/10.1016/j.procs.2024.09.628
Diff-Props: is Semantics Preserved within a Diffusion Model?
Bonechi, Simone; Andreini, Paolo; Scarselli, Franco
2024-01-01
Abstract
The ambition to create increasingly realistic images has driven researchers to develop increasingly powerful models, capable of generalizing and generating high-resolution images, even in a multimodal setup (e.g., from textual input). Among the most recent generative networks, Stable Diffusion Models (SDMs) have achieved state-of-the-art results, showing great generative capabilities but also a high degree of complexity, both in terms of training and interpretability. Indeed, the impressive generalization capability of pre-trained SDMs has pushed researchers to exploit their internal representations to perform downstream tasks (e.g., classification and segmentation). Understanding how well the model preserves semantic information is fundamental to improving its performance. Our approach, namely Diff-Props, analyses the features extracted from the U-Net within the Stable Diffusion Model to unveil how Stable Diffusion retains the semantic information of an image in a pre-trained setup. Exploiting a set of different distance metrics, Diff-Props aims to analyse how features at different depths contribute to preserving the meaning of the objects in the image.
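As a rough illustration of the kind of analysis the abstract describes, the sketch below extracts intermediate U-Net features from a pre-trained Stable Diffusion model (via the Hugging Face diffusers library) and compares two image regions at one depth with a cosine distance. The checkpoint name, hooked blocks, timestep, empty-prompt conditioning, placeholder object masks, and choice of metric are assumptions made for illustration; this is not the authors' Diff-Props implementation.

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
# Assumed checkpoint; any pre-trained SD v1.x weights would do for this sketch.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
unet = pipe.unet

# Collect features from U-Net blocks at different depths via forward hooks.
features = {}
def save_hook(name):
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output  # down blocks return tuples
        features[name] = out.detach().float().cpu()
    return hook

for i, block in enumerate(unet.down_blocks):
    block.register_forward_hook(save_hook(f"down_{i}"))
unet.mid_block.register_forward_hook(save_hook("mid"))
for i, block in enumerate(unet.up_blocks):
    block.register_forward_hook(save_hook(f"up_{i}"))

# Encode an image into the latent space, add noise at a fixed timestep,
# and run a single denoising pass to populate `features`.
img = Image.open("image.png").convert("RGB").resize((512, 512))
x = torch.from_numpy(np.array(img)).permute(2, 0, 1).float() / 127.5 - 1.0
x = x.unsqueeze(0).to(device)

with torch.no_grad():
    latents = pipe.vae.encode(x).latent_dist.sample() * pipe.vae.config.scaling_factor
    t = torch.tensor([50], device=device)                      # assumed (small) noise level
    noisy = pipe.scheduler.add_noise(latents, torch.randn_like(latents), t)
    tokens = pipe.tokenizer([""], padding="max_length",
                            max_length=pipe.tokenizer.model_max_length,
                            return_tensors="pt").input_ids.to(device)
    text_emb = pipe.text_encoder(tokens)[0]                    # unconditional (empty prompt)
    unet(noisy, t, encoder_hidden_states=text_emb)

def region_mean(feat, mask):
    """Average a (1, C, h, w) feature map over a binary object mask given at image resolution."""
    m = F.interpolate(mask[None, None].float(), size=feat.shape[-2:], mode="nearest")
    return (feat * m).sum(dim=(2, 3)) / m.sum().clamp(min=1.0)

# Placeholder masks standing in for two annotated object regions.
mask_a = torch.zeros(512, 512); mask_a[100:200, 100:200] = 1
mask_b = torch.zeros(512, 512); mask_b[300:400, 300:400] = 1

# Cosine distance between the two regions at one depth; repeating this over all
# hooked depths gives a per-depth picture of how object semantics are kept apart.
dist = 1 - F.cosine_similarity(region_mean(features["up_1"], mask_a),
                               region_mean(features["up_1"], mask_b))
print(f"up_1 cosine distance: {dist.item():.4f}")
```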
| File | Type | License | Size | Format |
|---|---|---|---|---|
| Diff-Props: is Semantics Preserved within a Diffusion Model?.pdf (open access) | Publisher's PDF | Creative Commons | 779.04 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1279577