Artificial Intelligence (AI)-generated images represent a significant threat in various fields, such as security, privacy, media forensics and content moderation. In this paper, a novel approach for the detection of StyleGAN2-generated human faces is presented, leveraging a transfer learning strategy to improve the classification performance of the models. A modified version of the state-of-the-art semantic segmentation model DeepLabV3+, using either a ResNet50 or a MobileNetV3-Large as the feature extraction backbone, is used to create both a face segmentation model and the synthetic image detector. To achieve this goal, the models are first trained for face segmentation, a multi-class per-pixel classification task, on a widely used semantic segmentation dataset, achieving remarkable results for both configurations. Then, the pre-trained models are retrained on a collection of real and generated images, gathered from different sources, to solve a binary classification task, namely detecting synthetic (i.e., generated) images, thus carrying out two different transfer learning strategies. The results indicate that this targeted methodology significantly improves detection rates compared to analyzing the face as a whole, and underlines the importance of advanced image recognition technologies in tackling the challenge of detecting generated faces.
Tanfoni, M., Ceroni, E.G., Pancino, N., Bianchini, M., Maggini, M. (2024). Facial Segmentation in Deepfake Classification: a Transfer Learning Approach. PROCEDIA COMPUTER SCIENCE, 246, 4160-4168 [10.1016/j.procs.2024.09.255].
Facial Segmentation in Deepfake Classification: a Transfer Learning Approach
Marco Tanfoni; Elia Giuseppe Ceroni; Monica Bianchini; Marco Maggini
2024-01-01
File | Size | Format
---|---|---
1-s2.0-S1877050924022749-main.pdf (open access; type: publisher's PDF; license: Creative Commons) | 1.42 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1279958