Universal Detection of Backdoor Attacks via Density-Based Clustering and Centroids Analysis

IRIS

We propose a Universal Defence against backdoor attacks based on Clustering and Centroids Analysis (CCA-UD). The goal of the defence is to reveal whether a Deep Neural Network model is subject to a backdoor attack by inspecting the training dataset. CCA-UD first clusters the samples of the training set by means of density-based clustering. Then, it applies a novel strategy to detect the presence of poisoned clusters. The proposed strategy is based on a general misclassification behaviour observed when the features of a representative example of the analysed cluster are added to benign samples. The capability of inducing a misclassification error is a general characteristic of poisoned samples, hence the proposed defence is attack-agnostic. This marks a significant difference with respect to existing defences, which either can defend against only some types of backdoor attacks or are effective only when some conditions on the poisoning ratio or on the kind of triggering signal used by the attacker are satisfied. Experiments were carried out on several classification tasks and network architectures, considering different types of backdoor attacks (with either clean or corrupted labels) and triggering signals, including both global and local triggering signals as well as sample-specific and source-specific triggers. They reveal that the proposed method is very effective at defending against backdoor attacks in all cases, always outperforming the state-of-the-art techniques.
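The two stages described in the abstract (density-based clustering, then a centroid test on benign samples) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 3-dimensional features, the hypothetical linear `classify` function, the use of the centroid's deviation from a few trusted benign features as the "representative", the parameters `eps` and `min_pts`, and the 0.5 threshold.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point (-1 means noise)."""
    labels = [None] * len(points)          # None = not yet visited

    def neighbours(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:         # core point: keep expanding
                queue.extend(nb)
    return labels

def mean(vectors):
    return tuple(sum(c) / len(vectors) for c in zip(*vectors))

def classify(f):
    """Hypothetical classifier head: the third feature acts as a trigger for class 1."""
    return 0 if f[0] >= f[1] + 4 * f[2] else 1

def misclassification_ratio(cluster_feats, trusted_feats, other_benign, target):
    """Fraction of benign features pushed into `target` by the centroid deviation."""
    centroid, reference = mean(cluster_feats), mean(trusted_feats)
    deviation = tuple(c - r for c, r in zip(centroid, reference))
    shifted = [tuple(x + d for x, d in zip(f, deviation)) for f in other_benign]
    return sum(classify(s) == target for s in shifted) / len(shifted)

# Toy data: features of training samples labelled as class 1.
jitter = [(-0.2, 0.1), (0.1, -0.2), (0.2, 0.2), (-0.1, -0.1), (0.0, 0.3)]
benign_c1 = [(0 + a, 4 + b, 0.0) for a, b in jitter]
poisoned = [(4 + a, 0 + b, 4.0) for a, b in jitter]   # class-0 content plus trigger
class1_features = benign_c1 + poisoned
trusted_c1 = [(0.0, 4.0, 0.0), (0.1, 3.9, 0.0)]       # small trusted benign set
benign_c0 = [(4 + a, 0 + b, 0.0) for a, b in jitter]  # benign samples of class 0

labels = dbscan(class1_features, eps=1.0, min_pts=3)
ratios = {}
for c in sorted(set(labels) - {-1}):
    members = [f for f, l in zip(class1_features, labels) if l == c]
    ratios[c] = misclassification_ratio(members, trusted_c1, benign_c0, target=1)

THRESHOLD = 0.5                                        # assumed decision threshold
flagged = [c for c, r in ratios.items() if r > THRESHOLD]
print(ratios, flagged)
```

In this toy setting, the benign cluster has a centroid close to the trusted reference, so its deviation barely moves the class-0 features, while the cluster carrying the trigger feature drags them into class 1. This mirrors the abstract's observation that inducing misclassification is a general property of poisoned samples, without claiming to reproduce the paper's exact procedure.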

Guo, W., Tondi, B., Barni, M. (2024). Universal Detection of Backdoor Attacks via Density-Based Clustering and Centroids Analysis. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 19, 970-984 [10.1109/tifs.2023.3329426].

Wei Guo; Benedetta Tondi; Mauro Barni

2024-01-01

Year	2024
Journal in which the work is published	IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY
Appears in types	1.1 Journal article

Files in this item:

File	Access	Type	Licence	Size	Format
Universal_Detection_of_Backdoor_Attacks_via_Density-Based_Clustering_and_Centroids_Analysis.pdf	Open access	Publisher's PDF	Creative Commons	3.54 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1252335