The popularity of printing devices has multiplied the diffusion of printed documents, raising concerns regarding the security and integrity of their content. The same device that prints legitimate contracts, newspapers, and other documents can also be used for malicious purposes, such as printing counterfeit money, forging contracts, and producing illegal packaging. This calls for image forensics techniques that pinpoint criminal printed materials and trace them back to their origin. Despite some recent advances, previous works model this task as a closed-set classification problem that depends on large amounts of data. In this work, we address the source linking problem for printed color documents by treating it as a verification problem. Specifically, we aim to decide whether or not two documents have been printed by the same printer. To achieve this goal, and to cope with the data scarcity that stems from the difficulty of gathering massive amounts of printed and scanned documents, we propose an ensemble of Siamese Neural Networks whose architectures are expressly designed to work with a small training dataset. As a further distinguishing feature, the proposed approach is suited to an open-set scenario, where the printers used to produce the documents analyzed at test time are not included in the training set. Results obtained under both open-set and closed-set conditions, with a thorough comparison against available baseline methods, show classification performance higher than 97% in the closed-set scenario and higher than 86% in the open-set case, highlighting the practicality of such approaches in real-world scenarios.
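The verification idea in the abstract (compare two documents and decide whether they share a printer, using several Siamese models) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three `branch_*` functions are hand-written stand-ins for trained Siamese branches, the score mapping `exp(-distance)`, the score averaging, and the `threshold` value are hypothetical choices, and none of them is taken from the paper.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Embedder = Callable[[Vector], Vector]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_printer_score(embedder: Embedder, doc_a: Vector, doc_b: Vector) -> float:
    """Siamese-style score: the SAME embedder is applied to both inputs.

    The distance is mapped to (0, 1]; 1.0 means identical embeddings.
    """
    return math.exp(-euclidean(embedder(doc_a), embedder(doc_b)))


def ensemble_verify(embedders: Sequence[Embedder], doc_a: Vector,
                    doc_b: Vector, threshold: float = 0.8) -> bool:
    """Average the per-model scores and compare against a threshold.

    Averaging is one simple fusion rule, assumed here for illustration.
    """
    if not embedders:
        raise ValueError("at least one embedder is required")
    scores = [same_printer_score(e, doc_a, doc_b) for e in embedders]
    return sum(scores) / len(scores) >= threshold


# Hypothetical stand-ins for trained branches: each maps a patch
# (a flat list of pixel intensities) to a one-dimensional embedding.
def branch_mean(x: Vector) -> Vector:
    return [sum(x) / len(x)]


def branch_range(x: Vector) -> Vector:
    return [max(x) - min(x)]


def branch_energy(x: Vector) -> Vector:
    return [sum(v * v for v in x) / len(x)]


BRANCHES = [branch_mean, branch_range, branch_energy]

# Toy patches: a and b have similar statistics, c differs strongly.
patch_a = [0.10, 0.12, 0.11, 0.13]
patch_b = [0.11, 0.12, 0.10, 0.13]
patch_c = [0.90, 0.20, 0.75, 0.05]

if __name__ == "__main__":
    print(ensemble_verify(BRANCHES, patch_a, patch_b))
    print(ensemble_verify(BRANCHES, patch_a, patch_c))
```

Because the decision depends only on a pairwise comparison and not on a fixed list of printer labels, a scheme of this kind can in principle be applied to printers never seen during training, which is the open-set setting the abstract refers to.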

Ferreira, A., Purnekar, N., Barni, M. (2021). Ensembling Shallow Siamese Neural Network Architectures for Printed Documents Verification in Data-Scarcity Scenarios. IEEE ACCESS, 9, 133924-133939 [10.1109/ACCESS.2021.3110297].

Ensembling Shallow Siamese Neural Network Architectures for Printed Documents Verification in Data-Scarcity Scenarios

Purnekar N.;Barni M.
2021-01-01

Files in this item:
File: Ensembling_Shallow_Siamese_Neural_Network_Architectures_for_Printed_Documents_Verification_in_Data-Scarcity_Scenarios.pdf
Access: open access
Type: Publisher's PDF
License: Creative Commons
Size: 1.42 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1204090