COCO_TS Dataset: Pixel–Level Annotations Based on Weak Supervision for Scene Text Segmentation

The absence of large-scale datasets with pixel-level supervision is a significant obstacle to training deep convolutional networks for scene text segmentation. For this reason, synthetic data generation is normally employed to enlarge the training dataset. Nonetheless, synthetic data cannot reproduce the complexity and variability of natural images. In this paper, a weakly supervised learning approach is used to reduce the shift between training on real and synthetic data. Pixel-level supervision is generated for a text detection dataset (i.e., one where only bounding-box annotations are available). In particular, the COCO-Text-Segmentation (COCO_TS) dataset, which provides pixel-level supervision for the COCO-Text dataset, is created and released. The generated annotations are used to train a deep convolutional neural network for semantic segmentation. Experiments show that the proposed dataset can be used instead of synthetic data, requiring only a fraction of the training samples while significantly improving performance.

Bonechi, S., Andreini, P., Bianchini, M., Scarselli, F. (2019). COCO_TS Dataset: Pixel–Level Annotations Based on Weak Supervision for Scene Text Segmentation. In K.P. Tetko I.V. (ed.), Artificial Neural Networks and Machine Learning – ICANN 2019: Image Processing. ICANN 2019 (pp. 238-250). Berlin: Springer-Verlag [10.1007/978-3-030-30508-6_20].

Bonechi, Simone; Andreini, Paolo; Bianchini, Monica; Scarselli, Franco

2019-01-01

Year: 2019

ISBN: 978-3-030-30507-9; 978-3-030-30508-6

Appears in types: 2.1 Contribution in volume (Chapter or Essay)

Files in this item:

File	Size	Format
COCO_TS_Dataset_Pixel-level_Annotations_Based_on_Weak_Supervision_for_Scene_Text_Segmentation.pdf (not available; type: post-print; license: not public, private/restricted access)	4.98 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1080152