Vulgaris: Analysis of a Corpus for Middle-Age Varieties of Italian Language

Italian is a Romance language with roots in Vulgar Latin. Modern Italian began to emerge in Tuscany around the 14th century, a development mainly attributed to the works of Dante Alighieri, Francesco Petrarca and Giovanni Boccaccio, who are among the most acclaimed Tuscan authors of the medieval age. However, because of the past fragmentation of its territory, Italy has been characterized by a wide variety of dialects, which are often only loosely related to each other. Italian has absorbed influences from many of these dialects, as well as from other languages, owing to the rule of parts of the country by other nations such as Spain and France. In this work we present Vulgaris, a project aimed at studying a corpus of Italian textual resources by authors from different regions, spanning the period from 1200 to 1600. Each composition is associated with its author, and authors are also grouped into families, i.e., groups sharing similar stylistic or chronological characteristics. Hence, the dataset is a valuable resource not only for studying the diachronic evolution of Italian and the differences between its dialects, but also for investigating stylistic differences between individual authors. We provide a detailed statistical analysis of the data and a corpus-driven study of dialectal and diachronic varieties.

Zugarini, A., Tiezzi, M., Maggini, M. (2020). Vulgaris: Analysis of a Corpus for Middle-Age Varieties of Italian Language. In Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (pp. 150-159). Stroudsburg, PA: International Committee on Computational Linguistics (ICCL).

Zugarini, Andrea; Tiezzi, Matteo; Maggini, Marco

2020-01-01

Year: 2020
ISBN: 978-1-952148-47-7
Publication type: 4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
2020.vardial-1.14.pdf (open access; type: publisher's PDF; license: Creative Commons)	644.18 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1223676