UN CORPUS DELLA STAMPA ITALIANA LOCALE

by: Simone Torsani

Format: Article
Published: Università degli Studi di Torino 2019-12-01

Description

A corpus of the Italian local press. This paper introduces CoSIL, a corpus of articles from Italian local newspapers containing about 180,000 texts and 66,000,000 words. The corpus was built to give researchers a freely downloadable, balanced corpus of journalistic texts, as well as material for linguistic research on the online local press, a now-pervasive source of information. In addition to the objectives behind the corpus, the paper describes its design and development, focusing on its representativeness and balance.