Script Identification of Multi-Script Documents: A Survey

oleh: Kurban Ubul, Gulzira Tursun, Alimjan Aysa, Donato Impedovo, Giuseppe Pirlo, Tuergen Yibulayin

Format: Article
Diterbitkan: IEEE 2017-01-01

Deskripsi

In recent years, with the widespread of Internet and digitized processing of multi-script documents worldwide, script identification techniques have become more important in the pattern recognition field. Script identification concerns methods for identifying different scripts in multi-lingual, multi-script documents. This paper presents a comprehensive overview on research activities in the field and focuses on the most valuable results obtained so far. The most vital processes in script identification are addressed in detail: identification and discriminating methods, features extraction (local and global), and classification. Different kinds of approaches have been developed and promising results have been achieved. This paper reports SoA performance results. This paper reports methods concerning handwritten, printed, and hybrid document processing. More research is necessary to meet the performance levels essential for everyday applications.