An Enriched Information-Theoretic Definition of Semantic Similarity in a Taxonomy

by: Anna Formica, Francesco Taglino

Format: Article
Published: IEEE 2021-01-01

Description

This paper addresses the notion of semantic similarity between concepts organized according to a taxonomy, based on the well-known information content approach. This approach has been extensively evaluated in the literature over the years and, in general, outperforms other proposals that do not originate from it. However, it shows some limitations related to the notion of the generic sense of a concept. In this paper we illustrate the problem that arises with the traditional approach and propose a novel information-theoretic definition of semantic similarity in a taxonomy which also takes into account the intended sense of a concept in a given context. This proposal has been applied to some of the most representative state-of-the-art similarity measures based on the information content approach, and the experiment shows that it achieves very high correlation values with human judgment.
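
For orientation, the traditional information content approach that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard Resnik and Lin measures only, not the enriched definition proposed in the paper. The toy taxonomy, the concept counts, and all function names are hypothetical and chosen purely for the example.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical corpus counts of direct occurrences of each concept.
COUNT = {"entity": 0, "animal": 2, "vehicle": 1, "dog": 4, "cat": 3, "car": 6}


def ancestors(concept):
    """Return the concept followed by all its ancestors, up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def cumulative_counts():
    """Add each concept's count to itself and to every one of its ancestors."""
    total = {c: 0 for c in PARENT}
    for concept, n in COUNT.items():
        for a in ancestors(concept):
            total[a] += n
    return total


TOTAL = cumulative_counts()
ROOT_TOTAL = TOTAL["entity"]


def ic(concept):
    """Information content: the negative log of the concept's probability."""
    return -math.log(TOTAL[concept] / ROOT_TOTAL)


def resnik(a, b):
    """Resnik similarity: IC of the most informative common subsumer."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(ic(c) for c in common)


def lin(a, b):
    """Lin similarity: shared IC normalized by the IC of both concepts."""
    denom = ic(a) + ic(b)
    return 2 * resnik(a, b) / denom if denom > 0 else 1.0


if __name__ == "__main__":
    print("resnik(dog, cat) =", resnik("dog", "cat"))
    print("resnik(dog, car) =", resnik("dog", "car"))
    print("lin(dog, cat)    =", lin("dog", "cat"))
```

In this sketch the probability of a concept depends only on its position in the taxonomy and on the counts, so the similarity of two concepts is the same in every context. That fixed, context-independent reading is what the abstract calls the generic sense of a concept.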