Semantic Relata for the Evaluation of Distributional Models in Mandarin Chinese

Liu, H., Chersoni, E., Klyueva, N., Santus, E., & Huang, C. R. (2019). Semantic Relata for the Evaluation of Distributional Models in Mandarin Chinese. IEEE Access, 7, 145705-145713. [8854798]. https://doi.org/10.1109/ACCESS.2019.2945061

Abstract

Distributional Semantic Models (DSMs) have established themselves as a standard for the representation of word and sentence meaning. However, DSMs provide a quantitative measurement of how strongly two linguistic expressions are related, without being able to automatically classify different semantic relations. Hence the notion of semantic similarity is underspecified in DSMs. We introduce Evalution-MAN in this paper as an effort to address this underspecification problem. Following the EVALution 1.0 dataset for English, we present a dataset for evaluating DSMs on the task of identifying semantic relations in Mandarin Chinese. Moreover, we test different types of word vectors on the automatic learning of these semantic relations, and we evaluate them in both an unsupervised and a supervised setting. We find that distributional models tend, in general, to assign higher similarity scores to synonyms, and that deep learning classifiers perform best in the identification of semantic relations.
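
The underspecification described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the relatedness measure, which the abstract does not name, and it uses hypothetical toy vectors that are not taken from the dataset. The names `cosine`, `rank_by_similarity`, and the dictionary keys are illustrative only. The point is that the score orders candidates by strength of relatedness but carries no label saying which semantic relation holds.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(target, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors (assumption: not real embeddings, not from the paper).
target = [1.0, 0.0, 1.0]
candidates = {
    "synonym_candidate": [1.0, 0.0, 0.9],
    "antonym_candidate": [0.9, 0.2, 1.0],
    "unrelated": [0.0, 1.0, 0.0],
}

ranked = rank_by_similarity(target, candidates)
for name, score in ranked:
    # The score says how strongly related the pair is, not which relation holds.
    print(f"{name}: {score:.3f}")
```

In this toy setup the synonym and antonym candidates both receive high scores, so a similarity threshold alone could not separate them. That gap is what motivates a supervised classifier over relation labels.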
