The International Arab Journal of Information Technology (IAJIT)


Research on the Similarity between Nodes with Hypernymy/Hyponymy Relations based on IC and Taxonomical Structure

Similarity measures play an important role in natural language processing tasks such as information retrieval, machine translation and named entity recognition. Hypernymy/hyponymy relations are widespread in semantic webs and knowledge graphs, so computing the similarity between nodes linked by a hypernymy/hyponymy relation is a key issue in text processing. Existing feature-based and IC-based measures both have clear deficiencies. The feature-based method estimates similarity from the depth of the node, while the IC-based method computes it from the position of the deepest common parent. Because each method relies on a single parameter, its results are somewhat inaccurate and unstable. To address this deficiency, this paper proposes a hybrid method that computes the similarity of hypernymy/hyponymy with a hybrid parameter, dhype(lch), which incorporates two parameters: the depth of the node and the position of the deepest common parent. Compared with several existing similarity methods, the proposed method achieves better performance in terms of accuracy rate, Pearson correlation coefficient and artificial fitting effect.
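To make the two ingredients named above concrete, the following minimal sketch computes node depth and the deepest common parent on a small hand-made taxonomy. The taxonomy, the function names and the choice of root depth 1 are all assumptions for illustration. The abstract does not define dhype(lch), so the sketch does not attempt to reproduce it; instead it combines the two quantities with the well-known Wu and Palmer measure as a stand-in.

```python
# Hypothetical toy taxonomy: child -> parent (hypernym). Not data from the paper.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1 (an assumed convention)."""
    return len(path_to_root(node))


def deepest_common_parent(a, b):
    """First ancestor of `b` (walking upward) that is also an ancestor of `a`."""
    ancestors_of_a = set(path_to_root(a))
    for candidate in path_to_root(b):
        if candidate in ancestors_of_a:
            return candidate
    raise ValueError("nodes do not share a root")


def wu_palmer(a, b):
    """Wu and Palmer similarity, used here only as a stand-in measure."""
    common = deepest_common_parent(a, b)
    return 2 * depth(common) / (depth(a) + depth(b))


if __name__ == "__main__":
    # For a hypernym/hyponym pair, the deepest common parent is the hypernym itself.
    print(deepest_common_parent("dog", "mammal"))
    print(wu_palmer("dog", "mammal"))
    print(wu_palmer("dog", "cat"))
```

Note that for a direct hypernymy/hyponymy pair such as ("dog", "mammal") the deepest common parent coincides with the hypernym, which is the situation the paper targets.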
