The International Arab Journal of Information Technology (IAJIT)



A New Hybrid Improved Method for Measuring Concept Semantic Similarity in WordNet

Computing semantic similarity between concepts is an important issue in natural language processing, artificial intelligence, information retrieval and knowledge management. Measuring concept similarity is fundamental to semantic computation. In this paper, we analyze typical semantic similarity measures and note that Wu and Palmer’s measure does not distinguish between the similarities of a node to different nodes on the same level. We then combine the advantages of path-based and Information Content (IC)-based measures and propose a new hybrid method for measuring semantic similarity. Tests on a fragment of the WordNet hierarchical tree demonstrate that the proposed method accurately distinguishes between the similarities of a node to different nodes on the same level and overcomes this shortcoming of Wu and Palmer’s measure.
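The limitation noted above can be illustrated with a minimal sketch. It assumes the commonly cited form of Wu and Palmer’s measure, sim(c1, c2) = 2·depth(lcs) / (depth(c1) + depth(c2)), with depth counted in nodes so that the root has depth 1. The toy taxonomy, the `PARENT` table and all function names are hypothetical and are not the WordNet fragment or the hybrid method of the paper.

```python
# Toy is-a taxonomy: child -> parent. Hypothetical data for illustration only.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
    "flower": "plant",
    "oak": "tree",   # "tree" has hyponyms, "flower" has none
    "pine": "tree",
}


def ancestors(node):
    """Return the path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumed convention)."""
    return len(ancestors(node))


def lcs(a, b):
    """Least common subsumer: the deepest node that subsumes both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):  # walks upward from b, so the first hit is the deepest
        if node in seen:
            return node
    raise ValueError("concepts share no common ancestor")


def wu_palmer(a, b):
    """Wu and Palmer similarity in its commonly cited depth-based form."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))


if __name__ == "__main__":
    # "tree" and "flower" sit on the same level and share the same LCS with "dog".
    print(wu_palmer("dog", "tree"))
    print(wu_palmer("dog", "flower"))
    print(wu_palmer("dog", "cat"))
```

In this sketch "tree" and "flower" receive identical scores with respect to "dog", even though "tree" has hyponyms and "flower" does not, because the formula depends only on depths. Telling such same-level nodes apart requires an additional signal, for example one based on information content.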


[1] Adhikari A., Dutta B., Dutta A., Mondal D., and Singh S., “An Intrinsic Information Content-Based Semantic Similarity Measure Considering the Disjoint Common Subsumers of Concepts of an Ontology,” Journal of the Association for Information Science and Technology, vol. 69, no. 8, pp. 1023-1034, 2018.

[2] Aouicha M., Taieb M., and Ben Hamadou A., “Taxonomy-Based Information Content and Wordnet-Wiktionary-Wikipedia Glosses for Semantic Relatedness,” Applied Intelligence, vol. 45, no. 2, pp. 475-511, 2016.

[3] Bailey R., “Frequency Analysis of English Usage: Lexicon and Grammar, and: Word Frequencies in British and American English,” Dictionaries: Journal of the Dictionary Society of North America, vol. 5, no. 1, pp. 128-134, 1983.

The International Arab Journal of Information Technology, Vol. 17, No. 4, July 2020

[4] Cai Y., Zhang Q., Lu W., and Che X., “A Hybrid Approach for Measuring Semantic Similarity Based on IC-Weighted Path Distance in Wordnet,” Journal of Intelligent Information Systems, vol. 51, no. 1, pp. 23-47, 2018.

[5] Devitt A. and Vogel C., “The Topology of Wordnet: Some Metrics,” in Proceedings of GWC-04, 2nd Global WordNet Conference, Brno, pp. 106-111, 2004.

[6] Jiang J. and Conrath W., “Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy,” in Proceedings of International Conference on Research in Computational Linguistics, Taiwan, pp. 19-33, 1997.

[7] Karra W. and Slimani T., “A New Approach for Arabic Named Entity Recognition,” The International Arab Journal of Information Technology, vol. 14, no. 3, pp. 332-338, 2017.

[8] Leacock C. and Chodorow M., “C-Rater: Automated Scoring of Short-Answer Questions,” Computers and the Humanities, vol. 37, no. 4, pp. 389-405, 2003.

[9] Lin D., “An Information-Theoretic Definition of Similarity,” in Proceedings of 15th International Conference on Machine Learning, Morgan Kaufmann, pp. 296-304, 1998.

[10] Lofi C., “Measuring Semantic Similarity and Relatedness with Distributional and Knowledge-Based Approaches,” Information and Media Technologies, vol. 10, no. 3, pp. 493-501, 2015.

[11] Lu W., Cai Y., Che X., and Lu Y., “Joint Semantic Similarity Assessment with Raw Corpus and Structured Ontology for Semantic-Oriented Service Discovery,” Personal and Ubiquitous Computing, vol. 20, no. 3, pp. 311-323, 2016.

[12] Miller G. and Charles W., “Contextual Correlates of Semantic Similarity,” Language Cognition and Neuroscience, vol. 6, no. 1, pp. 1-28, 1991.

[13] Pirró G., “A Semantic Similarity Metric Combining Features and Intrinsic Information Content,” Data and Knowledge Engineering, vol. 68, no. 11, pp. 1289-1308, 2009.

[14] Pirró G. and Euzenat J., “A Feature and Information Theoretic Framework for Semantic Similarity and Relatedness,” in Proceedings of International Semantic Web Conference, Berlin, pp. 615-630, 2010.

[15] Rada R., Mili H., Bicknell E., and Blettner M., “Development and Application of a Metric on Semantic Nets,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 19, no. 1, pp. 17-30, 1989.

[16] Resnik P., “Using Information Content to Evaluate Semantic Similarity in a Taxonomy,” in Proceedings of International Joint Conference on Artificial Intelligence, San Francisco, pp. 448-453, 1995.

[17] Rubenstein H. and Goodenough J., “Contextual Correlates of Synonymy,” Communications of the ACM, vol. 8, no. 10, pp. 627-633, 1965.

[18] Sánchez D. and Batet M., “Semantic Similarity Estimation in the Biomedical Domain: An Ontology-Based Information-Theoretic Perspective,” Journal of Biomedical Informatics, vol. 44, no. 5, pp. 749-759, 2011.

[19] Sánchez D., Ribalta A., Batet M., and Serratosa F., “Enabling Semantic Similarity Estimation Across Multiple Ontologies: An Evaluation in the Biomedical Domain,” Journal of Biomedical Informatics, vol. 45, no. 1, pp. 141-155, 2012.

[20] Seco N., Veale T., and Hayes J., “An Intrinsic Information Content Metric for Semantic Similarity in Wordnet,” in Proceedings of European Conference on Artificial Intelligence, ECAI, Including Prestigious Applications of Intelligent Systems, Spain, pp. 1089-1090, 2004.

[21] Taieb M., Ben Aouicha M., and Ben Hamadou A., “Ontology-Based Approach for Measuring Semantic Similarity,” Engineering Applications of Artificial Intelligence, vol. 36, pp. 238-261, 2014.

[22] Taieb M., Ben Aouicha M., and Ben Hamadou A., “Computing Semantic Relatedness Using Wikipedia Features,” Knowledge-Based Systems, vol. 50, no. 50, pp. 260-278, 2013.

[23] Taieb M., Ben Aouicha M., and Ben Hamadou A., “A New Semantic Relatedness Measurement Using Wordnet Features,” Knowledge and Information Systems, vol. 41, no. 2, pp. 467-497, 2014.

[24] Varelas G., Voutsakis E., Raftopoulou P., and Petrakis E., “Semantic Similarity Methods in Wordnet and Their Application to Information Retrieval on the Web,” in Proceedings of ACM International Workshop on Web Information and Data Management, Germany, pp. 10-16, 2005.

[25] Wu Z. and Palmer M., “Verb Semantics and Lexical Selection,” in Proceedings of Annual Meeting on Association for Computational Linguistics, United States, pp. 133-138, 1994.

[26] Zesch T., “Study of Semantic Relatedness of Words Using Collaboratively Constructed Semantic Resources,” Thesis, Technische Universität, 2010.

[27] Zhang Y., Shang L., Huang L., Porter A., Zhang G., Lu J., and Zhu D., “A Hybrid Similarity Measure Method for Patent Portfolio Analysis,” Journal of Informetrics, vol. 10, no. 4, pp. 1108-1130, 2016.

[28] Zhou Z., Wang Y., and Gu J., “A New Model of Information Content for Semantic Similarity in Wordnet,” in Proceedings of International Conference on Future Generation Communication and Networking Symposia, China, pp. 85-89, 2008.

Xiaogang Zhang, PhD student, College of Computer Science and Technology, Zhejiang University. Major research: Data Mining, Semantic Computation.

Shouqian Sun, College of Computer Science and Technology, Zhejiang University. Major research: Application of ergonomics and design, creative design services, intelligent sports equipment technology.

Kejun Zhang, College of Computer Science and Technology, Zhejiang University. Major research: Data Mining, applying Machine Learning, Evolutionary Algorithms, and Computational Linguistics techniques to the extraction of knowledge from music.