The International Arab Journal of Information Technology (IAJIT)


Interactive Query Expansion using Concept-Based

Despite advances in information retrieval, search engines still return imprecise or poor results, mainly because of the quality of the submitted query. Formulating a query that expresses their information need has always been challenging for users. In this paper, we propose an interactive query expansion methodology using a Concept-Based Directions Finder (CBDF). For a given query, the approach uses Explicit Semantic Analysis (ESA) to determine the directions in which the user can continue the search. Based on the content and link structure of Wikipedia, the CBDF identifies the relevant terms, with a corresponding label, for each direction found. The relevant terms and their labels are suggested to the user for query expansion through a newly proposed visual interface. The visual interface, named terms mapper, accepts the query and displays the potential directions, along with a group of relevant terms and the label for the direction chosen by the user. We evaluated the results of the proposed approach and the visual interface for the identified queries. The experimental results show that the approach produces a good Mean Average Precision (MAP) for the chosen queries.
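The abstract reports Mean Average Precision (MAP) without defining it. As a reminder, the standard definition can be sketched as follows: average precision is computed per query over the ranks at which relevant documents appear, and MAP is the mean over all queries. The function names and the toy document IDs below are illustrative assumptions and are not taken from the paper; the sketch also assumes each ranked list contains no duplicate document IDs.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against a set of relevant IDs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document occurs.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query average precision over all (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical toy data: two queries with hand-picked relevance judgments.
toy_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = (1/2) / 1
]
toy_map = mean_average_precision(toy_runs)
```

Dividing by the number of relevant documents, rather than by the number of relevant documents retrieved, penalizes rankings that miss relevant documents entirely; this is the convention commonly used in TREC-style evaluation.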

The International Arab Journal of Information Technology, Vol. 10, No. 6, November 2013

Yuvarani Meiyappan is working as Lead in Education and Research at Infosys Limited, India. Currently, she is pursuing her PhD at VIT University. Her research interests include information retrieval, machine learning, and semantics.

Sriman Narayana Iyengar obtained his MSc, ME, and PhD degrees. Currently, he is director of the Periyar EVR Central Library and senior professor at the School of Computing Science and Engineering at Vellore, India. His research interests include agent-based distributed computing, security aspects of all networks including VoIP, intelligent information retrieval, computational methods, bioinformatics, and fluid mechanics. He has authored and co-authored several books and has nearly 120 research publications in reputed peer-reviewed international journals. He has served as PCM/Reviewer for many international and IEEE conferences. He is a chief editor for IJSEA of AIRCC and guest editor for the special issue on cloud computing and services of the Int'l J. of Communications, Network and System Sciences. He is also an editorial board member for many reputed international journals such as IJCA, IJCTE, IJSE, IJEMTA, JCMS, and many more.