The International Arab Journal of Information Technology (IAJIT), Vol. 15, No. 3A, Special Issue 2018


Toward a New Arabic Question Answering System

Imane Lahbari, Said El Alaoui, and Khalid Zidani

Question Answering Systems (QAS) aim to return precise answers to users' questions written in natural language. In this paper, we describe the question processing and document retrieval components of our Arabic QAS. First, we present an Arabic question classification method based on an SVM classifier and Li and Roth's [24] taxonomy. Then, we describe our technique for transforming an Arabic question into a query suitable for retrieving information from the Arabic Wikipedia. We use hybrid Arabic Part-of-Speech (POS) tagging and Arabic WordNet (AWN) for query expansion. We conducted several experiments using Text Retrieval Conference (TREC) and Cross-Language Evaluation Forum (CLEF) datasets. The results show that the proposed method is more effective than the existing methods it was compared with.
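The question-to-query step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that verbs are dropped, that nouns are expanded with synonyms, and that tokens the tagger could not label are kept unexpanded. The function `build_query`, the tag names, the transliterated tokens, and the synonym table are all hypothetical stand-ins for the hybrid POS tagger and for AWN.

```python
from typing import Dict, List, Tuple

# Assumption: illustrative tag names, not the tagset of a real Arabic tagger.
DROP_TAGS = {"verb"}    # tokens with these tags are removed from the query
EXPAND_TAGS = {"noun"}  # tokens with these tags are expanded with synonyms


def build_query(
    tagged_tokens: List[Tuple[str, str]],
    synonyms: Dict[str, List[str]],
    question_class: str,
) -> Dict[str, object]:
    """Turn a POS-tagged question into a class-labelled list of query terms."""
    terms: List[str] = []

    def add(term: str) -> None:
        # Preserve first-seen order and avoid duplicate terms.
        if term not in terms:
            terms.append(term)

    for token, tag in tagged_tokens:
        if tag in DROP_TAGS:
            continue
        add(token)
        if tag in EXPAND_TAGS:
            for synonym in synonyms.get(token, []):
                add(synonym)
    return {"class": question_class, "terms": terms}


# Toy data: transliterated placeholders, not real tagger or AWN output.
tagged = [
    ("asima", "noun"),
    ("balad", "noun"),
    ("yaqa", "verb"),
    ("fas", "non_detected"),
]
toy_synonyms = {"asima": ["hadira"], "balad": ["dawla", "qutr"]}

query = build_query(tagged, toy_synonyms, "country")
print(query)
```

In a full system the resulting terms and the predicted question class would be passed to the document retrieval component; that step is omitted here.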

[1] Ababou N. and Mazroui A., A Hybrid Arabic POS Tagging for Simple and Compound Morphosyntactic Tags, International Journal of Speech Technology, vol. 19, no. 2, pp. 289-302, 2016.

[2] Abderrahim M., Abderrahim M., and Chikh M., Using Arabic Wordnet for Query Expansion in Information Retrieval System, in Proceedings of the 3rd International Conference on Web and Information Technologies IEEE, Marrakech, 2010.

[3] Abouenour L., Bouzoubaa K., and Rosso P., On the Evaluation and Improvement of Arabic Wordnet Coverage and Usability, Language Resources and Evaluation Journal, vol. 47, no. 3, pp. 891-917, 2013.

[4] Adam L., Desktop Search Engine Rankings, Technical Report, 2016.

[5] Akour M., Abu Fardeh S., Magel K., and Al-Radaideh Q., QArabPro: A Rule Based Question Answering System for Reading Comprehension Tests in Arabic, The American Journal of Applied Sciences, vol. 8, no. 6, pp. 652-661, 2011.

[6] Al-Shalabi R., Kanaan G., Yaseen M., Al-Sarayreh B., and Al-Naji N., Arabic Query Expansion using Interactive Word Sense Disambiguation, in Proceedings of the 2nd International Conference in Arabic Language Resources and Tools, Cairo, 2009.

[7] Bakari W., Bellot P., and Neji M., A Logical Representation of Arabic Questions Toward Automatic Passage Extraction from the Web, International Journal of Speech Technology, vol. 20, no. 2, pp. 339-353, 2017.

[8] Benajiba Y., Zitouni I., Diab M., and Rosso P., Arabic Named Entity Recognition: Using Features Extracted from Noisy Data, in Proceedings of the Association for Computational Linguistics Conference Short Papers, Uppsala, pp. 281-285, 2010.

[Figure: question processing example (Class: country). Stages shown: Question → Classification → POS tag (noun, noun, verb (will be deleted), non detected) → AWN expansion → Query → Search engine (Google API) → Wiki Documents]

[9] Black W., Elkateb S., Rodriguez H., Alkhalifa M., Vossen P., Pease A., Bertran M., and Fellbaum C., The Arabic WordNet Project, in Proceedings of the International Conference on Language Resources and Evaluation, Genoa, 2006.

[10] Boudchiche M., Mazroui M., Bebah M., and Lakhouaja A., L'analyseur Morphosyntaxique Alkhalil Morpho Sys 2, 1ère Journée Doctorale Nationale sur l'Ingénierie de la Langue Arabe, Rabat, 2014.

[11] Chang C. and Lin C., A Library for Support Vector Machines, ACM Transactions on Intelligent Systems and Technology Journal, vol. 2, no. 3, pp. 1-27, 2011.

[12] Darwish K., Abdelali A., and Mubarak H., Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging, in Proceedings of the International Conference on Language Resources and Evaluation, Iceland, 2014.

[13] El-Halees A., Arabic Text Classification using Maximum Entropy, the Islamic University Journal-Series of Natural Studies and Engineering, vol. 15, no. 1, pp. 157-167, 2007.

[14] El-Kourdi M., Bensaid A., and Rachidi T., Automatic Arabic Document Categorization Based on the Naïve Bayes Algorithm, in Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, Geneva, pp. 51-58, 2004.

[15] Elsebai A., Meziane F., and Belkredim F., A Rule Based Persons Names Arabic Extraction System, in Proceedings of the 11th International Business Information Management Association Conference, Cairo, 2009.

[16] Frank E. and Bouckaert R., Naive Bayes for Text Classification with Unbalanced Classes, in Proceedings of the 10th European Conference of Knowledge Discovery in Databases, Berlin, pp. 503-510, 2006.

[17] Hadni M., El Alaoui S., and Lachkar A., Word Sense Disambiguation for Arabic Text Categorization, The International Arab Journal of Information Technology, vol. 13, no. 1A, pp. 215-222, 2016.

[18] Hammo B., Abu-Salem H., and Lytinen S., QARAB: A Question Answering System to Support the Arabic Language, in Proceedings of the Workshop on Computational Approaches to Semitic Languages, Philadelphia, pp. 1-11, 2002.

[19] Harrag F., El-Qawasmeh E., and Pichappan P., Improving Arabic Text Categorization Using Decision Trees, in Proceedings of the First International Conference on Networked Digital Technologies, Ostrava, pp. 110-115, 2009.

[20] Weisscher A., arabicwn, Last Visited, 2014.

[21] Kruchten P., The Rational Unified Process Model: An Introduction, Addison-Wesley Professional, 2004.

[22] Lahbari I., Ouatik S. E., and Zidani K. A., A Rule-based Method for Arabic Question Classification, in Proceedings of the International Conference on Wireless Networks and Mobile Communications, Rabat, pp. 1-6, 2017.

[23] Li X., Xuan-Jing H., and Li-De W., Question Classification using Multiple Classifiers, in Proceedings of the 5th Workshop on Asian Language Resources (ALR-05) and First Symposium on Asian Language Resources Network, Jeju Island, 2005.

[24] Li X. and Roth D., Learning Question Classifiers: The Role of Semantic Information, Natural Language Engineering, vol. 12, no. 3, pp. 229-249, 2002.

[25] Mesleh A., Support Vector Machines based Arabic Language Text Classification System: Feature Selection Comparative Study, in Proceedings of the 12th WSEAS International Conference on Applied Mathematics, Cairo, pp. 228-233, 2007.

[26] Sawaf H., Zaplo J., and Ney H., Statistical Classification Methods for Arabic News Articles, in Proceedings of the Workshop on Arabic Natural Language Processing, Toulouse, 2001.

[27] Shaalan K., A Survey of Arabic Named Entity Recognition and Classification, Computational Linguistics, vol. 40, no. 2, pp. 469-510, 2014.

[28] Srinivasan R., Multiclass Text Classification A Decision Tree based SVM Approach, Technical Report, 2006.

[29] Toutanova K., Klein D., Manning C., and Singer Y., Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, in Proceedings of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Edmonton, pp. 173-180, 2003.

[30] Trigui O., Belguith L., and Rosso P., DefArabicQA: Arabic Definition Question Answering System, in Proceedings of the 7th Workshop on Language Resources and Human Language Technologies for Semitic Languages, Valletta, pp. 40-45, 2010.

Imane Lahbari is a PhD student in the Laboratory of Informatics and Modeling, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco. She received a master's degree in Information Systems, Network and Multimedia in 2015. Her research interests include Natural Language Processing and Question Answering Systems.

Said El Alaoui has been a Professor since 1997 in the Department of Computer Science, Faculty of Sciences Dhar El Mahraz (FSDM), Sidi Mohamed Ben Abdellah University (USMBA), Fez, Morocco. His current research interests include Natural Language Processing, Information Retrieval, Biomedical Question Answering, Biomedical Information Extraction, Arabic Document Clustering and Categorization, High-dimensional Indexing, and Content-Based Image Retrieval.

Khalid Zidani received his PhD degree in Computer Science from the Faculty of Sciences, Nantes, France, in 1994. His current research interests include Natural Language Processing, Arabic Text Mining, Information Retrieval, Document Clustering and Categorization, Content-Based Image Retrieval, and Large Image Databases Indexing.