The International Arab Journal of Information Technology (IAJIT)


A Machine Learning System for Distinguishing Nominal and Verbal Arabic Sentences

The complexity of the Arabic language stems from its rich morphology and from structures that differ from, and are often more difficult than, those of other languages. Dealing with this complexity therefore requires an understanding of the language's particular characteristics and structure. This paper presents a new inductive learning system that distinguishes nominal from verbal sentences in Modern Standard Arabic (MSA). Combining inductive learning with natural language processing is a new, interdisciplinary field, particularly for the Arabic language. A series of experiments was conducted on 376 well-annotated (i.e., gold-standard) Arabic sentences of 2 to 11 words, ranging from simple to complex MSA sentences. The results show that the proposed system distinguishes nominal and verbal sentences with an accuracy of around 90% when 15% of the sentences are unseen, and around 80% when 75% of the sentences are unseen.
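The two evaluation settings reported in the abstract can be illustrated with a minimal hold-out sketch. It assumes that "unseen" means sentences withheld from training, that the split is a seeded random shuffle, and that set sizes are rounded to the nearest sentence; the abstract does not state how the split was made. The helper names `holdout_split` and `accuracy` are hypothetical, and integer ids stand in for the annotated sentences.

```python
import random


def holdout_split(items, unseen_fraction, seed=0):
    """Shuffle a copy of items and hold out the given fraction as unseen data.

    Assumption: a seeded random split with the unseen size rounded to the
    nearest integer. Returns (training_items, unseen_items).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_unseen = round(len(shuffled) * unseen_fraction)
    return shuffled[n_unseen:], shuffled[:n_unseen]


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# 376 placeholder ids stand in for the annotated sentences (hypothetical data).
corpus = list(range(376))
train_15, unseen_15 = holdout_split(corpus, 0.15)  # 15% of sentences unseen
train_75, unseen_75 = holdout_split(corpus, 0.75)  # 75% of sentences unseen
```

Under these assumptions, the 15% setting trains on 320 sentences and tests on 56, while the 75% setting trains on only 94 sentences and tests on 282, which is consistent with the lower accuracy reported for that setting.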

[1] Abu-Soud S., A Disjunctive Learning Algorithm for Extracting General Rules, Journal of Institute of Mathematics and Computer Science (Computer Science Series), vol. 10, no. 2, pp. 201-217, 1999.

[2] Abu-Soud S. and Haj Hassan M., A Parallel Inductive Learning Algorithm, AMSE journal, France, 2000.

[3] Awajan A., Keyword Extraction from Arabic Documents using Term Equivalence Classes, ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 14, no. 2, 2015.

[4] Daelemans W., Weijters T., and Van den Bosch A., Empirical Learning of Natural Language Processing Tasks, Machine Learning: Proceedings of ECML-97, Springer.

[5] Ditters E., A Formal Grammar for the Description of Sentence Structure in Modern Standard Arabic, in Proceedings of Arabic Language Processing: Status and Prospects, Nijmegen, 2001.

[6] Forsyth R., Machine Learning Principles and Techniques, Chapman and Hall, 1989.

[7] Habash N., Gabbard R., Rambow O., Marcus M., and Kulick S., Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features, in Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, pp. 1084-1092, 2007.

[8] Hancox P., Mills W., and Reid B., Keyguide to Information Sources in Artificial Intelligence/Expert Systems, Cambridge University Press, 1990.

[9] Hammadi O. and Aziz M., Grammatical Relation Extraction in Arabic Language, Journal of Computer Science, vol. 8, no. 6, pp. 891-989, 2012.

[10] Maamouri M. and Bies A., Developing an Arabic Treebank: Methods, Guidelines, Procedures, and Tools, in Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, Geneva, pp. 2-9, 2004.

[11] Magerman D., Statistical Decision Tree Models for Parsing, in Proceedings of the 33rd Annual Meeting on Association for Computational Linguistics, Cambridge, pp. 267-283, 1995.

The International Arab Journal of Information Technology, Vol. 15, No. 3A, Special Issue 2018

[12] Michalski R., A Theory and Methodology of Inductive Learning, Artificial Intelligence, Elsevier, 1983.

[13] Mohamed E. and Kübler S., Arabic Part of Speech Tagging, Natural Language Engineering, vol. 18, no. 4, pp. 521-548, 2011.

[14] Othman E., Shaalan K., and Rafea A., Towards Resolving Ambiguity in Understanding Arabic Sentence, in Proceedings of International Conference on Arabic Language Resources and Tools, 2004.

[15] Quinlan J., Induction, Knowledge and Expert Systems, in Artificial Intelligence Developments and Applications, Elsevier Science Publishers, 1988.

[16] Tolun M. and Abu-Soud S., ILA: An Inductive Learning Algorithm for Rule Extraction, Expert Systems with Applications, vol. 14, no. 3, pp. 361-370, 1998.

[17] Tolun M., Uludag M., Hayri S., and Abu-Soud S., ILA-2: An Inductive Learning Algorithm for Knowledge Discovery, Cybernetics and Systems: an International Journal, vol. 30, no. 7, pp. 609-628, 1999.

Duaa Abdelrazaq has a master's degree in Computer Science from Princess Sumaya University for Technology (PSUT). Her research interests are in the areas of artificial intelligence, machine learning, data mining, and natural language processing. She worked as a teacher at the Ministry of Education of Jordan from 2005 to 2016 and now works at the United Arab Emirates Ministry of Education as a CDI teacher.

Saleh Abu-Soud is an associate professor at the Department of Software Engineering at Princess Sumaya University for Technology (PSUT). He received his PhD in Computer Science in 1992 (METU), his MSc in Computer Science in 1988 (METU), and his BSc in Computer Science in 1985 (Yarmouk University). He worked at Jordan University from 1992 to 1995 and then joined PSUT, where he served as the head of the Department of Computer Science from 2005 to 2007. He worked at NYIT for four years, from 2007 to 2011, where he was a professor of Computer Science and, from 2007 to 2010, the director of the accreditation and quality assurance department. His research interest is in the area of artificial intelligence. He is the owner of the ILA inductive learning algorithm. His main research topics include machine learning, biometric keystroke dynamics, and speech synthesis with inductive learning. He has published many research papers and two books. He has supervised dozens of master's students and many PhD students; more details can be seen on ( AAJ&hl=en) and https://www.researchgate.net/profile/Saleh_Abu-Soud. He is a member of many international projects.

Arafat Awajan is a full professor at Princess Sumaya University for Technology (PSUT). He received his PhD degree in computer science from the University of Franche-Comte, France, in 1987. He has held various academic positions at the Royal Scientific Society and Princess Sumaya University for Technology. He is currently the vice president of Princess Sumaya University for Technology and the head of the Human Resources Committee. He was appointed chair of the Computer Science Department (2000-2003) and chair of the Computer Graphics and Animation Department (2005-2006) at PSUT. He was the dean of the King Hussein School for Information Technology from 2004 to 2007, the Dean of Student Affairs from 2011 to 2014, the director of the Information Technology Center in the Royal Scientific Society from 2008 to 2010, and the dean of the King Hussein School for Computing Sciences from 2014 to 2017.