The International Arab Journal of Information Technology (IAJIT)


A New Approach for Arabic Named Entity Recognition

Named Entity Recognition (NER) plays a noteworthy role in Natural Language Processing (NLP) research, since it enables the detection of proper nouns in unstructured texts. NER makes searching, retrieving, and extracting information easier, because the significant information in texts is usually located around proper names. This paper proposes an efficient approach that can identify Named Entities (NE) in Arabic texts without the need for morphological or syntactic analysis or gazetteers. The goal of our approach is to provide a general framework for Arabic NE recognition. Within this framework, the system learns to recognize NE automatically and induces NE systematically, starting from sample NE instances as seeds. The method takes advantage of the web: the approach learns from a web corpus. The seeds are used to identify the contexts on the web that denote NE, and these contexts are then used to identify new NE. In a thorough experimental evaluation of our approach, the performance, measured by recall, precision, and F-measure, is promising. We obtained an overall F-measure of 83%.
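The seed-to-context-to-entity loop described above can be illustrated with a minimal sketch. This is not the system evaluated in this paper: the function names (`extract_contexts`, `apply_contexts`, `bootstrap`), the definition of a context as the pair of neighbouring words, and the toy English corpus are all assumptions made for illustration only.

```python
from collections import Counter

def extract_contexts(sentences, entities, min_count=1):
    """Collect (left word, right word) pairs that surround known entities.

    Assumption: a "context" is just the two neighbouring tokens.
    """
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for i in range(1, len(padded) - 1):
            if padded[i] in entities:
                counts[(padded[i - 1], padded[i + 1])] += 1
    return {ctx for ctx, n in counts.items() if n >= min_count}

def apply_contexts(sentences, contexts):
    """Return every token that occurs inside one of the learned contexts."""
    found = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for i in range(1, len(padded) - 1):
            if (padded[i - 1], padded[i + 1]) in contexts:
                found.add(padded[i])
    return found

def bootstrap(sentences, seeds, rounds=2):
    """Alternate between learning contexts and harvesting new entities."""
    entities = set(seeds)
    for _ in range(rounds):
        contexts = extract_contexts(sentences, entities)
        new_entities = apply_contexts(sentences, contexts) - entities
        if not new_entities:
            break  # nothing new was found, so stop early
        entities |= new_entities
    return entities

# Toy corpus (hypothetical); a real run would use sentences from a web corpus.
corpus = [
    "the city of Tunis is large".split(),
    "the city of Cairo is old".split(),
    "the city of Riyadh is modern".split(),
    "the price of oil was high".split(),
]
result = bootstrap(corpus, seeds={"Tunis"})
print(sorted(result))
```

In this sketch the single seed yields the context ("of", "is"), which in turn matches the other two city names. Such a short context would also match non-entities in a larger corpus, so a practical system would need to score or filter contexts before trusting them.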

[3] Al-Jumaily H., Martínez P., Martínez-Fernández J., and Goot E., A Real Time Named Entity Recognition System for Arabic Text Mining, Journal of Language Resources and Evaluation, vol. 46, no. 4, pp. 543-563, 2012.

$$\text{Recall} = \frac{\text{number of correct NE recognized by the system}}{\text{number of correct NE in the corpus}} \tag{11}$$

$$\text{Precision} = \frac{\text{number of correct NE recognized by the system}}{\text{number of NE given by the system}} \tag{12}$$

$$F\text{-measure} = \frac{2 \times (\text{recall} \times \text{precision})}{\text{recall} + \text{precision}} \tag{13}$$
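As a worked illustration of Equations (11) to (13), the following minimal sketch computes the three measures from a set of predicted entities and a set of gold-standard entities. The function name `evaluate` and the toy data are assumptions for illustration; they are not taken from the experiments reported in this paper.

```python
def evaluate(predicted, gold):
    """Return (recall, precision, f_measure) for two collections of entities."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # correct NE recognized by the system
    recall = correct / len(gold) if gold else 0.0
    precision = correct / len(predicted) if predicted else 0.0
    denominator = recall + precision
    f_measure = 2 * recall * precision / denominator if denominator else 0.0
    return recall, precision, f_measure

# Hypothetical example: 4 entities in the corpus, 3 returned, 2 of them correct.
gold = {"Tunis", "Cairo", "Riyadh", "Taif"}
predicted = {"Tunis", "Cairo", "oil"}
recall, precision, f_measure = evaluate(predicted, gold)
print(f"recall={recall:.3f} precision={precision:.3f} f_measure={f_measure:.3f}")
```

Here recall is 2/4, precision is 2/3, and the F-measure is their harmonic mean, 4/7.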

[25] Zribi I., Hammami S., and Belguith L., L'apport d'une Approche Hybride Pour la Reconnaissance des Entités Nommées en Langue Arabe, in Proceedings of the International Conference: Traitement Automatique des Langues Naturelles, Montréal, pp. 1-6, 2010.

The International Arab Journal of Information Technology, Vol. 14, No. 3, May 2017

Wahiba Karaa is currently an Assistant Professor in the Department of Computer Science at Taif University, Saudi Arabia. She received her Master's degree from Paris III, New Sorbonne, France, and her PhD from Paris 7 Jussieu, France. Her research interests include natural language processing, document annotation, information retrieval, text mining, data mining, and image mining. She is a member of the editorial board of several international journals and Editor-in-Chief of the International Journal of Image Mining (Inderscience Publishers).

Thabet Slimani received a PhD in Computer Science from the University of Tunisia. He is currently an Assistant Professor in the Computer Science Department at Taif University, Saudi Arabia, and a member of the LARODEC laboratory (University of Tunisia). His research interests are mainly related to the Semantic Web, data mining, text mining, business intelligence, knowledge management, and web services. He has published his research in international conferences and peer-reviewed journals. He also serves as a journal reviewer.