The International Arab Journal of Information Technology (IAJIT)



Event Extraction from Classical Arabic Texts

Event extraction is one of the most useful and challenging Information Extraction (IE) tasks and can be used in many natural language processing applications, in particular semantic search systems. Most systems developed in this field extract events from English texts; research in this area is therefore needed for many other languages, Arabic in particular. In this paper, we develop a system for extracting person-related events and their participants from classical Arabic texts with complex linguistic structure. The first and most important step in extracting events is correctly identifying event mentions and determining which sentences describe events. Implementing various methods and comparing their performance can help researchers choose an appropriate event extraction method based on their conditions and limitations. In this research, we implemented three methods to automatically classify sentences as on-event or off-event: a knowledge-oriented method (based on a set of keywords and rules), a data-oriented method (based on Support Vector Machine (SVM)), and a semantic-oriented method (based on lexical chains). The results indicate that the knowledge-oriented and machine learning methods achieve high precision and recall in the event extraction process. The semantic-oriented method, with acceptable precision, minimizes the linguistic knowledge requirements of the knowledge-oriented method and the preprocessing requirements of the data-oriented method, and also improves automatic event extraction from raw text. The next step is developing a modular rule-based approach for extracting event arguments, such as time, place, and other participants involved, in independent subtasks.
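To make the idea of keyword-based sentence classification concrete, the following is a minimal Python sketch of labeling sentences as on-event or off-event. It is an illustration under stated assumptions, not the system described in this paper: the trigger list `EVENT_TRIGGERS`, the function names, and the example sentences are all hypothetical, and tokenization is a naive split with no morphological analysis, which a real classical Arabic pipeline would need.

```python
import re

# Hypothetical trigger keywords (illustrative only, not the paper's keyword set):
# "ولد" (was born), "توفي" (died), "سافر" (travelled).
EVENT_TRIGGERS = {"ولد", "توفي", "سافر"}

# Arabic diacritics (tashkeel), U+064B..U+0652, removed so vocalized text still matches.
DIACRITICS = re.compile(r"[\u064B-\u0652]")


def tokenize(sentence: str) -> list[str]:
    """Strip diacritics, then return the runs of word characters as tokens."""
    plain = DIACRITICS.sub("", sentence)
    return re.findall(r"\w+", plain)


def classify_sentence(sentence: str) -> str:
    """Return 'on-event' if any token is a trigger keyword, else 'off-event'."""
    tokens = tokenize(sentence)
    return "on-event" if any(tok in EVENT_TRIGGERS for tok in tokens) else "off-event"


# Hypothetical example sentences.
sentences = [
    "ولد الشيخ في بغداد",     # "The sheikh was born in Baghdad"
    "كان الشيخ عالما كبيرا",  # "The sheikh was a great scholar"
]
labels = [classify_sentence(s) for s in sentences]
```

Exact token matching of this kind misses inflected or cliticized forms (for example, a trigger verb with an attached conjunction), which is one reason a keyword-and-rule approach depends on linguistic knowledge and preprocessing.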


The International Arab Journal of Information Technology, Vol. 12, No. 5, September 2015

Razieh Baradaran received her BSc and MSc degrees from the Department of Information Technology at the University of Qom, Iran, in 2010 and 2013 respectively. Her research interests include information extraction, text mining, and intrusion detection systems.

Behrouz Minaei-Bidgoli obtained his PhD degree from Michigan State University, East Lansing, Michigan, USA, in the field of data mining and web-based educational systems, in the computer science and engineering department. He is working as an assistant professor in the Computer Engineering Department of Iran University of Science and Technology, Tehran, Iran. He also leads a Data and Text Mining research group at the Computer Research Center of Islamic Sciences, NOOR co., Qom, Iran, developing large-scale NLP and text mining projects for the Farsi and Arabic languages.