Event Extraction from Classical Arabic Texts

Event extraction is one of the most useful and challenging Information Extraction (IE) tasks and can be used in many natural language processing applications, in particular semantic search systems. Most systems developed in this field extract events from English texts; for many other languages, Arabic in particular, research in this area is still needed. In this paper, we develop a system for extracting person-related events and their participants from classical Arabic texts with complex linguistic structure. The first and most influential step in event extraction is correctly detecting event mentions and identifying the sentences that describe events. Implementing various methods and comparing their performance can help researchers choose an appropriate event extraction method for their conditions and limitations. In this research, we implemented three methods to automatically classify sentences as on-event or off-event: a knowledge-oriented method (based on a set of keywords and rules), a data-oriented method (based on a Support Vector Machine (SVM)) and a semantic-oriented method (based on lexical chains). The results indicate that the knowledge-oriented and machine learning methods achieve high precision and recall in the event extraction process. The semantic-oriented method, with acceptable precision, minimizes the linguistic knowledge requirements of the knowledge-oriented method and the preprocessing requirements of the data-oriented method, and also improves automatic event extraction from raw text. The next step is to develop a modular rule-based approach for extracting event arguments such as time, place and other participants involved, in independent subtasks.
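As an illustration of the knowledge-oriented idea (labeling a sentence as on-event when it contains a trigger keyword), a minimal Python sketch might look as follows. The trigger list, the English example text, the naive sentence splitter and the function names are all assumptions made for illustration; the actual Arabic keywords and rules of the system are not reproduced here.

```python
import re

# Hypothetical trigger keywords (assumption); the real system uses
# a set of Arabic keywords and rules that is not listed in this section.
TRIGGERS = {"born", "died", "travelled", "married", "met"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_on_event(sentence, triggers=TRIGGERS):
    """Return True if the sentence contains at least one trigger keyword."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    return bool(tokens & triggers)


def classify(text):
    """Label every sentence of the text as 'on-event' or 'off-event'."""
    return [
        (s, "on-event" if is_on_event(s) else "off-event")
        for s in split_sentences(text)
    ]


if __name__ == "__main__":
    sample = "He was born in Basra. The book is long. He died in Baghdad."
    for sentence, label in classify(sample):
        print(label, "|", sentence)
```

A pure keyword match of this kind is easy to inspect but depends entirely on the quality of the keyword list, which is the linguistic knowledge requirement that the semantic-oriented method aims to reduce.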

[25] Xu F., Uszkoreit H., and Li H., Automatic Event and Relation Detection with Seeds of Varying Complexity, in Proceedings of the AAAI Workshop Event Extraction and Synthesis, Massachusetts, USA, pp. 12-17, 2006.

Razieh Baradaran received her BSc and MSc degrees from the Department of Information Technology at the University of Qom, Iran, in 2010 and 2013, respectively. Her research interests include information extraction, text mining and intrusion detection systems.

Behrouz Minaei-Bidgoli obtained his PhD degree from Michigan State University, East Lansing, Michigan, USA, in the field of data mining and web-based educational systems, in the computer science and engineering department. He is an assistant professor in the Computer Engineering Department of Iran University of Science and Technology, Tehran, Iran. He also leads a Data and Text Mining research group at the Computer Research Center of Islamic Sciences, NOOR co., Qom, Iran, developing large-scale NLP and text mining projects for the Farsi and Arabic languages.