Building a Syntactic-Semantic Interface for a Semi-Automatically Generated TAG for Arabic

Syntactic and semantic resources play an important role in various Natural Language Processing (NLP) tasks by providing information about the correct structural representations of sentences and their meaning. To date, there is no wide-coverage electronic grammar for the Arabic language. In this context, we present a new approach for building a Tree Adjoining Grammar (TAG) that represents the syntax and semantics of Modern Standard Arabic. This grammar is produced semi-automatically with the eXtensible MetaGrammar (XMG) description language. First, the syntax of Arabic is described using the Arab-XMG meta-grammar. Then, semantic information is added by introducing a frame-based semantic dimension into the meta-grammar. This is achieved by exploiting lexical resources such as ArabicVerbNet. Finally, the link between semantics and syntax is established using a syntax-semantics interface that allows the meaning of a sentence to be constructed through semantic role labeling. Experiments were performed to check the grammar's coverage as well as the syntactic-semantic analysis. The results showed that the generated grammar can cover the basic syntactic structures of Arabic sentences and the different phrasal structures with a precision rate of about 92%. Moreover, they confirm the effectiveness of the proposed approach, as we were able to semantically parse a set of sentences and build their semantic representations with a precision rate of about 72%.

Cherifa Ben Khelil received her Master's in Software Engineering from the Higher Institute of Computer Science, Ariana, Tunisia, and she is pursuing her doctoral degree under joint supervision between the National School of Computer Sciences (ENSI), University of La Manouba in Tunisia and the University of Orléans in France. Her research interests are related to Natural Language Processing, in particular grammar generation to represent the syntax and semantics of the Arabic language.

Chiraz Ben Othmane Zribi is a professor at the National School of Computer Science, University of La Manouba, Tunisia, and a researcher at the RIADI-GDL laboratory. She received her PhD in computer science in 1998 from Paris XI University, France. Her principal research interests are in the area of Arabic language processing. Her recent work has focused on natural language parsing, detection and correction of errors, generation of dictionaries and knowledge retrieval.

Denys Duchier has been Professor of Computer Science at Université d'Orléans, France, since 2006. He received his PhD from Yale University, United States, in 1991. After postdoctoral fellowships at University of Ottawa and University of Vancouver, Canada, he moved in 1996 to Saarland University, Germany, where he worked on the design and implementation of the Oz programming language. His research interests focus on the application of constraints in computational linguistics, and on the design and implementation of programming languages.

Yannick Parmentier is an Associate Professor at Université de Lorraine, France. He received his PhD in Computer Science from Henri Poincaré University in Nancy, France, in 2007. During his PhD, he took part in the design and implementation of the XMG description language and its application to the formal description of French. In 2007-2008, he was a postdoctoral fellow at the University of Tübingen, Germany, where he worked on symbolic parsing. From 2009 to 2017, he was an Associate Professor at the University of Orléans, working on constraint-based approaches in computational linguistics.