The International Arab Journal of Information Technology (IAJIT)


Conceptual Persian Text Summarizer: A New Model in Continuous Vector Space

Traditional summarization methods are neither cost-effective nor feasible today. Extractive summarization automatically selects the most important sentences of a text to produce a short, informative summary. In this work, we propose a novel unsupervised method for summarizing Persian texts. The proposed method adopts a hybrid approach that clusters the concepts of the text using deep learning and traditional statistical methods. First, we produce a word embedding based on the Hamshahri2 corpus and a dictionary of word frequencies. Then the proposed algorithm extracts the keywords of the document, clusters its concepts, and finally ranks the sentences to produce the summary. We evaluated the proposed method on the Pasokh single-document corpus using the ROUGE evaluation measure. Without using any hand-crafted features, our proposed method achieves better results than state-of-the-art related work. Compared with the best supervised Persian methods, our unsupervised method achieves an overall improvement of 7.5% in ROUGE-2 recall.
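To make the reported metric concrete, the following is a minimal sketch of ROUGE-2 recall: the clipped count of bigrams shared by a candidate summary and a reference summary, divided by the number of bigrams in the reference. The function names `bigrams` and `rouge2_recall`, the whitespace tokenization, and the English toy sentences are assumptions made for illustration only; they are not the evaluation setup used in this work, and Persian text would need proper normalization and tokenization first.

```python
from collections import Counter


def bigrams(tokens):
    """Return the multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_recall(candidate, reference):
    """Clipped bigram overlap divided by the number of reference bigrams.

    Assumption: whitespace tokenization, a single reference summary,
    and no stemming or stop-word removal.
    """
    cand = bigrams(candidate.split())
    ref = bigrams(reference.split())
    total = sum(ref.values())
    if total == 0:
        return 0.0  # a reference with fewer than two tokens has no bigrams
    # Each reference bigram is matched at most as often as it occurs
    # in the candidate (clipping).
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total


# Hypothetical toy example: 3 of the 5 reference bigrams are recovered.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
score = rouge2_recall(candidate, reference)
print(score)
```

Because recall is normalized by the reference length, a longer candidate can only keep or raise this score, which is why ROUGE evaluations typically fix a summary length budget.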

Mohammad Ebrahim Khademi received his M.S. degree in computer engineering from the Malek Ashtar University of Technology, Iran, in 2013. He is currently a PhD candidate in computer engineering there. His research interests include machine learning (deep learning) and natural language processing.

Mohammad Fakhredanesh received his B.S., M.S., and PhD degrees in computer science and engineering from the Amirkabir University of Technology (Tehran Polytechnic), Iran, in 2005, 2007, and 2014, respectively. He is currently an assistant professor at the Malek Ashtar University of Technology. His research interests are in the fields of artificial intelligence, pattern recognition, and text summarization.

Seyed Mojtaba Hoseini received his B.S. degree in electronic engineering from the Malek Ashtar University of Technology in 1991. He also received his M.S. and PhD degrees in computer architecture engineering from the Amirkabir University of Technology in 1995 and 2011, respectively. His research interests include wireless sensor networks, with an emphasis on target coverage and tracking applications, image and signal processing, and evolutionary computing.