The International Arab Journal of Information Technology (IAJIT), Vol. 14, No. 6, November 2017


Efficient Segmentation of Arabic Handwritten Characters Using Structural Features

Handwriting recognition is an important field with many practical applications, such as bank cheque processing, postal address processing, and zip code recognition. Most applications have been developed exclusively for Latin characters. Despite tremendous effort by researchers over the past three decades, the accuracy of Arabic handwriting recognition remains low, largely because of poor performance in determining the correct segmentation points. This paper presents an approach for character segmentation of unconstrained handwritten Arabic words. First, we seek all possible character segmentation points based on structural features. Next, we develop a novel technique that creates several paths for each possible segmentation point. These paths are used to differentiate between the types of segmentation points. Finally, we use heuristic rules and neural networks, drawing on the information related to each segmentation point, to select the correct segmentation points. For comparison, we applied our method to the IESK-arDB and IFN/ENIT databases, on which it achieved success rates of 91.6% and 90.5%, respectively.
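As a rough illustration of the first step (proposing every possible segmentation point before any filtering), the sketch below marks candidate cut columns in a binarized word image. It is not the method of this paper: the structural features used here are not specified in the abstract, so the sketch substitutes a simple vertical-projection heuristic, and the names `column_ink`, `candidate_cuts`, and `max_ink` are hypothetical.

```python
from typing import List


def column_ink(image: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binary word image."""
    return [sum(col) for col in zip(*image)]


def candidate_cuts(image: List[List[int]], max_ink: int = 1) -> List[int]:
    """Return one candidate cut column per run of thin columns.

    Assumption (not from the paper): a column is "thin" when it holds at
    most `max_ink` foreground pixels, a crude proxy for a connecting stroke
    between two characters. Runs touching the image border are ignored.
    """
    ink = column_ink(image)
    cuts: List[int] = []
    start = None
    # The appended sentinel is never thin, so it closes a run at the right edge.
    for x, count in enumerate(ink + [max_ink + 1]):
        if count <= max_ink:
            if start is None:
                start = x
        elif start is not None:
            end = x - 1
            if start > 0 and end < len(ink) - 1:
                cuts.append((start + end) // 2)  # midpoint of the thin run
            start = None
    return cuts


# Toy word: two blobs joined by a one-pixel-thick stroke on the bottom row.
word = [
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
]
print(candidate_cuts(word))  # -> [4]
```

A heuristic like this deliberately over-segments. In the approach summarized above, the candidates would then be filtered by heuristic rules and neural networks, which this sketch does not attempt.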

Mazen Bahashwan is currently a postgraduate student at the Computer Vision, Video and Image Processing Lab (CvviP), Faculty of Electrical Engineering, Universiti Teknologi Malaysia. His research interest is in the area of computer vision, particularly in Arabic handwriting recognition. He obtained his master's degree from Universiti Kebangsaan Malaysia in 2011.

Syed Abu-Bakar received his Ph.D. degree from the University of Bradford, England, in 1997. He joined Universiti Teknologi Malaysia (UTM) in 1992. Currently he is an associate professor in the Department of Electronics and Computer Engineering, Faculty of Electrical Engineering. His current research interest is in image processing, focusing on video security and surveillance, medical imaging, biometrics, and agricultural and industrial applications. He has published more than 150 scientific papers at both national and international levels. He is a senior member of IEEE.

Usman Sheikh received his Ph.D. degree (2009) in image processing and computer vision from Universiti Teknologi Malaysia. His research work is mainly on computer vision and embedded systems design. He is currently a Senior Lecturer at Universiti Teknologi Malaysia, Malaysia.