The International Arab Journal of Information Technology (IAJIT)


Towards A Distributed Arabic OCR Based on the DTW Algorithm: Performance Analysis

In spite of the diversity of printed Arabic optical character recognition products and proposals, the problem does not yet seem to be well solved. The complex morphology and calligraphy of Arabic writing on the one hand, and the use of some lightweight approaches on the other, are behind the poor quality of these products. However, some strong proposed approaches have not had the opportunity to be commercialised, generally because of their computational complexity. The dynamic time warping algorithm is considered one of these strong approaches. In fact, several studies and experiments have shown and confirmed that printed Arabic optical character recognition based on the dynamic time warping algorithm provides a very interesting recognition rate, especially for large and very large vocabularies. One of the attractive sides of the dynamic time warping algorithm is its ability to properly recognize connected or cursive characters (words or subwords) without prior segmentation. Furthermore, this algorithm performs the recognition process using a reference library of isolated characters and has very good immunity to noise. Unfortunately, the large amount of computation it requires during the recognition process makes its execution very slow and, hence, restricts its use. Many researchers have attempted to speed up the execution of this algorithm. Unfortunately, the proposed solutions generally require specific high-cost architectures. Loosely coupled architectures such as clusters or grid computing can provide enough power without additional cost to distribute the complexity of some greedy applications. Consequently, we report in this paper the performance analysis of an analytical and an experimental study of a distributed Arabic optical character recognition based on the dynamic time warping algorithm within loosely coupled architectures.
The obtained results confirm that loosely coupled architectures, and more specifically grid computing, present a very interesting framework to speed up Arabic optical character recognition based on the dynamic time warping algorithm.
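
To make the computational cost mentioned above concrete, the following is a minimal sketch of the basic dynamic time warping distance and of one possible way to spread the comparisons over several workers. It is an illustration under stated assumptions, not the method evaluated in the paper: the feature sequences are plain lists of numbers, the local cost is an absolute difference, the library is matched pattern by pattern (the connected, segmentation-free variant is not shown), and the names `dtw_distance`, `best_in_partition`, and `recognize` are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Basic DTW distance between two numeric sequences.

    Runs in O(len(a) * len(b)) time, which is the cost that grows
    quickly when every query must be compared with a large library.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def best_in_partition(query, partition):
    """Work of one node: compare the query with its share of the library."""
    return min((dtw_distance(query, ref), label) for label, ref in partition)


def recognize(query, library, n_workers=2, run=map):
    """Partition the library, find each local best, then keep the global best.

    `run` is any map-like callable; the built-in `map` runs sequentially.
    """
    items = sorted(library.items())
    partitions = [items[k::n_workers] for k in range(n_workers)]
    partitions = [p for p in partitions if p]  # drop empty shares
    local_bests = run(lambda p: best_in_partition(query, p), partitions)
    return min(local_bests)[1]


# Hypothetical toy library of reference patterns (not real character data).
library = {
    "alef": [1, 1, 1, 1],
    "ba": [1, 3, 3, 1],
    "ra": [4, 3, 2, 1],
}
query = [1, 1, 3, 3, 3, 1]  # a stretched version of the "ba" pattern
label = recognize(query, library, n_workers=2)
```

In this sketch the partitions are independent, so the sequential `map` could be swapped for a parallel map (for example the `map` method of a `concurrent.futures` executor) without changing the reduction step; how the work is actually scheduled on a cluster or grid is outside what this sketch shows.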

Maher Khemakhem received his master of science and his PhD degrees from the University of Paris 11, France, in 1984 and 1987, respectively. He is currently assistant professor in computer science at the Faculty of Economy and Management Sciences at the University of Sfax, Tunisia. His research interests include distributed systems, performance evaluation, and pattern recognition.

Abdelfettah Belghith received his master of science and his PhD degrees from the University of California at Los Angeles in 1982 and 1987, respectively. He has been a full professor at the National School of Computer Science, University of Mannouba, Tunisia, since 1992. His research interests include computer networks, wireless networks, multimedia Internet, mobile computing, distributed algorithms, simulation, and performance evaluation.