The International Arab Journal of Information Technology (IAJIT)


Tunisian Dialect Recognition Based on Hybrid Techniques

In this paper, an Arabic automatic speech recognition system is implemented to recognize the ten Arabic digits (zero to nine) spoken in the Tunisian dialect (Darija). The system is divided into two main modules: a feature extraction module that combines several conventional feature extraction techniques, and a recognition module based on Feed-Forward Back Propagation Neural Networks (FFBPNN). For this purpose, we prepared four speech corpora of our own, each recorded by five speakers. Each speaker pronounced the ten digits five times. The chosen speakers differ in gender, age, and physiological condition. Our experiments focus on a speaker-dependent system, and we also examine the speaker-independent case. The recognition rate reaches up to 98.5% when the feature extraction phase uses the Perceptual Linear Prediction (PLP) technique, followed first by its first-order temporal derivative (∆PLP) and then by Vector Quantization with the Linde-Buzo-Gray algorithm (VQLBG).
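The two post-processing steps named above, the first-order temporal derivative and LBG vector quantization, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes the PLP coefficients are already available as a list of per-frame vectors, uses the common regression formula for the derivative with a window of two frames on each side, and repeats the edge frames at the boundaries. The function names `delta` and `lbg` and the parameters `width`, `eps`, and `iters` are hypothetical choices made for this illustration.

```python
from typing import List

Vector = List[float]


def delta(frames: List[Vector], width: int = 2) -> List[Vector]:
    """First-order temporal derivative of a feature sequence.

    Uses the regression formula
        d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2),  n = 1..width.
    Edge frames are repeated at the boundaries (an assumption of this sketch).
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors: List[Vector], size: int, eps: float = 0.01,
        iters: int = 20) -> List[Vector]:
    """Build a codebook by LBG splitting; `size` should be a power of two."""
    codebook = [_mean(vectors)]
    while len(codebook) < size:
        # Split each codeword into two copies scaled by (1 + eps) and (1 - eps).
        # Note: an all-zero codeword cannot be split this way.
        codebook = [[c * (1 + s) for c in cw]
                    for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign every vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in vectors:
                nearest = min(range(len(codebook)),
                              key=lambda i: _dist2(v, codebook[i]))
                cells[nearest].append(v)
            # Move each codeword to the centroid of its cell;
            # an empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook
```

One plausible use, again an assumption and not a detail stated here, is to append `delta(plp_frames)` to the PLP vectors frame by frame and then call `lbg` on the result. Every utterance is then reduced to a codebook of the same size regardless of its duration, which gives the neural network a fixed-length input.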

Mohamed Hassine received a Diploma in Electrical Engineering in 1997, his Master's degree in 2005, and his PhD degree in Electrical Engineering in 2017 from the National School of Engineering of Monastir, University of Monastir, Tunisia. His current research interests include automatic speech recognition.

Lotfi Boussaid received a Diploma in Electrical Engineering in 1989 from the University of Monastir, Tunisia, his Master's degree in Nouvelles Technologies des Systèmes Informatiques Dédiés in 2003, and his PhD degree in Computer Science in 2006 from the University of Sfax. He was a member of LE2I, the Laboratory of Electronic, Computing and Imaging Sciences, Burgundy University, France. His current research interests include hardware-software design space exploration and prototyping strategies for real-time systems.

Hassani Messaoud received his Bachelor's degree in Electrical Engineering in 1983 and his Master of Science in Control Engineering in 1985 from the High Normal School of Technical Education (ENSET) in Tunis, Tunisia. He prepared his PhD in Control Engineering at the University of Nice-Sophia Antipolis, France, in 1993, and defended his Habilitation Diploma at the School of Engineers (ENIT) in Tunis, Tunisia. He is presently a Professor at the School of Engineers of Monastir, Tunisia (ENIM). His main interest is robustness in identification and control of non-linear systems, with application to diagnosis and equalization of numerical communication channels.