The International Arab Journal of Information Technology (IAJIT), Vol. 13, No. 5, September 2016


Hybrid SVM/HMM Model for the Arab Phonemes Recognition

Hidden Markov Models (HMM) are currently widely used in Automatic Speech Recognition (ASR) and are regarded as the most effective models. Yet they sometimes suffer from weak discrimination. Hybridizing Artificial Neural Networks (ANN), in particular Multi-Layer Perceptrons (MLP), with HMM is a promising technique to overcome this limitation. To improve the results of the recognition system, we use Support Vector Machines (SVM), which are characterized by high predictive and discriminative power. Incorporating SVM into HMM yields a new ASR system. Using 2800 occurrences of Arabic phonemes, this work presents a comparative study of the recognition systems, as follows: the standard HMM alone leads to a recognition rate of 66.98%; the hybrid MLP/HMM system achieves 73.78%; and our proposed SVM/HMM system gives the best performance, with a recognition rate of 75.8%.
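As an illustration of the general hybrid idea, the sketch below shows one common way a discriminative classifier can be coupled with an HMM: per-frame class posteriors are divided by the class priors to give scaled likelihoods, which then serve as emission scores in Viterbi decoding. This is a minimal sketch under stated assumptions, not necessarily the coupling used in this work. The function names, the two-state toy model, and all numeric values are hypothetical, and the posteriors are assumed to come from a classifier (such as an SVM with probabilistic outputs) trained elsewhere.

```python
import math

def scaled_log_likelihoods(posteriors, priors):
    """Turn per-frame posteriors P(q|x) into log(P(q|x) / P(q)).

    Assumption: the posteriors were produced by an external classifier.
    """
    return [[math.log(p) - math.log(priors[q]) for q, p in enumerate(frame)]
            for frame in posteriors]

def viterbi(log_emis, log_trans, log_init):
    """Return the most likely state sequence for the given emission scores."""
    n_states = len(log_init)
    # Best log-score of any path ending in each state at the first frame.
    delta = [log_init[q] + log_emis[0][q] for q in range(n_states)]
    backptrs = []
    for frame in log_emis[1:]:
        new_delta, ptr = [], []
        for q in range(n_states):
            # Best predecessor state for state q at this frame.
            best_prev = max(range(n_states),
                            key=lambda p: delta[p] + log_trans[p][q])
            ptr.append(best_prev)
            new_delta.append(delta[best_prev] + log_trans[best_prev][q] + frame[q])
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda q: delta[q])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical toy data: 4 frames, 2 states (not taken from the paper).
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
priors = [0.5, 0.5]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.3), math.log(0.7)]]
log_init = [math.log(0.5), math.log(0.5)]

path = viterbi(scaled_log_likelihoods(posteriors, priors), log_trans, log_init)
print(path)  # [0, 0, 1, 1]
```

Dividing by the priors keeps the classifier from counting class frequency twice, since the HMM transition structure already encodes how often each state occurs.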

Elyes Zarrouk received his MS degree in Computer Science from the Higher Institute of Computer of Monastir, Tunisia, in 2007. He obtained his MS degree in New Information Technologies and Dedicated Systems from the National School of Engineering in Sfax, Tunisia, in 2010. Currently, he is a PhD student in MIRACL (Multimedia Information System and Advanced Computing Laboratory), University of Sfax, Tunisia, focusing his research on speech recognition.

Yassine BenAyed graduated in Electrical Engineering from the National School of Engineering in Sfax, Tunisia, in 1998. He obtained his PhD degree in Signal and Image from Telecom ParisTech in 2003. Currently, he is an assistant professor in Electrical and Computer Engineering at the University of Sfax. His research focuses on pattern recognition, artificial intelligence, and speech recognition.