The International Arab Journal of Information Technology (IAJIT)



Design and Implementation of a Diacritic Arabic Text-To-Speech System

The absence of diacritical marks from modern Arabic text significantly increases its ambiguity, which can cause confusion about how a written word is pronounced. A reader with sufficient knowledge of Arabic can nevertheless recover the missing diacritics easily from the word's context and from the morphology and syntax of the language. This paper describes the design and implementation of a Text-To-Speech (TTS) system for diacritized Arabic text. The goal of the project is a high-quality speech synthesizer based on unit selection with a bigram model that takes the particularities of the language into account. The system takes diacritized Arabic text as input and produces the corresponding speech; the output is available in a male voice. The system is evaluated with both subjective and objective tests. The final evaluation of the GArabic TTS system is judged successful with respect to intelligibility and naturalness (listening tests) and quality (PESQ).

 


The International Arab Journal of Information Technology, Vol. 14, No. 4, July 2017

Aissa Amrouche is a PhD student at the Electronics and Computer Science Faculty, USTHB, Algeria, and a researcher at the Scientific and Technical Research Center for the Development of the Arabic Language. He received his Magister's degree from the Computer Science Faculty, USTHB, Algeria. His main interests include Arabic language processing and speech synthesis.

Leila Falek, Doctor of Electronics, Director of Research, Speech Communication and Signal Processing Laboratory, Electronics and Computer Science Faculty, Telecommunications Department, USTHB, Algiers.

Hocine Teffahi, Professor of Electronics, Director of Research, Speech Communication and Signal Processing Laboratory, Electronics and Computer Science Faculty, Telecommunications Department, USTHB, Algiers.