The International Arab Journal of Information Technology (IAJIT)


An Efficient Mispronunciation Detection System Using Discriminative Acoustic Phonetic Features

Mispronunciation detection is an important component of Computer-Assisted Language Learning (CALL) systems. It helps students learn new languages and focus on their individual pronunciation problems. In this paper, a novel discriminative Acoustic Phonetic Feature (APF) based technique is proposed to detect mispronunciations using an artificial neural network classifier. Using domain knowledge, Arabic consonants are categorized into two groups based on their acoustic similarities. The first group consists of consonants with similar ending sounds, and the second group consists of consonants with completely different sounds. The proposed technique requires discriminative acoustic features for classifier training. To extract these features, the discriminative parts of the Arabic consonants are identified. As a test case, a dataset is collected from native and non-native speakers, male and female, and children of different ages. This dataset comprises 5600 isolated Arabic consonants. The average accuracy of the system is 73.57% when tested with simple acoustic features, and the use of discriminative acoustic features improves it to 82.27%. Some consonant pairs that are acoustically very similar produced poor results and are termed Bad Phonemes. A subjective analysis has also been carried out to verify the effectiveness of the proposed system.
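
The idea of extracting features only from the discriminative part of a consonant can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which acoustic-phonetic features or which parts of each consonant are used. The sketch assumes that, for consonants whose endings sound alike, the earlier part of the recording carries the distinguishing information. Short-time energy and zero-crossing rate stand in for the real features. The function names, the `start_frac`/`end_frac` parameters, and the synthetic signal are all hypothetical.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (ragged tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def discriminative_features(samples, start_frac, end_frac, frame_len=80, hop=40):
    """Average energy and ZCR over only the chosen part of the recording.

    start_frac and end_frac are hypothetical parameters marking the part of
    the signal that is assumed to be discriminative.
    """
    lo = int(len(samples) * start_frac)
    hi = int(len(samples) * end_frac)
    frames = frame_signal(samples[lo:hi], frame_len, hop)
    if not frames:
        raise ValueError("segment shorter than one frame")
    energy = sum(short_time_energy(f) for f in frames) / len(frames)
    zcr = sum(zero_crossing_rate(f) for f in frames) / len(frames)
    return [energy, zcr]

# Synthetic stand-in for an isolated consonant (assumption, not real speech):
# a weak high-frequency onset followed by a strong low-frequency ending.
rate = 8000
onset = [0.3 * math.sin(2 * math.pi * 3000 * n / rate) for n in range(800)]
ending = [0.8 * math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
signal = onset + ending

whole = discriminative_features(signal, 0.0, 1.0)       # "simple" features
onset_only = discriminative_features(signal, 0.0, 0.5)  # restricted features
```

In this toy signal the loud shared ending dominates the whole-signal features, while the features computed on the onset alone keep its high zero-crossing rate visible. Feature vectors of this kind would then be passed to a classifier, which in the paper is an artificial neural network.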


The International Arab Journal of Information Technology, Vol. 16, No. 2, March 2019

Muazzam Maqsood is currently pursuing his Ph.D. in Software Engineering at the University of Engineering and Technology, Taxila. He completed his MS degree in 2013 at the University of Engineering and Technology, Taxila. His research interests include Speech Processing, Machine Learning, Recommender Systems and Image Processing.

Adnan Habib completed his MS (Electrical Engineering) in 2004 and his Ph.D. (Electrical Engineering) in 2007 at the University of Engineering and Technology, Taxila, Pakistan. He is currently serving as Head of the Department of Computer Science at UET Taxila, Pakistan. His research interests include Speech Processing, Image and Video Processing, Software Development, Artificial Intelligence and Artificial Neural Networks.

Tabassam Nawaz received his MS in Computer Engineering in 2005 from CASE (Center for Advanced Studies in Engineering), Islamabad, Pakistan, and subsequently completed his Ph.D. in 2008. He is currently serving as Head of the Department of Software Engineering. His research interests include Image and Video Processing, Software Development, Artificial Intelligence and Web Development.