The International Arab Journal of Information Technology (IAJIT)



A Model for English to Urdu and Hindi Machine Translation System using Translation Rules and

This paper describes the architecture and operation of a proposed multilingual machine translation system that translates from English to Urdu and Hindi. The system combines a translation-rule-based approach with an artificial neural network. Efficient pattern matching and the ability to learn from examples make neural networks suitable for implementing a translation-rule-based machine translation system. The paper also discusses the importance of machine translation systems and the status of languages in a multilingual country such as India. The translation output of the system was evaluated using several methods: n-gram BLEU score, F-measure, Metric for Evaluation of Translation with Explicit ORdering (METEOR), and precision and recall. For around 500 Hindi test sentences, the system achieved an n-gram BLEU score of 0.5903, a METEOR score of 0.7956, and an F-score of 0.7916. For Urdu, it achieved an n-gram BLEU score of 0.6054, a METEOR score of 0.8083, and an F-score of 0.8250.
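To make the evaluation measures named above concrete, the following is a minimal sketch of clipped n-gram precision, recall, and their harmonic-mean F-measure for a single candidate/reference pair. It is not the authors' evaluation script: the function names and the toy English tokens are assumptions made for illustration. Full BLEU additionally combines several n-gram orders with a brevity penalty, and METEOR adds its own matching and penalty terms; neither is reproduced here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_matches(candidate, reference, n=1):
    """Count candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference (the "clipping" used in BLEU's modified precision).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    return sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())


def precision_recall_f(candidate, reference, n=1):
    """Return (precision, recall, F-measure) over n-grams of order n."""
    matches = clipped_matches(candidate, reference, n)
    cand_total = max(len(candidate) - n + 1, 0)
    ref_total = max(len(reference) - n + 1, 0)
    precision = matches / cand_total if cand_total else 0.0
    recall = matches / ref_total if ref_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical toy sentences, not taken from the paper's test set.
    candidate = "the cat sat on the mat".split()
    reference = "the cat is on the mat".split()
    for order in (1, 2):
        p, r, f = precision_recall_f(candidate, reference, order)
        print(f"{order}-gram: precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

For the toy pair, five of the six candidate unigrams match the reference, so unigram precision, recall, and F-measure are each 5/6. Three of the five bigrams match, giving 0.6 at order 2.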


[1] Akeel M. and Mishra R., “ANN and Rule Based Method for English to Arabic Machine Translation,” The International Arab Journal of Information Technology, vol. 11, no. 4, pp. 396-405, 2014.

[2] Banerjee S. and Lavie A., “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments,” in Proceedings of The ACL Workshop on Intrinsic And Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, pp. 65-72, 2005.

[3] De Marneffe M., MacCartney B., and Manning C., “Generating Typed Dependency Parses from Phrase Structure Parses,” in Proceedings of the 5th International Conference on Language Resources and Evaluation, Genoa, pp. 449-454, 2006.

[4] Hutchins W., Machine Translation: Past, Present, Future, Ellis Horwood, Chichester, 1987.

[5] Jain A., “Parsing Complex Sentences with Structured Connectionist Networks,” Neural Computation, vol. 3, no. 1, pp. 110-120, 1991.

[6] Johnson R., King M., and Tombe L., “EUROTRA: A Multilingual System under Development,” Computational Linguistics, vol. 11, no. 2-3, pp. 155-169, 1985.

[7] Lewis M., Simons G., and Fennig C., Ethnologue: Languages of the World, SIL International, 2015.

[8] Malik M., Boitet C., and Bhattacharyya P., “Hindi Urdu Machine Transliteration Using Finite-State Transducers,” in Proceedings of the 22nd International Conference on Computational Linguistics, Manchester, pp. 537-544, 2008.

[9] Masica C., The Indo-Aryan Languages, Cambridge University Press, 1993.

[10] Mishra V. and Mishra R., “ANN and Rule based Model for English to Sanskrit Machine Translation,” INFOCOMP Journal of Computer Science, vol. 9, no. 1, pp. 80-89, 2009.

[11] Nirenburg S., Machine Translation: Theoretical and Methodological Issues, Cambridge University Press, 1987.

[12] Papineni K., Roukos S., Ward T., and Zhu W., “BLEU: A Method for Automatic Evaluation of Machine Translation,” in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, pp. 311-318, 2002.

[13] Pedtke T., “US Government Support and Use of Machine Translation: Current Status,” in Proceedings of MT Summit, San Diego, pp. 3-13, 1997.

[14] Schmidt R., Urdu, an Essential Grammar, Psychology Press, 1999.

[15] Shahnawaz N. and Mishra R., “An English to Urdu Translation Model Based on CBR, ANN and Translation Rules,” International Journal of Advanced Intelligence Paradigms, vol. 7, no. 1, pp. 1-22, 2015.

[16] Sinha R., “Mining Complex Predicates in Hindi Using A Parallel Hindi-English Corpus,” in Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications, Singapore, pp. 40-46, 2009.

[17] Suratgar A., Tavakoli M., and Hoseinabadi A., “Modified Levenberg-Marquardt Method for Neural Networks Training,” International Journal of Computer and Information Engineering, vol. 1, no. 6, pp. 46-48, 2005.

[18] Toma P., “Systran as A Multilingual Machine Translation System,” in Proceedings of the 3rd European Congress on Information Systems and Networks, Overcoming the Language Barrier, München, pp. 569-581, 1977.

[19] Toutanova K., Klein D., Manning C., and Singer Y., “Feature-Rich Part-of-Speech Tagging with A Cyclic Dependency Network,” in Proceedings of Conference of The North American Chapter of The Association for Computational Linguistics on Human Language Technology, Edmonton, pp. 173-180, 2003.

[20] Turian J., Shea L., and Melamed I., “Evaluation of Machine Translation and its Evaluation,” in Proceedings of the MT Summit IX, New Orleans, pp. 386-393, 2003.

[21] Vikas O., “Multilingualism for Cultural Diversity and Universal Access in Cyberspace: An Asian Perspective,” in Proceedings of Thematic Meeting for the World Summit on the Information Society, UNESCO, pp. 1-49, 2005.

[22] W3Techs, Usage of Content Languages for Websites, W3Techs.com, Last Visited, 2015.

[23] Waibel A., Jain A., McNair A., Saito H., Hauptmann A., and Tebelskis J., “JANUS: A Speech-to-Speech Translation System Using Connectionist and Symbolic Processing Strategies,” in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Toronto, pp. 793-796, 1991.

Shahnawaz Khan (Ph. D.) is currently working as an Assistant Professor in Saudi Electronic University, Saudi Arabia. He has received his Ph. D. in Computer Science from Indian Institute of Technology, Banaras Hindu University (IT-BHU), Varanasi, India. He has around 17 years of experience in teaching and research and software industry. He is a member of editorial board in. He has been editor-in-chief for 1 international conference proceedings and 1 national conference proceedings. He is author of more than 20 research papers in refereed journals and International conferences. He has supervised 5 Master dissertations. His research interests are in smartphone sensing, natural language processing, machine learning and image processing.

Imran Usman received his BE degree in Software Engineering from Foundation University, Pakistan in 2003 and MS Computer System Engineering from Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Pakistan in 2006. He joined Pakistan Institute of Engineering and Applied Sciences as a research scholar and received his PhD degree in 2010. From 2009 to 2010 he served at Iqra University Islamabad, Pakistan as an Assistant Professor in Department of Computing and Technology. From 2010 to 2012 he served as Assistant Professor and Senior In-charge Graduate Program in the Department of Electrical Engineering at COMSATS Institute of Information Technology Islamabad, Pakistan. He is presently serving as Assistant Professor in College of Computing and Informatics, Saudi Electronic University, Kingdom of Saudi Arabia. His present research interests include machine learning, digital image processing, evolutionary computation and digital watermarking. Dr. Usman has a number of research papers to his credit and has supervised many BS, MS and PhD students. He has been awarded Research Productivity Award in 2011, 2012, 2013 and 2014 by COMSATS Institute