Translation Rules for English to Hindi Machine Translation System: Homoeopathy Domain


Keywords #Machine translation #stemmer #PoS tagging #grammar rules #homoeopathy #corpus

Abstract: A rule-based machine translation system relies on a set of grammar rules that map the syntactic representations of a source language onto the target language. Such a system requires good linguistic knowledge to write the rules, as well as knowledge sources such as a corpus and a bilingual dictionary. In this paper, we describe the grammar rules designed for our English to Hindi machine translation system, which translates homoeopathic literature, medical reports, prescriptions, etc. The rules follow the transfer-based approach for reordering between the two languages. The paper first discusses our stemmer and its rules, then the Part of Speech (PoS) tagging rules for categorizing each word of a sentence grammatically, then our homoeopathy corpus in English and Hindi, of 20085 and 20072 words respectively, and finally the agreement/translation rules for translating various homoeopathic sentences.
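As a rough illustration of the kind of reordering a transfer-based rule performs, the sketch below moves the verb group of a single-clause, PoS-tagged English sentence to the end, reflecting the shift from English subject-verb-object order to Hindi subject-object-verb order. It is not the authors' rule set: the tag names (`NN`, `VB`, `AUX`), the function `reorder_svo_to_sov`, and the sample sentence are all assumptions made for this example.

```python
# Toy transfer-style reordering rule (hypothetical, not the paper's rules).
# English is subject-verb-object (SVO); Hindi is subject-object-verb (SOV).

# Hypothetical tag names marking the verb group of a clause.
VERB_TAGS = {"VB", "AUX"}


def reorder_svo_to_sov(tagged):
    """Move verb-group tokens of a single-clause sentence to the end.

    `tagged` is a list of (word, tag) pairs. The relative order of the
    non-verb tokens and of the verb tokens is preserved.
    """
    verbs = [tok for tok in tagged if tok[1] in VERB_TAGS]
    others = [tok for tok in tagged if tok[1] not in VERB_TAGS]
    return others + verbs


# Made-up example sentence, already stemmed and tagged.
sentence = [("patient", "NN"), ("takes", "VB"), ("medicine", "NN")]
reordered = reorder_svo_to_sov(sentence)
print([word for word, _ in reordered])
```

A real system would apply such a rule only after stemming and PoS tagging, and would follow it with dictionary lookup and agreement rules on the target side; none of those stages are modeled here.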

References

[1] Al-Taani A. and Abu Al-Rub S., A Rule-Based Approach for Tagging Non-Vocalized Arabic Words, the International Arab Journal of Information Technology, vol. 6, no. 3, pp. 320-328, 2009.

[2] Bahadur P., Jain A., and Chauhan D., EtranS-A Complete Framework for English to Sanskrit Machine Translation, the International Journal of Advanced Computer Science and Applications, vol. 2, no. 1, pp. 52-59, 2012.

[3] Batra K. and Lehal G., Rule Based Machine Translation of Noun Phrases from Punjabi to English, the International Journal of Computer Science Issues, vol. 7, no. 5, pp. 409-413, 2010.

[4] CLAWS Part-of-Speech Tagger for English, available at: http://ucrel.lancs.ac.uk/claws/, last visited 2013.

[5] CliniTrans: Professional Medical Translation Services, available at: http://www.1-800-translate.com/CliniTrans, last visited 2013.

[6] Dwivedi S. and Sukhadeve P., Rule based Part of Speech Tagger for Homoeopathy Clinical Realm, the International Journal of Computer Science, vol. 8, no. 2, pp. 350-354, 2011.

[7] Francisca J., Mamun M., and Rahman M., Adapting Rule Based Machine Translation From English to Bangla, Indian Journal of Computer Science and Engineering, vol. 2, no. 3, pp. 334-342, 2011.

[8] Jurafsky D. and Martin J., Speech and Language Processing, Prentice-Hall, 2000.

[9] Jurafsky D. and Martin J., Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall, 2002.

[10] Krovetz R., Viewing Morphology as an Inference Process, in Proceedings of the 16th International Conference on Research and Development in Information Retrieval, PA, USA, pp. 191-202, 1993.

[11] Rahul C., Dinunath K., Ravindran R., and Soman K., Rule Based Reordering and Morphological Processing for English-Malayalam Statistical Machine Translation, in Proceedings of International Conference on Advances in Computing, Control and Telecommunication Technologies, Kerala, India, pp. 458-460, 2009.

The International Arab Journal of Information Technology, Vol. 12, No. 6A, 2015

[12] Raji P., Reordering Approach in English-Malayalam Statistical Machine Translation, Master's Thesis, Coimbatore, India, 2010.

[13] Shabdkosh शब्दकोश: English Hindi Dictionary and Translation, available at: http://www.Shabdkosh.com, last visited 2013.

[14] Sinha R., Sivaraman K., Agrawal A., Jain R., Srivastava R., and Jain A., ANGLABHARTI: A Multilingual Machine Aided Translation Project on Translation from English to Indian Languages, in Proceedings of International Conference on Systems, Man and Cybernetics Intelligent Systems for the 21st Century, Vancouver, Canada, pp. 1609-1614, 1995.

[15] Sukhadeve P. and Dwivedi S., Advancement of Clinical Stemmer, available at: http://languageinindia.com/may2011/kommaluricomplete.pdf#page=51, last visited 2013.

[16] Sukhadeve P. and Dwivedi S., Developing Hindi POS Tagger for homoeopathy Clinical language, in Proceedings of the 2nd International Conference Advances in Computer Science and Information Technology, Bangalore, India, pp. 310-316, 2012.

[17] Sukhadeve P. and Dwivedi S., Enlargement of Clinical Stemmer in Hindi Language of Homoeopathy Province, in Proceedings of the 2nd International Conference Advances in Computer Science and Information Technology, Bangalore, India, pp. 239-248, 2012.

[18] The Stanford Natural Language Processing Group, available at: http://nlp.stanford.edu/software/tagger.html, last visited 2013.

[19] Twitter Part-of-Speech Tagging, available at: http://www.ark.cs.cmu.edu/TweetNLP/, last visited 2013.

[20] Unnikrishnan P., Antony P., and Soman K., A Novel Approach for English to South Dravidian Language Statistical Machine Translation System, the International Journal on Computer Science and Engineering, vol. 2, no. 8, pp. 2749-2759, 2010.

Sanjay Dwivedi obtained his PhD degree from Banasthali Vidyapeeth in 2006, in the area of web mining. His research interests include web content mining, semantic web, search engine performance evaluation, and e-governance. He has published many research papers in various national and international journals. He is presently working as Associate Professor in the Computer Science Department of BBAU, India.

Pramod Sukhadeve obtained his MSc degree in 2006 from Nagpur University. His research interests are natural language processing, machine translation systems, and homoeopathy. He has published research papers in refereed journals and international conferences. He is presently pursuing full-time research at BBA University (A Central University), Lucknow.