The International Arab Journal of Information Technology (IAJIT)



Tamil Lang TSP: Tamil Lang Transformer Neural Text to Sign Production

Tamil Lang Task-Specific Prompts (TSP) is a machine translation system that converts Tamil text into Tamil Sign Language. The system integrates neural machine translation with motion-graph technology to generate sign language automatically from input text. The process begins with an analysis of the morpho-syntactic structure of the Tamil sentence, which is then converted into American Sign Language (ASL) notation. From this notation a gloss sequence is produced, which serves as the basis for constructing a motion graph. The motion graph is then used to create pose sequences that align with the generated gloss. This approach represents the first complete pipeline for translating Tamil text into corresponding sign sequences. Its translation capabilities are evaluated both quantitatively and qualitatively on a custom-built dataset, and its performance is compared with a German-language translation system, providing insight into its effectiveness.
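The text-to-gloss-to-motion-graph-to-pose pipeline described above can be sketched in miniature. This is a hypothetical illustration only: the toy lexicon, the graph representation, and the placeholder "pose frames" are assumptions for demonstration, not the authors' actual models or data.

```python
# Minimal sketch of the pipeline stages: Tamil text -> gloss -> motion
# graph -> pose sequence. All values are illustrative placeholders.

def tamil_to_gloss(sentence):
    """Stand-in for the morpho-syntactic analysis and ASL-notation step:
    map each Tamil token to an uppercase gloss via a toy lexicon."""
    lexicon = {"நான்": "I", "பள்ளிக்கு": "SCHOOL", "செல்கிறேன்": "GO"}
    return [lexicon.get(tok, tok.upper()) for tok in sentence.split()]

def build_motion_graph(glosses):
    """Stand-in for motion-graph construction: link each gloss to its
    successor so a walk through the graph yields a continuous sequence.
    (Assumes glosses are distinct; a real graph would merge repeats.)"""
    return {g: ([glosses[i + 1]] if i + 1 < len(glosses) else [])
            for i, g in enumerate(glosses)}

def graph_to_poses(graph, start):
    """Walk the motion graph from the start gloss, emitting one
    placeholder pose frame (a labelled string) per node visited."""
    poses, node = [], start
    while node is not None:
        poses.append(f"pose<{node}>")
        successors = graph.get(node, [])
        node = successors[0] if successors else None
    return poses

glosses = tamil_to_gloss("நான் பள்ளிக்கு செல்கிறேன்")  # "I go to school"
graph = build_motion_graph(glosses)
poses = graph_to_poses(graph, glosses[0])
```

In the full system each stage is far richer (a transformer translator for glossing, captured sign motion data in the graph, skeletal pose frames as output), but the data flow between stages is the same as this sketch.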
