The International Arab Journal of Information Technology (IAJIT)

CANBLWO: A Novel Hybrid Approach for Semantic Text Generation

Semantic text generation is critical in Natural Language Processing (NLP), yet it faces challenges such as maintaining coherence across texts, preserving contextual relevance, and producing high-quality output. Traditional language models often produce grammatically inconsistent text. To address these issues, we introduce Convolutional Attention Bi-LSTM with Whale Optimization (CANBLWO), a novel hybrid model that integrates a Convolutional Attention Network (CAN), Bidirectional Long Short-Term Memory (Bi-LSTM), and the Whale Optimization Algorithm (WOA). CANBLWO aims to generate semantically rich and coherent text, and it outperforms traditional models such as Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNN), Bi-LSTM, Bi-LSTM with attention, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer 2 (GPT-2). Our model achieved scores of 0.79, 0.78, 0.76, and 0.82 on the Metric for Evaluation of Translation with Explicit ORdering (METEOR), Bilingual Evaluation Understudy (BLEU), Consensus-based Image Description Evaluation (CIDEr), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, respectively. The proposed model also demonstrates 97% and 96% accuracy on the Wiki-Bio and Code/Natural Language Challenge (CoNaLa) datasets, highlighting its effectiveness against Large Language Models (LLMs). This study underscores the potential of hybrid approaches for enhancing semantic text generation.
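For concreteness, the sketch below shows one plausible wiring of the three components named above, written in TensorFlow/Keras. It is an illustrative reconstruction under stated assumptions, not the authors' exact configuration: the vocabulary size, sequence length, kernel width, and all layer sizes are placeholder values, and the model is framed as a next-token predictor.

```python
# Illustrative sketch only: one plausible wiring of CANBLWO's components
# (Conv1D feature extractor -> attention -> Bi-LSTM -> softmax vocabulary head).
# All sizes below are assumed placeholder values, not the paper's settings.
from tensorflow.keras import layers, models

VOCAB_SIZE = 8000   # assumed vocabulary size
SEQ_LEN = 40        # assumed input sequence length (tokens)
EMBED_DIM = 128     # assumed embedding width

def build_canblwo_like(filters=64, kernel_size=3, lstm_units=128, dropout=0.2):
    """Next-token predictor: P(w_t | w_{t-SEQ_LEN}, ..., w_{t-1})."""
    inp = layers.Input(shape=(SEQ_LEN,))
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inp)
    # Convolutional feature extraction over local token windows
    x = layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)
    # Dot-product self-attention over the convolutional feature sequence
    x = layers.Attention()([x, x])
    # Bi-LSTM summarizes the attended sequence in both directions
    x = layers.Bidirectional(layers.LSTM(lstm_units))(x)
    x = layers.Dropout(dropout)(x)
    out = layers.Dense(VOCAB_SIZE, activation="softmax")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

model = build_canblwo_like()
model.summary()
```

The Whale Optimization Algorithm component follows the encircling, bubble-net spiral, and random-search update rules of Mirjalili and Lewis's original formulation. A minimal sketch is given below, tuning a two-dimensional hyperparameter vector (e.g., dropout rate and LSTM width) against a toy quadratic objective that stands in for validation loss; the bounds and objective are assumptions for illustration only.

```python
# Minimal Whale Optimization Algorithm sketch (after Mirjalili and Lewis, 2016).
# The toy objective below is a stand-in; in a CANBLWO-style pipeline the
# objective would instead be a trained model's validation error.
import numpy as np

def woa_minimize(objective, bounds, n_whales=10, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    pos = rng.uniform(lo, hi, size=(n_whales, len(lo)))
    fitness = np.apply_along_axis(objective, 1, pos)
    best = pos[fitness.argmin()].copy()
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter              # a decreases linearly from 2 to 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a      # exploration/exploitation coefficient
            C = 2.0 * rng.random()
            if rng.random() < 0.5:              # shrinking-encirclement branch
                if abs(A) < 1:                  # exploit: move toward the best whale
                    D = abs(C * best - pos[i])
                    pos[i] = best - A * D
                else:                           # explore: move toward a random whale
                    rand = pos[rng.integers(n_whales)]
                    D = abs(C * rand - pos[i])
                    pos[i] = rand - A * D
            else:                               # spiral bubble-net update
                l = rng.uniform(-1, 1)
                D = abs(best - pos[i])
                pos[i] = D * np.exp(l) * np.cos(2 * np.pi * l) + best
            pos[i] = np.clip(pos[i], lo, hi)
        fitness = np.apply_along_axis(objective, 1, pos)
        if fitness.min() < objective(best):
            best = pos[fitness.argmin()].copy()
    return best

# Toy usage: minimize a quadratic bowl standing in for validation loss.
bounds = np.array([[0.0, 0.5], [32.0, 256.0]])   # e.g., dropout rate, LSTM units
best = woa_minimize(lambda v: (v[0] - 0.2) ** 2 + ((v[1] - 128) / 128) ** 2, bounds)
print("best hyperparameters:", best)
```

In practice, each WOA fitness evaluation would require training (or partially training) a candidate network such as the one sketched above, which is why small population sizes and iteration counts are typically used.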
