ERDAP: A Novel Method of Event Relation Data Augmentation Based on Relation Prediction
Event relation extraction is a key task in event evolutionary graph construction, knowledge-based question answering, intelligence analysis, and related fields. Current approaches mostly rely on supervised learning with large amounts of labeled data; however, existing event relation datasets are small and cannot provide sufficient training data for these models. To alleviate this problem, this study proposes a novel data augmentation model, Event Relation Data Augmentation based on relation Prediction (ERDAP). ERDAP takes both semantic and structural features into account while preserving the compatibility of semantic relation labels, uses an event relation graph convolutional neural network to predict event relations, and adds the generated high-quality event relation triples to the event relation texts as new training data. Experimental evaluation with an event causality extraction method on a Chinese emergent event dataset shows that our model significantly outperforms existing text augmentation methods and achieves strong performance, providing a new approach to event relation data augmentation.
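To make the relation-prediction step concrete, the following is a minimal sketch, assuming a PyTorch environment, a relation-aware graph-convolution encoder over an event graph, and a DistMult-style scorer for candidate triples. The names (EventRelationGCN, score_triples), the number of relation types, and the confidence threshold are illustrative assumptions for exposition, not the paper's implementation.

    # Minimal sketch of predicting event relations over an event graph and
    # keeping high-confidence triples as augmented training data.
    # All names and hyperparameters below are hypothetical.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    NUM_EVENTS, NUM_RELATIONS, DIM = 1000, 4, 128  # e.g. causal, temporal, ... labels

    class EventRelationGCN(nn.Module):
        """One relation-aware graph-convolution layer over an event graph,
        followed by a DistMult-style decoder that scores candidate triples."""

        def __init__(self, num_events, num_relations, dim):
            super().__init__()
            self.event_emb = nn.Embedding(num_events, dim)
            # one transformation per relation type, in the spirit of R-GCN
            self.rel_weight = nn.Parameter(torch.randn(num_relations, dim, dim) * 0.01)
            # relation embeddings for the DistMult scorer
            self.rel_emb = nn.Embedding(num_relations, dim)

        def encode(self, edges):
            # edges: LongTensor of shape (E, 3) with rows (head, relation, tail)
            h = self.event_emb.weight
            msg = torch.zeros_like(h)
            for head, rel, tail in edges.tolist():
                # pass a relation-specific message from head event to tail event
                msg[tail] += h[head] @ self.rel_weight[rel]
            return F.relu(h + msg)

        def score_triples(self, node_repr, triples):
            # DistMult score <h, r, t>: higher means the triple is more plausible
            h = node_repr[triples[:, 0]]
            r = self.rel_emb(triples[:, 1])
            t = node_repr[triples[:, 2]]
            return (h * r * t).sum(dim=-1)

    # Usage: score unlabeled event pairs and keep high-confidence predictions
    # as new (pseudo-labeled) training triples; the 0.9 threshold is hypothetical.
    model = EventRelationGCN(NUM_EVENTS, NUM_RELATIONS, DIM)
    known = torch.tensor([[0, 1, 2], [2, 0, 3]])        # labeled event-relation triples
    candidates = torch.tensor([[0, 1, 3], [1, 2, 4]])   # candidate triples to verify
    scores = model.score_triples(model.encode(known), candidates)
    augmented = candidates[torch.sigmoid(scores) > 0.9]  # retained as augmentation data

In practice such a sketch would be trained on the labeled triples before scoring candidates; it is shown here only to illustrate how graph-based relation prediction can generate additional event relation triples for augmentation.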