The International Arab Journal of Information Technology (IAJIT), Vol. 18, No. 5, September 2021


Headnote Prediction Using Machine Learning

Sarmad Mahar¹, Sahar Zafar², and Kamran Nishat¹
¹CoCIS, PAF-Karachi Institute of Economics and Technology, Pakistan
²Computer Science, Sindh Madressatul Islam University, Pakistan

Headnotes are precise explanations and summaries of the legal points in an issued judgment. Law journals hire experienced lawyers to write them, and they help the reader quickly determine the issue discussed in a case. A headnote comprises two parts: the first states the topic discussed in the judgment, and the second summarizes that judgment. In this thesis, we design, develop, and evaluate headnote prediction using machine learning, without human involvement. We divide the task into a two-step process. In the first step, we predict the law points used in the judgment with text classification algorithms. In the second step, we generate a summary of the judgment with text summarization techniques. To this end, we created a databank by extracting data from different law sources in Pakistan, and we labelled the training data on the basis of Pakistani law websites. We tested different feature extraction methods on judiciary data to improve our system and, using these methods, developed a dictionary of terminology for ease of reference and utility. Our approach achieves 65% accuracy using Linear Support Vector Classification with trigram features and no stemmer. With active learning, the system can continue to improve its accuracy as its users provide more labelled examples.
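The first step described in the abstract, predicting law points with a linear classifier over trigram features and no stemming, can be illustrated with a minimal sketch. The abstract does not name the library, tokenizer, or hyperparameters that were used, so everything below is an assumption: the function names (`word_ngrams`, `featurize`, `train_one_vs_rest`, `predict`), the learning rate and regularization values, and the four toy judgments and their labels are hypothetical. The classifier is a small one-vs-rest linear model trained on the hinge loss by subgradient descent, which has the same objective family as a linear SVM but is not claimed to be the authors' implementation.

```python
import re
from collections import Counter


def word_ngrams(text, n_max=3):
    """Return all word n-grams for n = 1..n_max (lowercased, no stemming)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def featurize(text, n_max=3):
    """Sparse bag-of-n-grams vector as an {ngram: count} mapping."""
    return Counter(word_ngrams(text, n_max))


def score(weights, features):
    """Dot product of a sparse weight vector and a sparse feature vector."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())


def train_one_vs_rest(docs, labels, epochs=20, lr=0.1, reg=0.01):
    """Fit one L2-regularised hinge-loss linear model per label."""
    vectors = [featurize(d) for d in docs]
    models = {}
    for target in sorted(set(labels)):
        w = {}
        for _ in range(epochs):
            for x, label in zip(vectors, labels):
                y = 1.0 if label == target else -1.0
                margin = y * score(w, x)
                for f in w:  # L2 shrinkage applied at every step
                    w[f] *= 1.0 - lr * reg
                if margin < 1.0:  # hinge loss is active: move towards y * x
                    for f, v in x.items():
                        w[f] = w.get(f, 0.0) + lr * y * v
        models[target] = w
    return models


def predict(models, text):
    """Return the label whose linear model gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda label: score(models[label], x))


# Hypothetical toy data, invented purely for illustration.
TRAIN_DOCS = [
    "The accused was granted bail after arrest in a criminal case",
    "Bail before arrest was refused to the accused",
    "The tenant was evicted for default in payment of rent",
    "The landlord sought ejectment of the tenant for unpaid rent",
]
TRAIN_LABELS = ["criminal", "criminal", "rent", "rent"]

MODELS = train_one_vs_rest(TRAIN_DOCS, TRAIN_LABELS)
print(predict(MODELS, "The accused applied for bail"))
```

Keeping the surface forms unstemmed means that multi-word legal phrases such as "bail before arrest" survive intact as trigram features, which is one plausible reason an unstemmed trigram configuration could perform well on judgment text.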

Sarmad Mahar received his MS (CS) degree from PAF-Karachi Institute of Economics and Technology. His research interests include artificial intelligence, information processing, pattern recognition, and natural language processing.

Sahar Zafar Jumani is pursuing a PhD in the Department of Computer Science at the University of Karachi. She currently works as a Lecturer at Sindh Madressatul Islam University (SMIU), a public-sector university. Her research areas are natural language processing and artificial intelligence.

Kamran Nishat is an Assistant Professor at PAF-Karachi Institute of Economics and Technology. He is pursuing postdoctoral research at the University of Waterloo.