The International Arab Journal of Information Technology (IAJIT)


Induction of Co-existing Items Available in Distributed Version Control Systems for Software Development

Sibel Özyer

Software development in Open-Source Software (OSS) systems allows developers to share their code and modify other developers' code, which leads to collaborative development. Developers can also discuss the items to be developed, including the errors and technical problems they have faced. One popular OSS platform is GitHub, which already hosts a large number of developers and projects. The data residing in the issues section of GitHub is large, complex, and unstructured, and it can be processed to yield novel discoveries. This work concentrates on one selected project and analyzes it systematically. Routine Extract, Transform and Load (ETL) steps are identified to clean the data before natural language processing is applied to prioritize the requirements and act on them in a collaborative environment. Our work uses terms and guides developers in tracking the co-occurrence of terms used together, to help them focus on the important issues.
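As an illustration of the term co-occurrence idea, the following is a minimal sketch and not the method used in the paper. It assumes issue texts are already available as plain strings, uses a small hypothetical stop list, and counts how many issues each pair of terms appears in together. The names `STOP_WORDS`, `extract_terms`, `count_cooccurrences`, and the sample issue titles are all invented for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop list for illustration; the stop list used in the paper is not given here.
STOP_WORDS = {"the", "a", "an", "in", "on", "is", "to", "when", "and", "of", "after"}


def extract_terms(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words.

    This stands in for the cleaning (ETL) step; a real pipeline would do more.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def count_cooccurrences(issues):
    """Count, for each unordered pair of terms, the number of issues containing both."""
    pair_counts = Counter()
    for text in issues:
        # Sorting gives each pair a canonical order, e.g. ("crash", "file").
        terms = sorted(extract_terms(text))
        pair_counts.update(combinations(terms, 2))
    return pair_counts


# Made-up issue titles, used only to exercise the functions above.
sample_issues = [
    "Crash when saving the file",
    "File dialog crash on startup",
    "Slow startup after update",
]

pair_counts = count_cooccurrences(sample_issues)
top_pairs = pair_counts.most_common(3)
```

In this toy input, the pair of terms that co-occurs in the most issues is `("crash", "file")`, which is the kind of signal a developer could use to decide which issues to look at first.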

Sibel Özyer received her BSc and MSc from Cankaya University, and her PhD degree from Atilim University. She is currently an assistant professor at Ankara Medipol University. Her research interests are social networks, computer networks, data mining, internet of things and cloud computing.