The International Arab Journal of Information Technology (IAJIT)



Review of Agile SDLC for Big Data Analytics Systems in the Context of Small Organizations Using Scrum-XP

Software development using agile System Development Life Cycles (SDLCs), such as Scrum and XP, has gained considerable acceptance among small businesses. Agile approaches remove the barriers posed by the organizational, technical, and economic resources that are usually required when rigorous software development approaches are used, whether through heavyweight methodologies (e.g., Rational Unified Process (RUP)) or heavyweight international standards (e.g., ISO/IEC 12207). However, despite their high popularity in small businesses, their use is scarce in the emergent domain of Big Data Analytics Systems (BDAS). Consequently, small businesses interested in deploying BDAS lack systematic academic guidance regarding agile SDLCs for BDAS. This research addresses this gap and reports an updated comparative study of three of the main SDLCs proposed for BDAS in the current BDAS development literature (Cross-Industry Standard Process for Data Mining (CRISP-DM), Microsoft Team Data Science Process (TDSP), and Domino Data Science Lifecycle (DDSL)) against a Scrum and Extreme Programming (Scrum-XP) SDLC. For this aim, a Pro Forma of a generic Scrum-XP SDLC is used to examine the conceptual structure (i.e., roles, phases-activities, and work products) of these SDLCs. Hence, this comparative study provides theoretical and practical insights on agile SDLCs for BDAS that are adequate for small businesses, and calls for further conceptual and empirical research to advance toward an agile SDLC for BDAS that is supported by academia and used in practice.
