The International Arab Journal of Information Technology (IAJIT)



Machine Learning Based Prediction of Complex Bugs in Source Code

During software development and maintenance, fixing severe bugs is usually challenging and demands additional effort, because such bugs must be fixed on a priority basis. Several studies have used software metrics to predict fault-prone software modules. In this paper, we propose an approach that categorizes bugs by severity and priority and then uses these categories to label software metrics data. We then use the labeled data to train supervised machine learning models for predicting fault-prone software modules. Moreover, to build an effective prediction model, we use a genetic algorithm to search for the sets of metrics that are highly correlated with severe bugs.
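The labeling and metric-search steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the severity and priority levels treated as "severe", the fitness function (mean absolute Pearson correlation with a small size penalty), the genetic-algorithm settings, and the toy data are all assumptions made for illustration, and the model-training step is omitted.

```python
import random
from statistics import mean

# Hypothetical levels treated as "severe"; the paper's actual
# categorization rules are not given in this section.
SEVERE_SEVERITIES = {"blocker", "critical", "major"}
HIGH_PRIORITIES = {"P1", "P2"}


def label_module(bugs):
    """Return 1 if any linked bug report is severe or high priority, else 0."""
    return int(any(b["severity"] in SEVERE_SEVERITIES
                   or b["priority"] in HIGH_PRIORITIES for b in bugs))


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def fitness(mask, correlations, size_penalty=0.01):
    """Mean |correlation| of the selected metrics, minus a small size penalty."""
    chosen = [c for bit, c in zip(mask, correlations) if bit]
    if not chosen:
        return -1.0  # an empty metric set is never acceptable
    return mean(chosen) - size_penalty * len(chosen)


def select_metrics(rows, labels, generations=40, pop_size=20, seed=0):
    """Genetic search over bit masks (1 = keep metric); returns the best mask found."""
    rng = random.Random(seed)
    n = len(rows[0])  # number of metrics; this sketch assumes n >= 2
    correlations = [abs(pearson([r[j] for r in rows], labels)) for j in range(n)]
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, correlations), reverse=True)
        parents = population[: pop_size // 2]      # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)                # single bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: fitness(m, correlations))


# Toy data invented for illustration. Each entry pairs a metric vector
# [lines of code, cyclomatic complexity, comment lines] with linked bug reports.
modules = [
    ([520, 31, 12], [{"severity": "critical", "priority": "P1"}]),
    ([480, 27, 40], [{"severity": "major", "priority": "P3"}]),
    ([90, 4, 15], [{"severity": "minor", "priority": "P4"}]),
    ([60, 3, 33], []),
    ([300, 22, 8], [{"severity": "normal", "priority": "P2"}]),
    ([110, 6, 29], [{"severity": "trivial", "priority": "P5"}]),
]
rows = [metrics for metrics, _ in modules]
labels = [label_module(bugs) for _, bugs in modules]
best = select_metrics(rows, labels)
print("labels:", labels, "selected mask:", best)
```

The metrics whose mask bit is 1 would then be the input features for a supervised classifier trained on `labels`. A correlation-based fitness is only one possible choice; a wrapper fitness based on cross-validated classifier performance is a common alternative.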


The International Arab Journal of Information Technology, Vol. 17, No. 1, January 2020

Ishrat-Un-Nisa Uqaili is a final-year M.S. (Computer Science) student at the Faculty of Engineering Science and Technology (FEST), Iqra University (IU), Defence View (Main Campus), Shaheed-e-Millat Road (Ext.), Karachi-75500, Pakistan. She received her B.E. in Computer Systems from Mehran University of Engineering & Technology, Jamshoro, Pakistan. Her research interests include machine learning applications in software engineering and building models for automatic software maintenance. She recently completed her M.S. final-year thesis on Fault Prediction Model for Software using Soft Computing Techniques.

Syed Nadeem Ahsan received his Ph.D. in Computer Science from Graz University of Technology, Austria. He currently does R&D work in software engineering and machine learning applications and is also associated with FEST, IU, Main Campus, Karachi. His research interests include software maintenance and evolution, software testing, formal methods in software engineering, modeling and simulation of complex systems, and computational intelligence.