The International Arab Journal of Information Technology (IAJIT)


An Efficient Intrusion Detection Framework Based on Embedding Feature Selection and Ensemble…

Network security has become a critical global issue affecting enterprises, governments, and individuals. Attackers' strategies continue to evolve, and the rate of attacks targeting network systems has risen dramatically. An Intrusion Detection System (IDS) is one of the most significant defenses against sophisticated cyberattacks. However, improving the accuracy and detection rate of an IDS while minimizing false alarms remains a challenge. This paper proposes a robust and effective intrusion detection framework based on ensemble learning, using eXtreme Gradient Boosting (XGBoost) and an embedded feature selection method. Further, a single best feature subset, shared across all attack types, is extracted using the up-to-date real-world Canadian Institute for Cybersecurity Intrusion Detection (CICIDS2017) dataset. The proposed IDS framework was evaluated on a large test dataset for both multi-class and binary classification. For the abnormal class, it achieves an overall accuracy, precision, detection rate, specificity, F-score, false-negative rate, false-positive rate, error rate, and Area Under the Curve (AUC) score of 99.86%, 99.69%, 99.75%, 99.69%, 99.72%, 0.17%, 0.2%, 0.14%, and 99.72, respectively. The multi-class classification results are also strong on all performance metrics.
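As an illustration of how the binary-classification measurements listed above relate to one another, the following minimal sketch derives them from the four confusion-matrix counts, treating the abnormal (attack) class as positive. The names `ConfusionCounts` and `binary_metrics` and the example counts are hypothetical, introduced only for this sketch; they are not taken from the paper or its results. AUC is omitted because it depends on ranked prediction scores rather than on a single confusion matrix.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container; the abnormal (attack) class is the positive class."""
    tp: int  # attacks correctly flagged as attacks
    fp: int  # benign flows wrongly flagged as attacks
    tn: int  # benign flows correctly passed
    fn: int  # attacks that were missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the standard confusion-matrix metrics as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    detection_rate = _ratio(c.tp, c.tp + c.fn)  # also called recall or sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": precision,
        "detection_rate": detection_rate,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f_score": _ratio(2 * precision * detection_rate, precision + detection_rate),
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "error_rate": _ratio(c.fp + c.fn, total),
    }


# Illustrative counts only (an assumption for this sketch, not the paper's data).
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
for name, value in binary_metrics(example).items():
    print(f"{name}: {value:.4f}")
```

Under these textbook definitions the false-negative rate is the complement of the detection rate, and the error rate is the complement of the accuracy.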

The International Arab Journal of Information Technology, Vol. 19, No. 2, March 2022

Fawaz Mokbal received a B.S. degree in computer science from Thamar University, Yemen, an M.S. degree in Information Technology from the University of Agriculture, Pakistan, and a Ph.D. degree in Computer Science and Technology from Beijing University of Technology, China. Currently, he is a Teacher and Researcher with Fan Gongxiu Honors College, Beijing University of Technology, China. For one and a half years, he served as a Research Associate with the Faculty of Computer Science, ILMA University, Pakistan; for five years, he was the Manager of Information Systems with the Ministry of Local Administration in Yemen; and for two years, he was the Head of the Technical Team of the Information Center Project for the local authority. He is an author and reviewer for various SCI, EI, and Scopus indexed journals. His research interests include machine and deep learning, medical images, brain-computer interface, Web application security, and IoT security issues.

Wang Dan received the B.S. degree in computer application, the M.S. degree in computer software and theory, and the Ph.D. degree in computer software and theory from Northeastern University, China, in 1991, 1996, and 2002, respectively. She is currently a professor and doctoral supervisor in Computer Science and Technology. She has been engaged in teaching for over 20 years. She has presided over several Beijing Municipal Natural Science Foundation projects and research projects commissioned by enterprises. She has published more than 50 papers in journals and conferences and completed two textbooks. She was formerly a visiting scholar at the University of California, Riverside, and the University of Illinois at Urbana-Champaign in the U.S.A. Her major areas of interest include trusted software, web security, and big data.

Musa Osman is a Ph.D. student at Beijing University of Technology (BJUT), China. He received his BSc in computer science at the University of Gazira, Sudan, and MSc in Information System at Osmania University, India. His main research interests are security issues in the Internet of Things mostly based on RPL protocol, Machine.

Yang Ping received the Ph.D. degree in computer science and technology from Beijing University of Technology, in 2020, under the supervision of Professor Dan Wang. She is currently a Lecturer with the School of Economics and Management, Beijing Information Science and Technology University. Her research interests include artificial intelligence in biomedical engineering, intelligent information processing, machine learning, and information security.

Saeed Alsamhi received the B.Eng. degree from the Department of Electronic Engineering (Communication Division), IBB University, Yemen, in 2009, and the M.Tech. degree in communication systems and the Ph.D. degree from the Department of Electronics Engineering, Indian Institute of Technology (Banaras Hindu University), IIT (BHU), Varanasi, India, in 2012 and 2015, respectively. In 2009, he worked as a Lecturer Assistant in the Faculty of Engineering at IBB University. He held a postdoctoral position with the School of Aerospace Engineering, Tsinghua University, Beijing, China, in optimal and smart wireless network research and its applications to enhance robotics technologies. Since 2019, he has been an Assistant Professor. He has published 30 articles in high-reputation journals from IEEE, Elsevier, Springer, Wiley, and MDPI. His areas of interest include green communication, green Internet of Things, QoE, QoS, multi-robot collaboration, blockchain technology, and space technologies (high altitude platform, drone, and tethered balloon technologies). He is currently an MSCA SMART 4.0 Fellow at the Athlone Institute of Technology, Athlone, Ireland.