The International Arab Journal of Information Technology (IAJIT)

Utilizing Artificial Bee Colony Algorithm as Feature Selection Method in Arabic Text Classification

A huge amount of crucial information is contained in documents. The rapid growth in the number of electronic documents available to users makes automated text classification essential. Text classification is the task of assigning documents to predefined categories. Feature Selection (FS) is needed to reduce the dimensionality of high-dimensional data and to retain only the features most pertinent to a particular task. Evolutionary algorithms are among the widely used methods for feature selection in text classification. In this paper, the chi-square filter method and the Artificial Bee Colony (ABC) algorithm were used together as FS methods. The chi-square method is a useful technique for reducing the number of features and removing those that are superfluous or redundant. The ABC algorithm treats the features chosen by the chi-square method as candidate solutions (food sources) and searches for the subset of features that maximizes classification performance. Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers were used as fitness functions for the ABC algorithm. The experimental results demonstrated that the proposed feature selection method was able to decrease the number of features by approximately 89.5% and 94% relative to the original dataset when NB and SVM, respectively, were used as fitness functions, while also enhancing classification performance.
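
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the function names (`chi_square_filter`, `binary_abc`, `loo_accuracy`), the bit-flip neighbourhood, and all parameter values are assumptions made for illustration, and a leave-one-out nearest-neighbour accuracy stands in for the NB and SVM fitness functions used in the paper.

```python
import random

def chi_square(docs, labels, term, cls):
    """Chi-square statistic of the 2x2 term-presence / class-membership table."""
    n = len(docs)
    n11 = sum(1 for doc, y in zip(docs, labels) if term in doc and y == cls)
    n10 = sum(1 for doc, y in zip(docs, labels) if term in doc and y != cls)
    n01 = sum(1 for doc, y in zip(docs, labels) if term not in doc and y == cls)
    n00 = n - n11 - n10 - n01
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / denom if denom else 0.0

def chi_square_filter(docs, labels, k):
    """Stage 1: keep the k terms with the highest max-over-classes score."""
    vocab = sorted(set().union(*docs))
    classes = sorted(set(labels))
    score = {t: max(chi_square(docs, labels, t, c) for c in classes) for t in vocab}
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]

def loo_accuracy(docs, labels, terms, mask):
    """Stand-in fitness (assumption): leave-one-out 1-NN accuracy on kept terms."""
    chosen = {t for t, keep in zip(terms, mask) if keep}
    if not chosen:
        return 0.0
    reduced = [doc & chosen for doc in docs]
    hits = 0
    for i, doc in enumerate(reduced):
        others = [j for j in range(len(docs)) if j != i]
        nearest = max(others, key=lambda j: len(doc & reduced[j]))
        hits += labels[nearest] == labels[i]
    return hits / len(docs)

def binary_abc(n_features, fitness, n_sources=6, limit=3, cycles=20, seed=0):
    """Stage 2: simplified binary ABC; each food source is a keep/drop mask.
    The fitness function is assumed to return non-negative values."""
    rng = random.Random(seed)
    best = {"mask": None, "fit": -1.0}

    def evaluate(mask):
        f = fitness(mask)
        if f > best["fit"]:                      # remember the best mask seen
            best["mask"], best["fit"] = mask[:], f
        return f

    def random_source():
        return [rng.random() < 0.5 for _ in range(n_features)]

    sources = [random_source() for _ in range(n_sources)]
    fits = [evaluate(s) for s in sources]
    trials = [0] * n_sources

    def try_improve(i):
        cand = sources[i][:]
        j = rng.randrange(n_features)
        cand[j] = not cand[j]                    # neighbour: flip one bit
        f = evaluate(cand)
        if f > fits[i]:
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):               # employed bees
            try_improve(i)
        for _ in range(n_sources):               # onlookers: roulette selection
            if sum(fits) > 0:
                i = rng.choices(range(n_sources), weights=fits)[0]
            else:
                i = rng.randrange(n_sources)
            try_improve(i)
        for i in range(n_sources):               # scouts replace stale sources
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i] = evaluate(sources[i])
                trials[i] = 0
    return best["mask"], best["fit"]

# Toy corpus (hypothetical): each document is a set of tokens.
docs = [{"goal", "match", "team"}, {"team", "coach", "goal"},
        {"match", "league", "the"}, {"bank", "stock", "the"},
        {"stock", "market", "bank"}, {"market", "price", "the"}]
labels = ["sport"] * 3 + ["economy"] * 3

terms = chi_square_filter(docs, labels, k=6)
mask, score = binary_abc(len(terms), lambda m: loo_accuracy(docs, labels, terms, m))
selected = [t for t, keep in zip(terms, mask) if keep]
print(terms, selected, score)
```

Replacing `loo_accuracy` with the cross-validated accuracy of an NB or SVM classifier would bring the sketch closer to the setup the abstract describes; the search loop itself would not need to change.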

 
