The International Arab Journal of Information Technology (IAJIT)


Modified Binary Bat Algorithm for Feature Selection in Unsupervised Learning

Feature selection is the process of choosing an optimal subset of features by removing redundant and irrelevant ones. In supervised learning, the feature selection process relies on class labels. In unsupervised learning, feature selection is more difficult because class labels are not available. In this paper, we present a wrapper-based unsupervised feature selection method that combines a modified binary bat algorithm with the k-means clustering algorithm. To ensure diversification in the search space, a mutation operator is introduced into the proposed algorithm. To validate the features selected by our method, we use classification algorithms such as decision tree induction, Support Vector Machine, and the Naïve Bayesian classifier. The results show that the proposed method identifies a minimal number of features with improved accuracy compared with the other methods.
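The wrapper idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fitness function (within-cluster to total sum-of-squares ratio from a small k-means run, plus a per-feature penalty), the V-shaped transfer function, the point at which mutation is applied, and all parameter values (`penalty`, `loudness`, `pulse_rate`, `mut_rate`) are assumptions chosen for illustration, since the abstract does not specify them. One bat is seeded with the full feature set so the result is never worse than using every feature.

```python
import math
import random

def kmeans_ratio(rows, k, iters=10):
    """Within-cluster SSE / total SSE after a small k-means run (lower is tighter)."""
    rng = random.Random(0)  # fixed seed keeps the fitness deterministic
    centers = rng.sample(rows, k)
    dist = lambda r, c: sum((a - b) ** 2 for a, b in zip(r, c))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[min(range(k), key=lambda c: dist(r, centers[c]))].append(r)
        # An empty cluster keeps its previous center.
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[j]
                   for j, g in enumerate(groups)]
    within = sum(min(dist(r, c) for c in centers) for r in rows)
    mean = [sum(col) / len(rows) for col in zip(*rows)]
    total = sum(dist(r, mean) for r in rows)
    return within / total if total > 0 else 1.0

def fitness(mask, data, k, penalty=0.02):
    """Assumed wrapper fitness: clustering tightness plus a cost per kept feature."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty feature subset is never acceptable
    rows = [[row[i] for i in cols] for row in data]
    return kmeans_ratio(rows, k) + penalty * len(cols)

def mutate(mask, rate, rng):
    """Bit-flip mutation: each bit is flipped independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in mask]

def binary_bat_select(data, k, n_bats=8, n_iter=20, loudness=0.9,
                      pulse_rate=0.5, mut_rate=0.1, seed=0):
    rng = random.Random(seed)
    d = len(data[0])
    bats = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_bats)]
    bats[0] = [1] * d  # assumption: seed one bat with all features
    vel = [[0.0] * d for _ in range(n_bats)]
    fit = [fitness(b, data, k) for b in bats]
    best_i = min(range(n_bats), key=fit.__getitem__)
    best, best_fit = bats[best_i][:], fit[best_i]
    for _ in range(n_iter):
        for i in range(n_bats):
            freq = rng.uniform(0.0, 2.0)
            cand = bats[i][:]
            for j in range(d):
                vel[i][j] += (bats[i][j] - best[j]) * freq
                # V-shaped transfer: larger |velocity| means a likelier bit flip.
                flip = abs(2 / math.pi * math.atan(math.pi / 2 * vel[i][j]))
                if rng.random() < flip:
                    cand[j] = 1 - cand[j]
            if rng.random() > pulse_rate:
                cand = mutate(best, mut_rate, rng)  # local move around the best bat
            cand = mutate(cand, mut_rate, rng)      # extra mutation for diversification
            f = fitness(cand, data, k)
            if f < fit[i] and rng.random() < loudness:
                bats[i], fit[i] = cand, f
            if f < best_fit:
                best, best_fit = cand[:], f
    return best, best_fit
```

Calling `binary_bat_select(data, k=2)` on a list of numeric rows returns a 0/1 mask over the columns and its fitness value. In a full evaluation along the lines the abstract describes, the selected columns would then be passed to separate classifiers to check whether accuracy holds up with fewer features.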

The International Arab Journal of Information Technology, Vol. 15, No. 6, November 2018.

Rajalaxmi Ramasamy received her BE (CSE) and ME (CSE) from Bharathiar University in 1990 and 2001, respectively. She completed her Ph.D. at Anna University, India, in 2011. She is working as a Professor in CSE at Kongu Engineering College, India. Her research interests include data mining, nature-inspired computing, and big data analytics.

Sylvia Rani completed her B.E. (CSE) in 2009. She obtained her M.E. (CSE) from Kongu Engineering College, Tamilnadu, India, in 2015. She has published two papers in national and international conferences. Her research interests include nature-inspired computing and parallel processing.