Perception of Natural Scenes: Objects Detection and Segmentations using Saliency Map with AlexNet

Author Muhammad Waqas Ahmed, Abdulwahab Alazeb, Naif Al Mudawi, Touseef Sadiq, Bayan Alabdullah, Hameed ur Rahman, Asaad Algarni, Ahmad Jalal

Keywords #Pattern recognition #AlexNet #fish swarm algorithm #object detection

Abstract Object detection and classification play a crucial role in accurately tracking objects in complex environments. In recent years, researchers' interest in object analysis has grown significantly, driven by the need to address challenges and explore opportunities across diverse technological domains. This study introduces a novel method for image classification built on a custom-designed architecture inspired by AlexNet and tailored to process feature vectors for improved pattern recognition. The methodology uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN) segmentation to partition images into meaningful regions in a computationally efficient way. Saliency mapping then highlights visually significant areas within the segmented images. Several feature extraction methods, namely Maximally Stable Extremal Regions (MSER), Binary Robust Invariant Scalable Keypoints (BRISK), and the wavelet transform, capture distinctive structures within the images. These features are fused and optimized using the Fish Swarm Algorithm (FSA), a nature-inspired optimization technique, and the refined features are fed into a modified AlexNet architecture for classification. Performance is assessed using accuracy, precision, recall, and F1-score. The proposed model achieved a classification accuracy of 95.65% on the VOC 2012 dataset, outperforming contemporary methods by a margin of 2-5%, and 93.66% and 92.71% on the Caltech-101 and Microsoft Common Objects in Context (MS COCO) datasets, respectively. By combining the strengths of FSA and deep learning, the approach yields precise and robust classification outcomes and outperforms many contemporary methods on VOC 2012, Caltech-101, and MS COCO.
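The four evaluation metrics named in the abstract can be illustrated with a short sketch. This is not the authors' evaluation code: the abstract does not say how per-class scores are averaged, so macro averaging is assumed here, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1-score.

    Macro averaging (an unweighted mean over classes) is an assumption;
    the abstract does not state which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Hypothetical toy labels, not taken from VOC 2012, Caltech-101, or MS COCO.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]
scores = macro_metrics(y_true, y_pred)
```

Macro averaging weights every class equally, which matters on datasets with uneven class sizes; a sample-weighted average would generally give different precision, recall, and F1 values for the same predictions.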

Muhammad Waqas Ahmed received his MS degree in Computer Sciences from COMSATS. He is currently pursuing his Ph.D. in computer science at Air University, Islamabad, Pakistan. His research interests include Artificial Intelligence, Computer Vision, Machine Learning Algorithms, Deep Learning, Image and Video Processing, and Intelligent Systems.

Abdulwahab Alazeb is currently an Assistant Professor in the Department of Computer Science and Information System, Najran University. He received the B.S. degree in computer science from King Khalid University, Abha, Saudi Arabia, in 2007, and the M.S. degree in computer science from the Department of Computer Science, University of Colorado Denver, USA, in 2014. He received a PhD degree as well as a Graduate Certificate in cybersecurity from the University of Arkansas, USA, in 2021. His research interests include Cybersecurity, Cloud and Edge Computing Security, Machine Learning, and the Internet of Things.

Naif Al Mudawi is an Assistant Professor in the Department of Computer Science and Information System, Najran University. He received his PhD from the College of Engineering and Informatics at the University of Sussex in Brighton, UK, in 2018. He graduated from La Trobe University, Australia, with a master's degree in computer science in 2011. While studying for his master's degree, he was a member of the Australian Computer Science committee. Dr. Naif has published many research papers in prestigious journals across various disciplines of computer science.

Touseef Sadiq is a PhD researcher at the University of Agder, Norway. His current research focuses on deep multimodal learning for descriptive object identification in urban environments. He obtained his B.E. degree in computer engineering from Bahria University Islamabad, Pakistan, and completed his MS degree in communications and computer networks engineering at the Polytechnic University of Turin, Italy, on a fully funded scholarship. His primary research interests include machine learning, computer vision, deep multimodal learning, and their applications.

Bayan Alabdullah received the Ph.D. degree in informatics from the University of Sussex, Brighton, U.K., in May 2022. She is currently an Assistant Professor with the Department of Information System, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University. She teaches several courses with the Information System Department, such as data governance, system security, and database system. Her research interests include machine learning, data science, privacy, and security.

Hameed ur Rahman is chair of the Department of Computer Games Development at Air University, Islamabad. He holds a Ph.D. in Computer Vision, with expertise in augmented reality, virtual reality, image processing, and related areas. A member of the university since 2018, Dr. Rahman contributes to the AI/Data Science, Cybersecurity, Computer Science, and Gaming Departments, mentoring students and fostering interdisciplinary research. His leadership in the Ignite (Pakistan) Project demonstrates practical applications of his research and his dedication to knowledge dissemination and skill development in emerging fields.

Asaad Algarni is an Assistant Professor at the Department of Computer Sciences in the College of Computing and Information Technology, Northern Borders University, Kingdom of Saudi Arabia. He holds a PhD in Software Engineering from North Dakota State University, USA. His research interests revolve around Software Engineering, Computer Vision applications, and Machine Learning.

Ahmad Jalal is currently an Associate Professor in the Department of Computer Science and Engineering, Air University, Pakistan. He received his Ph.D. degree from the Department of Biomedical Engineering at Kyung Hee University, Republic of Korea. He has also held a postdoctoral research fellowship at POSTECH. His research interests include Multimedia Contents, Artificial Intelligence, and Machine Learning.

URL: https://iajit.org/paper/5216

keywords={Pattern recognition, AlexNet, fish swarm algorithm, object detection},
ISSN={2413-9351},
month={Jan}}
