The International Arab Journal of Information Technology (IAJIT)



Improved YOLOv3-tiny for Silhouette Detection Using Regularisation Techniques

Although recent Deep Learning (DL) algorithms achieve high accuracy in many Computer Vision (CV) tasks, detecting humans in video streams remains a challenging problem. Several studies have therefore focused on regularisation techniques to prevent overfitting, one of the most fundamental issues in Machine Learning (ML). Likewise, this paper examines these techniques and proposes an improved You Only Look Once (YOLO) v3-tiny based on a modified neural network and an adjusted hyperparameter configuration file. The experimental results, validated in two tests, show that the proposed method is more effective than its predecessor, the original YOLOv3-tiny model. The first test, which includes only the data augmentation techniques, indicates that the proposed approach reaches higher accuracy than the original YOLOv3-tiny model: the accuracy rate on the Visual Object Classes (VOC) test dataset increases by 32.54% compared to the initial model. The second test, which combines the three tasks, shows a gain over the existing model: for instance, the accuracy on the labelled crowd_human test dataset rises by 22.7% compared to the data augmentation model.
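As a minimal sketch of what a data augmentation step for a detector can look like, the example below mirrors an image horizontally and moves its bounding boxes to match. It assumes labels in the Darknet/YOLO text format (class, x centre, y centre, width, height, all normalised to [0, 1]). The function name `hflip` and the toy image are hypothetical, and the augmentations actually used in the paper are not specified in this abstract.

```python
from typing import List, Tuple

# A label in Darknet/YOLO text format: (class_id, x_center, y_center, width, height),
# with all coordinates normalised to [0, 1] relative to the image size.
Box = Tuple[int, float, float, float, float]
Image = List[List[int]]  # toy greyscale image: rows of pixel values


def hflip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror an image left-to-right and move its boxes accordingly (hypothetical helper)."""
    flipped_image = [row[::-1] for row in image]
    # Only the x centre changes; y, width and height are unaffected by a mirror.
    flipped_boxes = [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]
    return flipped_image, flipped_boxes


if __name__ == "__main__":
    image = [[0, 0, 9, 9],
             [0, 0, 9, 9]]
    boxes = [(0, 0.75, 0.5, 0.5, 1.0)]  # one box covering the right half
    new_image, new_boxes = hflip(image, boxes)
    print(new_image)
    print(new_boxes)
```

Training on both the original and the mirrored sample exposes the network to more varied silhouettes without new annotation effort, which is one way augmentation can act as a regulariser against overfitting.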
