The International Arab Journal of Information Technology (IAJIT), Vol. 20, No. 6, November 2023



Computational Intelligence Based Point of Interest Detection by Video Surveillance Implementations

Recent advances in the computer vision literature and in Convolutional Neural Networks (CNNs) have opened opportunities that are being actively exploited across many research areas, with autonomous vehicles and mapping systems among the most prominent examples. Point of interest detection is a rising field within autonomous video tracking and automatic mapping systems, and the number of implementations and research papers has grown rapidly in recent years thanks to advances in deep learning. In this paper, we survey existing studies on point of interest detection systems that focus on objects on the road (such as lanes and road markings) or on the roadside (such as road signs, restaurants, or temporary establishments), so that they can serve autonomous vehicles and automatic mapping systems. The roadside point of interest detection problem is also addressed from a transportation-industry perspective. As a proof of the anticipated concept, a deep learning based point of interest detection model for roadside gas station identification is introduced. Instead of relying on an internet connection for point of interest retrieval, the proposed model can operate offline for greater robustness. A variety of models are analysed and compared in terms of detection speed and accuracy. Our preliminary results show that it is possible to develop a model with satisfactory real-time performance that can be embedded into autonomous cars, so that streaming video analysis and point of interest detection may be achievable in practical future implementations.
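The offline, real-time detection loop described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: `detect()` is a hypothetical stub standing in for the forward pass of a trained CNN detector (e.g., a YOLO-family model), and the confidence threshold and frame format are assumed values chosen for the example.

```python
import time

# Illustrative confidence cut-off; the paper's actual threshold is not given here.
CONFIDENCE_THRESHOLD = 0.5

def detect(frame):
    """Stand-in for a CNN forward pass running locally (offline).

    Returns a list of (label, confidence) pairs for the frame.
    """
    return frame["detections"]

def process_stream(frames):
    """Run the detector over a frame stream, keeping confident detections as POIs.

    Also measures throughput, since detection speed is one of the
    comparison criteria discussed in the abstract.
    """
    pois = []
    start = time.perf_counter()
    for frame in frames:
        for label, confidence in detect(frame):
            if confidence >= CONFIDENCE_THRESHOLD:
                pois.append(label)
    elapsed = time.perf_counter() - start
    fps = len(frames) / elapsed if elapsed > 0 else float("inf")
    return pois, fps

# Synthetic two-frame stream: one confident gas-station detection,
# one low-confidence detection that should be filtered out.
frames = [
    {"detections": [("gas_station", 0.91)]},
    {"detections": [("billboard", 0.30)]},
]
pois, fps = process_stream(frames)
print(pois)  # → ['gas_station']
```

Because the loop holds no network dependency, the same structure applies whether frames come from a recorded video file or a live camera feed, which is the robustness argument the abstract makes for offline operation.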

[1] Ahmad T., Ilstrup D., Emami E., and Bebis G., “Symbolic Road Marking Recognition Using Convolutional Neural Networks,” in Proceedings of the IEEE Intelligent Vehicles Symposium, Los Angeles, pp. 1428-1433, 2017. https://doi.org/10.1109/IVS.2017.7995910

[2] Bailo O., Lee S., Rameau F., Yoon J., and Kweon I., “Robust Road Marking Detection and Recognition Using Density-Based Grouping and Machine Learning Techniques,” in Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Santa Rosa, pp. 760-768, 2017. https://doi.org/10.1109/WACV.2017.90

[3] Bhatt D., Patel C., Talsania H., Patel J., Vaghela R., Pandya S., Modi K., and Ghayvat H., “CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope,” Electronics, vol. 10, no. 20, pp. 2470, 2021. https://doi.org/10.3390/electronics10202470

[4] Cai Z. and Vasconcelos N., “Cascade R-CNN: Delving into High Quality Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, pp. 6154-6162, 2018. https://doi.org/10.1109/CVPR.2018.00644

[5] Debie E. and Shafi K., “Implications of the Curse of Dimensionality for Supervised Learning Classifier Systems: Theoretical and Empirical Analyses,” Pattern Analysis and Applications, vol. 22, no. 2, pp. 519-536, 2019. https://doi.org/10.1007/s10044-017-0649-0

[6] Deep Residual Network, http://primo.ai/index.php?title=(Deep)_Residual_Network_(DRN)_-_ResNet, Last Visited, 2021.

[7] Dosovitskiy A., Beyer L., Kolesnikov A., Weissenborn D., Zhai X., and Unterthiner T., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” arXiv preprint arXiv:2010.11929, 2020.

[8] Feng C., Zhong Y., Gao Y., Scott M., and Huang W., “TOOD: Task-Aligned One-Stage Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, pp. 3490-3499, 2021. https://doi.org/10.1109/ICCV48922.2021.00349

[9] Greenhalgh J. and Mirmehdi M., “Automatic Detection and Recognition of Symbols and Text on the Road Surface,” in Proceedings of the International Conference on Pattern Recognition Applications and Methods, Lisbon, pp. 124-140, 2015.

[10] Hao J., Wang G., Seo B., and Zimmermann R., “Point of Interest Detection and Visual Distance Estimation for Sensor-Rich Video,” IEEE Transactions on Multimedia, vol. 16, no. 7, pp. 1929-1941, 2014.

[11] Haykin S., Neural Networks and Learning Machines, Pearson, 2009.

[12] Huang H., Gartner G., Krisp J., Raubal M., and Weghe N., “Location Based Services: Ongoing Evolution and Research Agenda,” Journal of Location Based Services, vol. 12, no. 2, pp. 63-93, 2018. https://doi.org/10.1080/17489725.2018.1508763

[13] ImageNet Challenge, http://www.image-net.org/challenges/LSVRC/, Last Visited, 2021.

[14] Kheyrollahi A. and Breckon T., “Automatic Real-Time Road Marking Recognition Using a Feature Driven Approach,” Machine Vision and Applications, vol. 23, no. 1, pp. 123-133, 2012. https://doi.org/10.1007/s00138-010-0289-5

[15] LeCun Y., Bottou L., Bengio Y., and Haffner P., “Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998. https://doi.org/10.1109/5.726791

[16] Lee Y., Lee J., Hong Y., Ko Y., and Jeon M., “Unconstrained Road Marking Recognition with Generative Adversarial Networks,” in Proceedings of the IEEE Intelligent Vehicles Symposium, Paris, pp. 1414-1419, 2019. https://doi.org/10.48550/arXiv.1910.04326

[17] Li H., Feng M., and Wang X., “Inverse Perspective Mapping Based Urban Road Markings Detection,” in Proceedings of the IEEE 2nd International Conference on Cloud Computing and Intelligence Systems, Hangzhou, pp. 1178-1182, 2012. https://doi.org/10.1109/CCIS.2012.6664569

[18] Liang J., Homayounfar N., Ma W., Xiong Y., Hu R., and Urtasun R., “PolyTransform: Deep Polygon Transformer for Instance Segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, pp. 9131-9140, 2020. https://doi.org/10.48550/arXiv.1912.02801

[19] Lin T., Goyal P., Girshick R., He K., and Dollár P., “Focal Loss for Dense Object Detection,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2980-2988, 2017. https://doi.org/10.48550/arXiv.1708.02002

[20] Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C., and Berg A., “SSD: Single Shot Multibox Detector,” in Proceedings of the European Conference on Computer Vision, Amsterdam, pp. 21-37, 2016.

[21] Liu Z., Lin Y., Cao Y., Hu H., Wei Y., Zhang Z., and Guo B., “Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012-10022, 2021. https://doi.org/10.48550/arXiv.2103.14030

[22] Pascanu R., Mikolov T., and Bengio Y., “On the Difficulty of Training Recurrent Neural Networks,” in Proceedings of the 30th International Conference on Machine Learning, Atlanta, pp. 1310-1318, 2013.

[23] Pratt L., “Discriminability-Based Transfer Between Neural Networks,” in Proceedings of the 5th International Conference on Neural Information Processing Systems, San Francisco, pp. 204-211, 1992.

[24] Ramakrishnan D. and Radhakrishnan K., “Applying Deep Convolutional Neural Network (DCNN) Algorithm in the Cloud Autonomous Vehicles Traffic Model,” The International Arab Journal of Information Technology, vol. 19, no. 2, pp. 186-194, 2022. https://doi.org/10.34028/iajit/19/2/5

[25] Redmon J., Divvala S., Girshick R., and Farhadi A., “You Only Look Once: Unified, Real-Time Object Detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, pp. 779-788, 2016. https://doi.org/10.1109/CVPR.2016.91

[26] Redmon J. and Farhadi A., “YOLOv3: An Incremental Improvement,” arXiv preprint arXiv:1804.02767, 2018. https://doi.org/10.48550/arXiv.1804.02767

[27] Renwei T., Zhongjie Z., Yongqiang B., Ming G., and Zhifeng G., “Key Parts of Transmission Line Detection Using Improved YOLO V3,” The International Arab Journal of Information Technology, vol. 18, no. 6, pp. 747-754, 2021. https://doi.org/10.34028/iajit/18/6/1

[28] ResNet50, https://keras.io/api/applications/resnet/#resnet50-function, Last Visited, 2022.

[29] Rohella A. and Singh S., “Path Independent Real Time Points of Interest Detection in Road Networks,” in Proceedings of the 2nd International Conference on Contemporary Computing and Informatics, Greater Noida, pp. 633-638, 2016. https://doi.org/10.1109/IC3I.2016.7918040

[30] Ruta M., Scioscia F., Filippis D., Ieva S., Binetti M., and Di Sciascio E., “A Semantic-Enhanced Augmented Reality Tool for OpenStreetMap POI Discovery,” Transportation Research Procedia, vol. 3, pp. 479-488, 2014. https://doi.org/10.1016/j.trpro.2014.10.029

[31] Sankaran S., “Pattern Matching Based Vehicle Density Estimation Technique for Traffic Monitoring Systems,” The International Arab Journal of Information Technology, vol. 19, no. 4, pp. 575-581, 2022. https://doi.org/10.34028/iajit/19/4/1

[32] Sarker I., “Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions,” SN Computer Science, vol. 2, no. 6, pp. 420, 2021. https://doi.org/10.1007/s42979-021-00815-1

[33] Shu Z., Xin S., Xu X., Liu L., and Kavan L., “Detecting 3D Points of Interest Using Multiple Features and Stacked Auto-Encoder,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 8, pp. 2583-2596, 2019.

[34] Sreenu G. and Saleem M., “Intelligent Video Surveillance: A Review Through Deep Learning Techniques for Crowd Analysis,” Journal of Big Data, vol. 6, no. 1, pp. 48, 2019. https://doi.org/10.1186/s40537-019-0212-5

[35] Tian Z., Shen C., Chen H., and He T., “FCOS: Fully Convolutional One-Stage Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9626-9635, 2019. https://doi.org/10.48550/arXiv.1904.01355

[36] Tsakanikas V. and Dagiuklas T., “Video Surveillance Systems-Current Status and Future Trends,” Computers and Electrical Engineering, vol. 70, pp. 736-753, 2018. https://doi.org/10.1016/j.compeleceng.2017.11.011

[37] Venkatesan R. and Li B., Convolutional Neural Networks in Visual Computing: A Concise Guide, CRC Press, 2017.

[38] Vosooghi R., Puchinger J., Jankovic M., and Vouillon A., “Shared Autonomous Vehicle Simulation and Service Design,” Transportation Research Part C: Emerging Technologies, vol. 107, pp. 15-33, 2019. https://doi.org/10.1016/j.trc.2019.08.006

[39] Wolpert D. and Macready G., “Coevolutionary Free Lunches,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 6, pp. 721-735, 2005. https://doi.org/10.1109/TEVC.2005.856205

[40] Wu S., Li X., and Wang X., “IoU-Aware Single-Stage Object Detector for Accurate Localization,” Image and Vision Computing, vol. 97, pp. 103911, 2020. https://doi.org/10.48550/arXiv.1912.05992

[41] Wu T. and Ranganathan A., “A Practical System for Road Marking Detection and Recognition,” in Proceedings of the IEEE Intelligent Vehicles Symposium, Madrid, pp. 25-30, 2012. https://doi.org/10.1109/IVS.2012.6232144

[42] Yang W. and Ai T., “POI Information Enhancement Using Crowdsourcing Vehicle Trace Data and Social Media Data: A Case Study of Gas Station,” ISPRS International Journal of Geo-Information, vol. 7, no. 5, pp. 178, 2018. https://doi.org/10.3390/ijgi7050178

[43] YOLOv5, https://github.com/ultralytics/yolov5, Last Visited, 2022.

[44] Yu Q., Jiang H., Liu C., and Wu M., “The Application of Data Mining in Multi-Supplier Points of Interest Processing,” in Proceedings of the 9th International Conference on Natural Computation, Shenyang, pp. 984-989, 2013. https://doi.org/10.1109/ICNC.2013.6818119

[45] Zhang H., Li F., Liu S., Zhang L., Su H., Zhu J., and Shum H., “DINO: DETR with Improved Denoising Anchor Boxes for End-to-End Object Detection,” arXiv preprint arXiv:2203.03605, 2022.

[46] Zhang H., Wang Y., Dayoub F., and Sunderhauf N., “VarifocalNet: An IoU-Aware Dense Object Detector,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, pp. 8514-8523, 2021. https://doi.org/10.48550/arXiv.2008.13367

[47] Zheng Z., Ye R., Wang P., Ren D., Zuo W., Hou Q., and Cheng M. M., “Localization Distillation for Dense Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9407-9416, 2022. https://doi.org/10.48550/arXiv.2102.12252

[48] Zhou S., Bi Y., Wei X., Liu J., Ye Z., Li F., and Du Y., “Automated Detection and Classification of Spilled Loads on Freeways Based on Improved YOLO Network,” Machine Vision and Applications, vol. 32, no. 2, pp. 44, 2021. https://doi.org/10.1007/s00138-021-01171-z

[49] Zhu X., Su W., Lu L., Li B., Wang X., and Dai J., “Deformable DETR: Deformable Transformers for End-to-End Object Detection,” arXiv preprint arXiv:2010.04159, 2020. https://doi.org/10.48550/arXiv.2010.04159