Received date April 29, 2024

Accepted date October 14, 2024

Application of Decomposition Expression in Digital Video Object Segmentation

Author Jianfu Kong

Keywords #Video object segmentation #unsupervised #deep learning #decomposing expression #feature #bottleneck operator #foreground segmentation

Abstract

In practical Video Object Segmentation (VOS), traditional techniques adapt poorly to varied scenes and produce unsatisfactory segmentation results. To address these problems, an Unsupervised Video Object Segmentation (UVOS) technique based on convolutional networks is proposed. First, decomposing expressions are used to handle the spatiotemporal relationship between the reference frame and the target frame, and video object reconstruction is achieved through similarity calculation. For target segmentation in motion scenes, a Single Linear Bottleneck Operator (SLBO) is introduced for feature extraction, and pooling compensation is used to reduce the loss of feature information. For general scene segmentation, a spatiotemporal similarity segmentation technique is introduced to segment target video in complex scenes. In the foreground segmentation test on motion scenes, the Change Detection Benchmark Dataset 2014 (CDNet.2014SM) was used to test the model's loss in different scenarios. In adverse-weather training, the proposed model tended to converge after 40 iterations with a loss value of 0.276, which is better than that of the Foreground image Segmentation (FgSegNet_), Convolutional Networks for Biomedical Image Segmentation (MU Net), and Cascade Convolutional Neural Network (Cascade CNN) models. In the accuracy test, the proposed real-time Foreground Segmentation network based on single Linear Bottleneck and Pooling Compensation (FS-LBPC) tended to converge after 50 iterations with a precision (P) of 0.963, the best among the four segmentation models compared (FgSegNet_, MU Net, Cascade CNN, and FS-LBPC). For video scene segmentation, the Densely Annotated VIdeo Segmentation (DAVIS16) dataset was used; the model performed best on the horse racing and animal flight scenes, with segmentation accuracies of 0.976 and 0.965, respectively.
In summary, the proposed VOS technology performs well in practical scenarios and provides a useful technical reference for improving image and video processing and segmentation.
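
The abstract does not spell out how the similarity calculation or the precision measure is computed. The following is a minimal NumPy sketch of one common way to do both, not the paper's implementation: it assumes cosine similarity with a softmax over reference pixels for reconstruction, and the standard definition P = TP / (TP + FP) for precision. The function names, the `temperature` parameter, and the flattened feature layout are all illustrative assumptions.

```python
import numpy as np


def reconstruct_from_reference(ref_feat, tgt_feat, ref_values, temperature=0.07):
    """Rebuild per-pixel values of a target frame from a reference frame.

    Hypothetical helper (not from the paper).
    ref_feat:   (N_ref, C) feature vectors of reference-frame pixels
    tgt_feat:   (N_tgt, C) feature vectors of target-frame pixels
    ref_values: (N_ref, D) values to propagate (e.g. colours or mask labels)
    """
    ref_feat = np.asarray(ref_feat, dtype=float)
    tgt_feat = np.asarray(tgt_feat, dtype=float)
    ref_values = np.asarray(ref_values, dtype=float)

    # L2-normalise so the dot product below is a cosine similarity.
    ref = ref_feat / (np.linalg.norm(ref_feat, axis=1, keepdims=True) + 1e-8)
    tgt = tgt_feat / (np.linalg.norm(tgt_feat, axis=1, keepdims=True) + 1e-8)

    # Similarity of every target pixel to every reference pixel.
    sim = tgt @ ref.T / temperature              # shape (N_tgt, N_ref)
    sim = sim - sim.max(axis=1, keepdims=True)   # stabilise the softmax

    # Softmax over reference pixels gives one weight row per target pixel.
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each target pixel is a similarity-weighted mix of reference values.
    return weights @ ref_values


def precision(pred_mask, true_mask):
    """Precision of a binary foreground mask: TP / (TP + FP)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.logical_and(pred, true).sum()
    fp = np.logical_and(pred, ~true).sum()
    return float(tp / (tp + fp)) if (tp + fp) > 0 else 0.0
```

With a low temperature the softmax approaches a hard nearest-neighbour match, so each target pixel copies the value of its most similar reference pixel; higher temperatures blend several reference pixels.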

The International Arab Journal of Information Technology, Vol. 22, No. 1, January 2025

URL: https://iajit.org/paper/5150

ISSN: 2413-9351