UTILIZATION OF TEMPORAL DIMENSION IN SATELLITE IMAGERY: BETTER SEMANTIC SEGMENTATION WITH LOW DATA RESOURCES
Volume 7, Article e2025.02, 2025, Pages 1-12
Mirakram Aghalarov
Baku Higher Oil School, Baku, Azerbaijan
Abstract
Time series image processing, a subfield of computer vision, improves the accuracy of downstream applications by leveraging temporal context. While this advantage is commonly exploited in video-based tasks, satellite imagery can also be treated as time series data once geospatial coordinates and timestamps are taken into account. Semantic segmentation, a key task in remote sensing, can benefit significantly from this temporal information; however, acquiring high-quality labeled datasets for such tasks remains a major challenge. In this study, we propose a novel temporal-aware domain adaptation framework for semantic segmentation, specifically targeting the detection of oil spills in the Caspian Sea. Our approach integrates time series information to improve cross-domain generalization. We evaluate the method on the synthetic SynthOil dataset and on a custom-labeled real-world dataset provided by Azercosmos and ArcGIS. Furthermore, we enhance the backbone of the SegFormer model using a super-resolution dataset curated from Azercosmos imagery and open data from the Esri ArcGIS platform. Experimental results demonstrate the effectiveness of our approach in improving segmentation performance across domains.
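To illustrate the core idea of treating co-registered satellite tiles as a time series, the following minimal NumPy sketch appends per-pixel temporal statistics (mean and standard deviation across acquisitions) as extra input channels for a segmentation model. This is a hypothetical simplification for illustration only, not the framework proposed in the paper; the function name and feature choice are assumptions.

```python
import numpy as np

def add_temporal_features(frames: np.ndarray) -> np.ndarray:
    """Append per-pixel temporal mean and std as extra channels.

    frames: array of shape (T, H, W, C) -- a co-registered time series
    of satellite tiles sharing the same geospatial extent and band order.
    Returns the latest frame augmented with 2*C temporal-context
    channels, i.e. shape (H, W, 3*C).
    """
    mean = frames.mean(axis=0)   # (H, W, C): temporal average per pixel
    std = frames.std(axis=0)     # (H, W, C): temporal variability per pixel
    latest = frames[-1]          # most recent acquisition
    return np.concatenate([latest, mean, std], axis=-1)

# Toy example: 4 timestamps of a 2x2 tile with 3 spectral bands.
stack = np.random.rand(4, 2, 2, 3).astype(np.float32)
features = add_temporal_features(stack)
print(features.shape)  # (2, 2, 9)
```

A segmentation backbone would then simply consume 3*C input channels instead of C; transient phenomena such as oil slicks stand out as pixels whose latest value deviates strongly from their temporal mean.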
Keywords:
Semantic Segmentation, Satellite Imagery, Deep Learning, Computer Vision, Temporal Dimension, Spatio-temporal Processing.
DOI: https://doi.org/10.32010/26166127.2025.02
