Self-supervised spatio-temporal representation learning of Satellite Image Time Series

Iris Dumeur; Silvia Valero; Jordi Inglada

Preprint, Working Paper. Year: 2023

Iris Dumeur

Role: Author

Centre d'études spatiales de la biosphère

Silvia Valero

Role: Author

Centre d'études spatiales de la biosphère

Jordi Inglada

Role: Author
PersonId : 893095
ORCID : 0000-0001-6896-0049
IdRef : 177262710

Centre d'études spatiales de la biosphère

Centre National d'Études Spatiales [Toulouse]

Abstract

This paper presents a new self-supervised strategy for learning meaningful representations of complex optical Satellite Image Time Series (SITS). The proposed method, named U-BARN (Unet-BERT spAtio-temporal Representation eNcoder), exploits irregularly sampled SITS. The architecture is designed to learn rich and discriminative features from unlabeled data while enhancing the synergy between the spatio-spectral and temporal dimensions. To train on unlabeled data, a time series reconstruction pretext task inspired by the BERT strategy is proposed. A large-scale unlabeled Sentinel-2 dataset is used to pre-train U-BARN. To demonstrate its feature learning capability, the representations of SITS encoded by U-BARN are then fed into a shallow classifier to generate semantic segmentation maps. Experiments are conducted on a labeled dataset (PASTIS). Two ways of exploiting U-BARN pre-training are considered: either the U-BARN weights are frozen (U-BARN FR) or they are fine-tuned (U-BARN FT). The results show that the representations of SITS given by U-BARN FR are more efficient for land cover classification than those given by a linear layer trained with supervision. In scenarios where reference data are scarce, fine-tuning brings a significant performance gain over fully supervised approaches. We also investigate how the percentage of elements masked during pre-training influences the quality of the SITS representation. Finally, semantic segmentation results show that the fully supervised U-BARN architecture performs slightly better than the spatio-temporal baseline (U-TAE).
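The reconstruction pretext task mentioned in the abstract can be illustrated with a small sketch. The abstract only states that the task is inspired by BERT and that a percentage of elements is masked, so everything below is an assumption made for illustration: masking whole acquisition dates, replacing them with a constant `mask_value`, and scoring the reconstruction with a mean squared error restricted to the masked dates. The helper names `mask_dates` and `masked_mse` are hypothetical, and a single-pixel series stands in for a full image time series.

```python
import random


def mask_dates(series, mask_ratio, rng, mask_value=0.0):
    """Replace a random subset of dates with a constant vector (assumed scheme).

    series: list of T dates, each a list of C band values for one pixel.
    Returns a corrupted copy and the sorted indices of the masked dates.
    """
    n_dates = len(series)
    n_masked = max(1, round(mask_ratio * n_dates))
    masked_idx = sorted(rng.sample(range(n_dates), n_masked))
    corrupted = [list(bands) for bands in series]  # leave the input untouched
    for t in masked_idx:
        corrupted[t] = [mask_value] * len(series[t])
    return corrupted, masked_idx


def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed only on the masked dates."""
    total, count = 0.0, 0
    for t in masked_idx:
        for p, y in zip(prediction[t], target[t]):
            total += (p - y) ** 2
            count += 1
    return total / count


rng = random.Random(0)
# Toy series: 10 acquisition dates, 4 spectral bands, one pixel.
series = [[rng.random() for _ in range(4)] for _ in range(10)]
corrupted, masked_idx = mask_dates(series, mask_ratio=0.3, rng=rng)

# Placeholder "model": returns its corrupted input unchanged.
# A real encoder would try to recover the original values at masked dates.
prediction = corrupted
loss = masked_mse(prediction, series, masked_idx)
```

In this sketch, `mask_ratio` plays the role of the masking percentage whose influence the abstract says is studied; restricting the loss to masked dates is a common BERT-style choice, not something the abstract confirms.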

Keywords

Satellite Image Time Series; SITS; Transformer; Self-supervised learning; Spatio-Temporal Network; Unet; Representation Learning

Domains

Artificial Intelligence [cs.AI]; Continental interfaces, environment; Methodology [stat.ME]; Machine Learning [stat.ML]

Main file

article_dumeur_valero_inglada.pdf (33.68 MB)

Origin: Files produced by the author(s)
License: CC BY-ND (Attribution, No Derivatives)

https://hal.science/hal-04084839

Submitted on: Wednesday, May 3, 2023, 18:59:32

Last modified on: Monday, November 20, 2023, 11:44:18

Dates and versions

hal-04084839 , version 1 (28-04-2023)

hal-04084839 , version 2 (03-05-2023)

hal-04084839 , version 3 (13-07-2023)

hal-04084839 , version 4 (02-10-2023)

Identifiers

HAL Id: hal-04084839, version 2

Cite

Iris Dumeur, Silvia Valero, Jordi Inglada. Self-supervised spatio-temporal representation learning of Satellite Image Time Series. 2023. ⟨hal-04084839v2⟩
