|
Dataset Persistent ID
|
doi:10.26165/JUELICH-DATA/2YP2PZ |
|
Publication Date
|
2026-05-07 |
|
Title
|
Biomazon: A Multimodal Dataset for 3D Forest Structure and Biomass Modeling in the Amazon Basin
|
|
Author
|
Mandal, Sayan (Jülich Supercomputing Centre) - ORCID: https://orcid.org/0000-0003-3637-6029
Sedona, Rocco (Jülich Supercomputing Centre) - ORCID: https://orcid.org/0000-0003-4089-972X
Besnard, Simon (GFZ Helmholtz Centre for Geosciences) - ORCID: https://orcid.org/0000-0002-1137-103X
Urbazaev, Mikhail (GFZ Helmholtz Centre for Geosciences) - ORCID: https://orcid.org/0000-0002-0327-6278
Zandi, Ehsan (Jülich Supercomputing Centre) - ORCID: https://orcid.org/0000-0003-0135-257X
Cavallaro, Gabriele (Jülich Supercomputing Centre) - ORCID: https://orcid.org/0000-0002-3239-9904
|
|
Contact
|
Mandal, Sayan (Jülich Supercomputing Centre)
Sedona, Rocco (Jülich Supercomputing Centre)
|
|
Description
|
Biomazon is a large-scale, multimodal remote sensing benchmark dataset covering the Amazon basin, designed for wall-to-wall estimation of forest vertical structure and aboveground biomass from dense multi-sensor Earth Observation data. The dataset fuses six complementary input modalities -- Sentinel-2 L2A optical surface reflectance, ALOS-2 PALSAR-2 L-band SAR, Sentinel-1 C-band SAR (each in ascending and descending orbits), Dynamic World V1 land-use/land-cover, Copernicus GLO-30 DEM elevation, and AlphaEarth Embeddings (AEX) foundation-model features -- totalling 88 input bands. The prediction targets are GEDI spaceborne LiDAR Relative Height profiles (RH0--RH100 in meters; 101 percentile bands capturing the full canopy height distribution) and GEDI Aboveground Biomass Density (AGBD, in Mg/ha), together comprising 102 label bands at 20 m resolution. All data span the period April 2019 to March 2023. Biomazon is intended for developing and benchmarking models that predict these 3D forest structure and biomass targets from the combined optical, SAR, elevation, land-cover, and foundation-model embedding inputs.
The dataset is tiled on the Harmonized Landsat-Sentinel (HLS) MGRS tiling grid and delivered as patches of 256 x 256 pixels at 20 m resolution (5.12 km x 5.12 km) with 50% overlap between neighboring patches, stored in HDF5 files (one per split: train / val / test). (2026-04-27)
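The patch geometry and label layout described above can be checked with a short calculation. The following is a minimal sketch in Python, not part of the dataset tooling: the constant and function names are hypothetical, the band names and their ordering (RH bands first, AGBD last) are an assumption for illustration, and the stride assumes that "50% overlap" means neighboring patches are offset by half a patch.

```python
# Values stated in the dataset description.
PIXEL_SIZE_M = 20        # ground sampling distance in meters
PATCH_SIZE_PX = 256      # patch width and height in pixels
OVERLAP = 0.5            # fractional overlap between neighboring patches
N_INPUT_BANDS = 88       # total input bands across the six modalities

# Ground extent of one patch side: 256 px * 20 m = 5120 m = 5.12 km.
patch_extent_km = PATCH_SIZE_PX * PIXEL_SIZE_M / 1000

# Assumption: 50% overlap corresponds to a stride of half a patch.
stride_px = int(PATCH_SIZE_PX * (1 - OVERLAP))
stride_km = stride_px * PIXEL_SIZE_M / 1000

# Label bands: RH0..RH100 (101 percentiles) plus AGBD = 102 bands.
# Assumption: names and ordering are illustrative, not taken from the files.
rh_bands = [f"RH{p}" for p in range(101)]
label_bands = rh_bands + ["AGBD"]


def patch_origins(tile_size_px: int) -> list[int]:
    """Pixel offsets of patch origins along one axis of a tile.

    Hypothetical helper: only patches that fit entirely inside the
    tile are returned.
    """
    return list(range(0, tile_size_px - PATCH_SIZE_PX + 1, stride_px))


if __name__ == "__main__":
    print(f"patch extent: {patch_extent_km} km per side")
    print(f"stride: {stride_px} px ({stride_km} km)")
    print(f"label bands: {len(label_bands)}, input bands: {N_INPUT_BANDS}")
    print(f"origins in a 512 px strip: {patch_origins(512)}")
```

Under these assumptions, a pixel in the interior of a tile is covered by up to four patches, which matters when aggregating per-patch predictions into a wall-to-wall map or when computing area-weighted metrics.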
|
|
Subject
|
Earth and Environmental Sciences
|
|
Keyword
|
benchmark dataset
GEDI
full RH profile
relative height
canopy height
aboveground biomass
multimodal
Amazon basin
Sentinel-2
Sentinel-1
ALOS-2 PALSAR-2
Copernicus GLO30 DEM
deep learning
joint modeling
|
|
Related Publication
|
Biomazon: A Multimodal Dataset for 3D Forest Structure and Biomass Modeling in the Amazon Basin
To be submitted to IEEE JSTARS Special Issue "AI-Driven Multimodal Remote Sensing for Forestry Monitoring and Management"
|
|
Grant Information
|
3D-ABC, Helmholtz Foundation Model Initiative
GCS/NIC: 3d-abc: 61954
|
|
Depositor
|
Mandal, Sayan
|
|
Deposit Date
|
2026-04-27
|
|
Time Period Covered
|
Start: 2019-04-01 ; End: 2023-03-30
|
|
Software
|
Python, Version: 3.11.9
earthengine-api, Version: 1.6.3
gediDB, Version: 2025.4.25
|
|
Data Sources
|
(A) Sentinel-2 L2A SR
- Contains modified Copernicus Sentinel data 2019-2023.
- European Space Agency, “Copernicus Sentinel-2 (processed by ESA), 2021, MSI Level-2A BOA Reflectance Product. Collection 1,” 2021. [Online]. Available: https://doi.org/10.5270/S2_-znk9xsj
- Google Earth Engine Data Catalog, “Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A (SR),” https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED, [Accessed 20-10-2025].
(B) Cloud Score+
- V. J. Pasquarella, C. F. Brown, W. Czerwinski, and W. J. Rucklidge, “Comprehensive quality assessment of optical satellite imagery using weakly supervised video learning,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023, pp. 2125–2135.
- Google Earth Engine Data Catalog, “Cloud Score+ S2 HARMONIZED V1,” https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_CLOUD_SCORE_PLUS_V1_S2_HARMONIZED, [Accessed 20-10-2025].
(C) Sentinel-1
- Contains modified Copernicus Sentinel data 2019-2023.
- Google Earth Engine Data Catalog, “Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, log scaling,” https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD, [Accessed 07-10-2025].
(D) ALOS-2 PALSAR-2 ScanSAR Level 2.2
- These data have been provided by the Earth Observation Research Center (EORC) of the Japan Aerospace Exploration Agency (JAXA) via Google Earth Engine.
- Google Earth Engine Data Catalog, “PALSAR-2 ScanSAR Level 2.2,” https://developers.google.com/earth-engine/datasets/catalog/JAXA_ALOS_PALSAR-2_Level2_2_ScanSAR, [Accessed 06-10-2025].
(E) Copernicus GLO-30 DEM
- © DLR e.V. 2010-2014 and © Airbus Defence and Space GmbH 2014-2018 provided under COPERNICUS by the European Union and ESA; all rights reserved.
- Google Earth Engine Data Catalog, “Copernicus DEM GLO-30: Global 30m Digital Elevation Model,” https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_DEM_GLO30, [Accessed 06-10-2025].
(F) Dynamic World V1 LULC
- C. F. Brown, S. P. Brumby, B. Guzder-Williams, T. Birch, S. B. Hyde, J. Mazzariello, W. Czerwinski, V. J. Pasquarella, R. Haertel, S. Ilyushchenko et al., “Dynamic World, near real-time global 10 m land use land cover mapping,” Scientific Data, vol. 9, no. 1, p. 251, 2022.
- Google Earth Engine Data Catalog, “Dynamic World V1,” https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_DYNAMICWORLD_V1, [Accessed 04-10-2025].
(G) AlphaEarth Embedding
- The AlphaEarth Foundations Satellite Embedding dataset is produced by Google and Google DeepMind.
- C. F. Brown, M. R. Kazmierski, V. J. Pasquarella, W. J. Rucklidge, M. Samsikova, C. Zhang, E. Shelhamer, E. Lahera, O. Wiles, S. Ilyushchenko, N. Gorelick, L. L. Zhang, S. Alj, E. Schechter, S. Askay, O. Guinan, R. Moore, A. Boukouvalas, and P. Kohli, “AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data,” 2025. [Online]. Available: https://arxiv.org/abs/2507.22291
- Google Earth Engine Data Catalog, “Satellite Embedding V1,” https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL, [Accessed 17-09-2025].
(H) GEDI L2A Relative Height and L4A Above Ground Biomass Density
- Besnard et al., (2025). gediDB: A toolbox for processing and providing Global Ecosystem Dynamics Investigation (GEDI) L2A-B and L4A-C data. Journal of Open Source Software, 10(113), 8593, https://doi.org/10.21105/joss.08593
|