TransWikia.com

Large-scale cloud masking, grouping and mosaic in Descartes Labs platform

Geographic Information Systems Asked on August 26, 2021

I am looking to create a stack of cloud-free sentinel-2 imagery by day for a large portion of California for all of 2020 in the Descartes Lab platform. I realize that this is a very large amount of imagery and will require significant memory to store and so I’m looking for methods to mosaic imagery by day. I’m new to Descartes and I think using the groupby and mosaic methods could work but not sure how to implement the groupby function on a masked image collection.

First, I establish a small area of interest in Northern California and dates from January to March of 2020 (I am using a small area and small time range to test out the workflow):

import descarteslabs as dl
import numpy as np

# Define a bounding box in N CA, in a GeoJSON
aoi = {
    "type": "Polygon",
    "coordinates": [
        [
            [-121.8430, 38.9676], 
            [-121.6991, 38.9676], 
            [-121.6991, 38.8590], 
            [-121.8430, 38.8590], 
        ]
    ],
}

start_datetime = "2019-01-01"
end_datetime = "2019-03-30"

Next, I create a Sentinel-2 scene stack with desired bands:

sentinel_scenes, sentinel_ctx = dl.scenes.search(
    aoi,
    products="sentinel-2:L1C",
    start_datetime = start_datetime,
    end_datetime = end_datetime,
#    cloud_fraction=0.7,
    limit = 500,
)

sentinel_all_stack = sentinel_scenes.stack('red green blue red-edge nir swir1 derived:ndvi', 
                                           sentinel_ctx, 
                                           processing_level='surface')

Then I create a Descartes Labs Sentinel-2 cloudmask stack for masking:

dlcloud_scenes, dlcloud_ctx = dl.scenes.search(
    aoi,
    products=["sentinel-2:L1C:dlcloud:v1"],
    start_datetime=start_datetime, end_datetime=end_datetime,
    limit=None
)


dlcloud_valid_cloudfree_stack = dlcloud_scenes.stack('valid_cloudfree', 
                                                     sentinel_ctx , 
                                                     data_type='Byte')

Next, I mask the Sentinel-2 collection with the Descartes Labs cloudmask:

dlcloud_valid_cloudfree_stack_rep = np.repeat(a=dlcloud_valid_cloudfree_stack,
                                          repeats=7,
                                          axis=1)

# Mask the pixel values where valid_cloud_free is 0
cloudfree_stack = np.ma.masked_where(dlcloud_valid_cloudfree_stack_rep==0, sentinel_all_stack)

From here, I would like to group the imagery by day. I tried:

for (year, month, day), day_scenes in imagery.groupby(
    "properties.date.year", "properties.date.month", "properties.date.day"):
    print("{}: {} scenes".format(month, np.array(day_scenes.stack("blue green red red-edge nir swir1 derived:ndvi", ctx)).shape))

And I get the error: ‘MaskedArray’ object has no attribute ‘groupby’

I would like to mosaic all images grouped by same day afterwards. All ideas are very welcome!

One Answer

I believe the issue here may be that you are mixing the Workflows API (which has a groupby method) and the Scenes API (which does not). Full disclosure, I am a Descartes Labs employee.

Workflows is a bit different than Scenes in that is uses lazy proxy objects to define a series of processing steps, but does not actually do any of the data processing until the .compute(aoi) method is called for an aoi. More details are available here.

Here is an example showing how to create the image stack you are interested in using Workflows:

First we create a geocontext from the geojson of your aoi, this is used later when we run the workflow using .compute().

import descarteslabs as dl
import descarteslabs.workflows as wf
import numpy as np

# Define a bounding box in N CA, in a GeoJSON
aoi_geo = {
    "type": "Polygon",
    "coordinates": [
        [
            [-121.8430, 38.9676], 
            [-121.6991, 38.9676], 
            [-121.6991, 38.8590], 
            [-121.8430, 38.8590], 
        ]
    ],
}

aoi = wf.GeoContext(
    geometry=aoi_geo,
    resolution=10.0,
    crs='EPSG:3857',
    bounds_crs='EPSG:4326')

start_datetime = "2019-01-01"
end_datetime = "2019-02-28"

Here we define the Sentinel 2 imagery stack and cloud mask, then mask out cloudy pixels in each image.

s2_stack = (wf.ImageCollection.from_id('sentinel-2:L1C', 
                               start_datetime=start_datetime, 
                               end_datetime=end_datetime)
            .pick_bands('red green blue red-edge nir swir1 derived:ndvi')
           )

s2_cloud_mask = (wf.ImageCollection.from_id('sentinel-2:L1C:dlcloud:v1', 
                               start_datetime=start_datetime, 
                               end_datetime=end_datetime)
                .pick_bands('valid_cloudfree')
                )

s2_stack = s2_stack.concat_bands(s2_cloud_mask)

s2_masked = s2_stack.map(lambda img: img.mask(img.pick_bands('valid_cloudfree')==0))

Then we use the groupby and mosaic methods to create the daily image mosaic.

s2_daily = (s2_masked
            .groupby(dates=('year', 'month', 'day'))
            .mosaic()
            .pick_bands('red green blue red-edge nir swir1 derived:ndvi')
)

Finally we use the aoi to apply the data processing steps above to all the Sentinel 2 imagery in the aoi during the time period of interest.

s2_data = s2_daily.compute(aoi)

Correct answer by John Shriver on August 26, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP