TransWikia.com

Georeferencing Sentinel-3 LST product

Geographic Information Systems Asked by TheOGMalan on June 2, 2021

I have been stuck attempting to convert Sentinel-3 LST netCDF data to geotiffs for a while now as part of my master’s thesis.

Firstly, an introduction to the data. It is packaged as follow:
image

The data is packaged as several .nc files, as well as a xml file containing metadata. The metadata suggests that the data should be in EPSG:4346:

     <metadataObject ID="measurementFrameSet" classification="DESCRIPTION" category="DMD">
        <metadataWrap mimeType="text/xml" vocabularyName="Sentinel-SAFE" textInfo="Frame Set">
           <xmlData>
              <sentinel-safe:frameSet>
                 <sentinel-safe:footPrint srsName="http://www.opengis.net/def/crs/EPSG/0/4326">
                    <gml:posList>-29.1992 8.81452 -29.1684 9.00931 -29.0806 9.52497 -28.9909 10.035 -28.896 10.5474 -28.801 11.0545 -28.7054 11.5734 -28.605 12.0778 -28.5082 12.5866 -28.4088 13.0983 -28.2994 13.6035 -28.197 14.1087 -28.0927 14.6147 -27.9761 15.1199 -27.8675 15.6236 -27.7564 16.1214 -27.6423 16.627 -27.525 17.1239 -27.4036 17.6145 -27.2806 18.1169 -27.1611 18.6177 -27.0344 19.1076 -26.9124 19.605 -26.781 20.0911 -26.6548 20.5894 -26.5286 21.0864 -26.3885 21.5669 -26.2565 22.0561 -26.1192 22.5398 -25.9879 23.0344 -25.8475 23.5168 -28.389 24.4405 -30.9637 25.4388 -33.5273 26.5043 -36.0695 27.6424 -36.078 27.6463 -36.2329 27.1108 -36.4066 26.5843 -36.5511 26.0457 -36.7063 25.5069 -36.8573 24.9628 -37.0105 24.4346 -37.1568 23.8819 -37.2966 23.3343 -37.4393 22.7945 -37.5805 22.24 -37.7181 21.6969 -37.8487 21.1329 -37.9835 20.584 -38.1138 20.0259 -38.2315 19.464 -38.3634 18.9072 -38.4821 18.3398 -38.5973 17.7783 -38.7109 17.2074 -38.8288 16.6385 -38.9366 16.0638 -39.0387 15.4947 -39.148 14.9205 -39.2443 14.3444 -39.3456 13.7696 -39.4421 13.1931 -39.5308 12.6058 -39.6274 12.0296 -39.7142 11.4424 -39.7416 11.2224 -39.7345 11.2088 -37.1002 10.5838 -34.4547 9.9742 -31.8069 9.37884 -29.1992 8.81452</gml:posList>
                 </sentinel-safe:footPrint>
              </sentinel-safe:frameSet>
           </xmlData>
        </metadataWrap>
     </metadataObject>

Now I attempted to load the data into xarray using this:

import rioxarray
import xarray
from pyproj import CRS
root = r'G:MScCodeJupyterLabtempS3B_SL_2_LST____20201214T214826_20201214T215126_20201216T025309_0179_046_385_5400_LN2_O_NT_004.SEN3*.nc'
xds = xarray.open_mfdataset(root,combine = 'by_coords', decode_times=False)

This did not work, and output the following error:

ValueError: arguments without labels along dimension 'columns' cannot be aligned because they have different dimension sizes: {130, 1500}

Now I believe the problem with this is that the _in files have a 1200 x 1500 grid, whereas the _tx files have a 500 x 1500 grid. Working around this problem, we can open up only the _in files (or the _tx files, but I am using _in for the example):
image

If I am correct in interpreting the information, it appears that the netCDF does not have any spatial dimensions defined, despite that I have attempted to set the CRS using xds.rio.set_crs(4326) and exporting that to a geotiff using xds.to_netcdf(r'G:MScTestImagetestouttestout.tiff'), but that did not work and seemed to only map the row and column values as coordinates. This did not work.

Now the Sentinel-3 LST product does have some form of spatial referencing in the data that I assume I can work with – the longitude_in/_tx and latitude_in/_tx bands. These are gridded values containing the geographic longitude/latitude coordinates for every cell in the netCDF. I have used this to extract point values from the netCDF grid in the past and have attempted to solve this problem by turning every cell into a point feature, but this proved far too slow to be of much use.

Anyway, I have a feeling that this information could be used somehow to georeference the netCDFs, either by creating a new set of spatial dimensions, or failing that they should be able to act as GCPs, but I fear that this second option may also be too slow. I have also tried using SNAP, but unfortunately it proved too slow and unstable to work for this problem. Speed is a bit of an issue, as I am aiming to process a year’s worth of information over an AOI.

One Answer

Short answer:

  1. use GDAL tools to extract lat and long data from an .nc file.
  2. Merge it with the data from any band you want to see to make an x,y,z dataset.
  3. Convert the irregular x,y,z data to a shapefile and apply a CRS.
  4. Grid the shapefile at desired resolution.

The lat/long data are not regularly sampled, so the data can't be loaded as an XYZ raster directly, and need to be gridded.

This flow works on the one dataset that I tried.

   gdal_translate -of XYZ NETCDF:"S4_radiance_an.nc":S4_radiance_an S4_radiance.csv
   gdal_translate -of XYZ NETCDF:"geodetic_an.nc":latitude_an latitude.csv
   gdal_translate -of XYZ NETCDF:"geodetic_an.nc":longitude_an longitude.csv

   echo X,Y,Z >merged.csv
   paste longitude.csv latitude.csv S4_radiance.csv|gawk 'BEGIN{OFS=","}{print($3/1000000,$6/1000000,$9)}' >>merged.csv
   ogr2ogr -a_srs "EPSG:4326"  -mapFieldType All=Integer -oo X_POSSIBLE_NAMES=X* -oo Y_POSSIBLE_NAMES=Y* -oo KEEP_GEOM_COLUMNS=NO newS4_radiance.shp merged.csv

enter image description here

Long Answer

Use GDAL tools from the command line. For example, file S4_radiance_an.nc

gdalinfo S4_radiance_an.nc

At the bottom of the output, there are lines like:

Subdatasets:
  SUBDATASET_1_NAME=NETCDF:"S4_radiance_an.nc":S4_exception_an
  SUBDATASET_1_DESC=[2400x3000] S4_exception_an (8-bit unsigned integer)
  SUBDATASET_2_NAME=NETCDF:"S4_radiance_an.nc":S4_exception_orphan_an
  SUBDATASET_2_DESC=[2400x374] toa_radiance_status_flag (8-bit unsigned integer)
  SUBDATASET_3_NAME=NETCDF:"S4_radiance_an.nc":S4_radiance_an
  SUBDATASET_3_DESC=[2400x3000] toa_radiance (16-bit integer)
  SUBDATASET_4_NAME=NETCDF:"S4_radiance_an.nc":S4_radiance_orphan_an
  SUBDATASET_4_DESC=[2400x374] toa_radiance (16-bit integer)

So resolution is 2400 x 3000 for bands that contain 'an' (nadir c.f. oblique).

The same information from geodetic_an.nc yields:

Subdatasets:
  SUBDATASET_1_NAME=NETCDF:"geodetic_an.nc":elevation_an
  SUBDATASET_1_DESC=[2400x3000] surface_altitude (16-bit integer)
  SUBDATASET_2_NAME=NETCDF:"geodetic_an.nc":elevation_orphan_an
  SUBDATASET_2_DESC=[2400x374] surface_altitude (16-bit integer)
  SUBDATASET_3_NAME=NETCDF:"geodetic_an.nc":latitude_an
  SUBDATASET_3_DESC=[2400x3000] latitude (32-bit integer)
  SUBDATASET_4_NAME=NETCDF:"geodetic_an.nc":latitude_orphan_an
  SUBDATASET_4_DESC=[2400x374] latitude (32-bit integer)
  SUBDATASET_5_NAME=NETCDF:"geodetic_an.nc":longitude_an
  SUBDATASET_5_DESC=[2400x3000] longitude (32-bit integer)
  SUBDATASET_6_NAME=NETCDF:"geodetic_an.nc":longitude_orphan_an
  SUBDATASET_6_DESC=[2400x374] longitude (32-bit integer)

Resolution is again 2400 x 3000, so we assume they are related.

The NAME and DESCRIPTION of set 3 are:

  SUBDATASET_3_NAME=NETCDF:"geodetic_an.nc":latitude_an
  SUBDATASET_3_DESC=[2400x3000] latitude (32-bit integer)

We can use the NAME to load using GDAL tools. First, translate to XYZ format with:

gdal_translate -of XYZ NETCDF:"geodetic_an.nc":latitude_an latitude.csv

Similarly for the longitude file. In each case, records are like:

2990.5 2399.5 7575671
2991.5 2399.5 7574636
2992.5 2399.5 7573601

So the format is I, J, Latitude, and latitude is in (degrees * 1000000).

We can merge the XYZ files representing Latitude, Longitude and, say, Radiance (from S4_radiance_an.nc, for example, which is also 2400 x 3000 in the band "S4_radiance_an.nc":S4_radiance_an"). The first and last records of these three files are:

S4_radiance.csv
0.5 0.5 -32768
1.5 0.5 -32768
2.5 0.5 -32768
2997.5 2399.5 -32768
2998.5 2399.5 -32768
2999.5 2399.5 -32768
latitude.csv
0.5 0.5 -73571
1.5 0.5 -74567
2.5 0.5 -75563
2997.5 2399.5 7568425
2998.5 2399.5 7567390
2999.5 2399.5 7566355
longitude.csv
0.5 0.5 95229130
1.5 0.5 95233511
2.5 0.5 95237892
2997.5 2399.5 110768320
2998.5 2399.5 110772730
2999.5 2399.5 110777141

So they have the same limits and are therefore related. Merge them using any method you like. They have the same number of rows (wc <S4_radiance.csv =>> 7200000 21600000 127768857), so on Unix paste and AWK will work:

paste longitude.csv latitude.csv S4_radiance.csv|gawk '{print($3/1000000,$6/1000000,$9)}' >merged.csv

This gives a file like:

X,Y,Z
95.2291,-0.073571,-32768
95.2335,-0.074567,-32768
.
.
110.773,7.56739,-32768
110.777,7.56635,-32768

The Z values are -32768, which means nodata, and is normal at the edge of a raster. X and Y values are in degrees These can be loaded as point data and gridded as required.

Correct answer by wingnut on June 2, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP