Sentinel-5p SO2-COBRA data processing

This chapter describes the tasks performed for processing Sentinel-5p SO2 data from the COBRA processing.

See also the chapter on the operational Sentinel-5p SO2 data processing.

Product description

The product is available from the S5P-PAL Data Portal; see the specific information on the SO2CBR page.

According to the Product Readme File (PRF) and what is found in the files:

  • The retrieval product is a column density (mol/m2), which will be treated by CSO as a profile with \(n_r=1\) layers:

    \[\mathbf{y}_r\]

    Note that the product also contains sub-columns for layers at the surface (1 km thick?), and centered around 7 and 15 km; exact definition of the sub-columns is unclear.

  • The simulation of a retrieval product from a model state does not require an apriori profile, and should be computed from:

    \[\mathbf{y}_s ~=~ \mathbf{A}\ \mathbf{V}\mathbf{G}\ \mathbf{x}\]

    where:

    • \(\mathbf{y}_s\) is the simulated retrieval (mol/m2) defined on \(n_r=1\) layers;

    • \(\mathbf{A}\) is the averaging kernel matrix with shape \((n_r,n_a)\); with \(n_a\) the number of a priori layers;

    • \(\mathbf{x}\) is the atmospheric state, which probably consists of a 3D array of SO2 concentrations;

    • operators \(\mathbf{G}\) and \(\mathbf{V}\) together compute a simulated profile at the \(n_a\) a priori layers from the state, using horizontal (\(\mathbf{G}\)) and vertical (\(\mathbf{V}\)) mappings; units should be the same as the retrieval product (mol/m2).

    In case \(\mathbf{x}^{true}\) is the true atmospheric state, the retrieval error is quantified by the retrieval error covariance \(\mathbf{R}\) (for this scalar product, a variance):

    \[\mathbf{y}_r ~-~ \mathbf{A}\ \mathbf{V}\mathbf{G}\ \mathbf{x}^{true} ~\sim~ \mathcal{N}\left(\mathbf{o},\mathbf{R}\right)\]

  • The retrieval status and quality are indicated by the qa_value. The recommended minimum is 0.5, which excludes cloudy scenes and other problematic retrievals.

  • A detection flag is present with values:

    • 0 = no detection

    • 1 = detection

    • 2 = clear detection close to known volcano

    • 3 = clear detection close to known anthropogenic source

    • 4 = detection at high SZA
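
For this single-layer product, the simulation formula above reduces to a weighted sum over the a priori layers. The following is a minimal sketch in Python, assuming that the horizontal and vertical mappings have already produced partial columns (mol/m2) on those layers. The function name and all numbers are hypothetical and not part of CSO:

```python
def simulate_retrieval(kernel, partial_columns):
    """Apply a one-row averaging kernel A (n_r=1) to a simulated profile V G x.

    kernel          : averaging kernel values, one per a priori layer
    partial_columns : simulated partial columns (mol/m2) on the same layers
    """
    if len(kernel) != len(partial_columns):
        raise ValueError("kernel and profile must have the same number of layers")
    # y_s = A (V G x): with a single retrieval layer this is a dot product
    return sum(a * x for a, x in zip(kernel, partial_columns))


# hypothetical values for a 4-layer profile
kernel = [0.5, 0.75, 1.0, 1.25]
profile = [4.0e-4, 2.0e-4, 1.0e-4, 0.0]
y_s = simulate_retrieval(kernel, profile)
print(y_s)
```

The result is a single column density in mol/m2 that can be compared directly with the retrieved value.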

References

  • Theys, N., Fioletov, V., Li, C., De Smedt, I., Lerot, C., McLinden, C., Krotkov, N., Griffin, D., Clarisse, L., Hedelt, P., Loyola, D., Wagner, T., Kumar, V., Innes, A., Ribas, R., Hendrick, F., Vlietinck, J., Brenot, H., and Van Roozendael, M.:
    A sulfur dioxide Covariance-Based Retrieval Algorithm (COBRA): application to TROPOMI reveals new emission sources,
    Atmos. Chem. Phys., 21, 16727-16744, doi:10.5194/acp-21-16727-2021, 2021.

CSO processing

(See Tutorial chapter for introduction to CSO scripts and configuration)

An example configuration of the CSO processing of the S5p/SO2 data is available via the following settings:

Start the job-tree using:

./bin/cso  config/Copernicus/cso.rc

Selected sub-steps in the processing are described below.

Inquire Sentinel-5p/SO2-COBRA archive

S5p/SO2-COBRA retrievals are available from the Product Algorithm Laboratory, or more specifically, the S5P-PAL Data Portal; see the cso_pal module for a detailed description.

Data is available for a single processing stream only, identified by a 4-character key:

  • PAL_ : processed data stored on the Product Algorithm Laboratory portal.

There might be data available from more than one processor version. It is therefore necessary to inquire the archive first to see which data is available, and what the version numbers are.

The CSO_PAL_Inquire class is available to inquire the remote archive. The settings used by this class allow selection on, for example, time range and intersection area. The result is a csv file with columns for keywords such as orbit number and processor version, as well as the filename of the data and the url that should be used to download the data:

orbit;start_time;end_time;processing;collection;processor_version;filename;href
24688;2022-07-19 12:17:52;2022-07-19 13:59:21;PAL_;03;010001;S5P_PAL__L2__SO2CBR_20220719T121752_20220719T135921_24688_03_010001_20221020T082900.nc;https://data-portal.s5p-pal.com/cat/sentinel-5p/download/88c15681-db43-4219-b391-c8567e39cccf
24689;2022-07-19 13:59:21;2022-07-19 15:40:51;PAL_;03;010001;S5P_PAL__L2__SO2CBR_20220719T135921_20220719T154051_24689_03_010001_20221020T083203.nc;https://data-portal.s5p-pal.com/cat/sentinel-5p/download/c3a4df41-8fb1-417e-9ccb-191c0c777658
:
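
To see which processor versions a listing of this form contains, the records can be counted per processing stream, collection and version. This is a minimal sketch using only the Python standard library. The function name is hypothetical, and the filenames and urls in the sample are shortened placeholders rather than real values:

```python
import csv
import io
from collections import Counter


def count_versions(stream):
    """Count listed orbit files per (processing, collection, processor_version)."""
    reader = csv.DictReader(stream, delimiter=";")
    return Counter(
        (row["processing"], row["collection"], row["processor_version"])
        for row in reader
    )


# sample with the same columns as the inquiry table;
# filename and href are placeholders
sample = io.StringIO(
    "orbit;start_time;end_time;processing;collection;processor_version;filename;href\n"
    "24688;2022-07-19 12:17:52;2022-07-19 13:59:21;PAL_;03;010001;file_24688.nc;url_24688\n"
    "24689;2022-07-19 13:59:21;2022-07-19 15:40:51;PAL_;03;010001;file_24689.nc;url_24689\n"
)
versions = count_versions(sample)
for key, count in versions.items():
    print(key, count)
```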

See the section on File name convention in the Product User Manual for the meaning of all parts of the filename.
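
As an illustration, the filename parts that correspond to columns of the inquiry table can be extracted with a regular expression. This is only a sketch: the pattern and the field names are assumptions derived from the example filenames above, and the Product User Manual remains the authoritative description:

```python
import re

# pattern derived from the example filenames; field names are hypothetical
S5P_PATTERN = re.compile(
    r"S5P_(?P<processing>.{4})_(?P<product>L2__.{6})"
    r"_(?P<start_time>\d{8}T\d{6})_(?P<end_time>\d{8}T\d{6})"
    r"_(?P<orbit>\d{5})_(?P<collection>\d{2})_(?P<processor_version>\d{6})"
    r"_(?P<production_time>\d{8}T\d{6})\.nc"
)


def parse_s5p_filename(filename):
    """Return a dict with the filename parts, or None if the name does not match."""
    match = S5P_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


fields = parse_s5p_filename(
    "S5P_PAL__L2__SO2CBR_20220719T121752_20220719T135921_24688_03_010001_20221020T082900.nc"
)
print(fields["orbit"], fields["collection"], fields["processor_version"])
```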

To visualize what is available from the various portals, the CSO_Inquire_Plot could be used to create an overview figure:

Overview of available SO2-COBRA processings.

The jobtree configuration to inquire the portals and create the overview figure could look like:

! single step:
cso.s5p.so2-cobra.inquire.class                 :  utopya.UtopyaJobStep
! two tasks:
cso.s5p.so2-cobra.inquire.tasks                 :  table-pal plot
! inquire task:
cso.s5p.so2-cobra.inquire.table-pal.class       :  cso.CSO_PAL_Inquire
cso.s5p.so2-cobra.inquire.table-pal.args        :  '${PWD}/config/Copernicus/cso-s5p-so2-cobra.rc', \
                                                      rcbase='cso.s5p.so2-cobra.inquire-table-pal'
!~ create plot of available versions:
cso.s5p.so2-cobra.inquire.plot.class            :  cso.CSO_Inquire_Plot
cso.s5p.so2-cobra.inquire.plot.args             :  '${PWD}/config/Copernicus/cso-s5p-so2-cobra.rc', \
                                                      rcbase='cso.s5p.so2-cobra.inquire-plot'

Conversion to CSO format

The ‘cso.s5p.so2-cobra.convert’ task converts orbit files downloaded from a portal into a CSO format.

Files are downloaded from a portal if not present locally yet; optionally, they are removed again after conversion to avoid mirroring the complete portal.

To save storage, only selected pixels are included in the converted files, for example only those within some region or cloud free pixels. The selection criteria are defined in the settings, and added to the ‘history’ attribute of the created files as reminder.

The work is done by the CSO_S5p_Convert class, which is initialized using the settings file:

! task initialization:
cso.s5p.so2-cobra.convert.class     :  cso.CSO_S5p_Convert
cso.s5p.so2-cobra.convert.args      :  '${PWD}/config/Copernicus/cso-s5p-so2-cobra.rc', rcbase='cso.s5p.so2-cobra.convert'

See the class documentation for the general configuration; some specific choices are described below. The example is based on the S5p SO2 file from which the header is available in:

Orbit file selection

Based on the inquiry, the download and conversion could be limited to files created with the most recent processor versions.

For the S5P files, another useful property is the collection number, a 2-digit number that defines a collection of files (or actually processor versions) that together form a continuous series. The collection number is extracted from the filename and stored as a column of the listing file.

The following setting is used to select specific files from the archive based on the properties stored in the listing file:

! Provide a ';' separated list of expressions to decide if a particular orbit file should be processed.
! If more than one file is available for a particular orbit (from "OFFL" and "RPRO" processing),
! the file with the first match will be used.
! The expressions should include templates '%{header}' for the column values.
! Example to select files from collection '03', preferably from processing 'RPRO' but otherwise from 'OFFL':
!   (%{collection} == '03') and (%{processing} == 'RPRO') ; \
!   (%{collection} == '03') and (%{processing} == 'OFFL')
!
cso.s5p.so2-cobra.convert.selection        :  (%{collection} == '03') and (%{processing} == 'PAL_')
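
The "first match" rule can be illustrated with a short Python sketch. It is not the CSO implementation; it only assumes that each '%{column}' template is replaced by the quoted column value before the expression is evaluated. The function name and the sample records are hypothetical:

```python
import re


def select_file(selection, records):
    """Return the record matching the earliest expression, or None.

    selection : ';' separated expressions with '%{column}' templates
    records   : list of dicts, one per candidate file for a single orbit
    """
    for expression in selection.split(";"):
        expression = expression.strip()
        if not expression:
            continue
        for record in records:
            # replace each template by the quoted column value
            filled = re.sub(
                r"%\{(\w+)\}", lambda m: repr(record[m.group(1)]), expression
            )
            # eval is only acceptable here because settings are trusted input
            if eval(filled, {"__builtins__": {}}, {}):
                return record
    return None


selection = (
    "(%{collection} == '03') and (%{processing} == 'RPRO') ; "
    "(%{collection} == '03') and (%{processing} == 'OFFL')"
)
records = [
    {"collection": "03", "processing": "OFFL", "filename": "a.nc"},
    {"collection": "03", "processing": "RPRO", "filename": "b.nc"},
]
chosen = select_file(selection, records)
print(chosen["filename"])
```

Because the expressions are tried in order, the 'RPRO' file is preferred even though the 'OFFL' file appears first in the list of records.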

Pixel selection

The CSO_S5p_Convert class calls the S5p_File.SelectPixels() method to create a pixel selection mask for the input file. The selection is done using one or more filters. First provide a list of filter names:

cso.s5p.so2-cobra.convert.filters   :  lons lats valid quality

Then provide for each filter the input variable to be used for testing, as a path name in the input file. The next setting is the type of filter to be used; see S5p_File.SelectPixels() for the supported types and the other settings required by each type. The following is an example of a selection on longitude:

cso.s5p.so2-cobra.convert.filter.lons.var                :  Geolocation Fields/Longitude
cso.s5p.so2-cobra.convert.filter.lons.type               :  minmax
cso.s5p.so2-cobra.convert.filter.lons.minmax             :  -30.0 45.0
cso.s5p.so2-cobra.convert.filter.lons.units              :  degrees_east
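
Conceptually, each filter yields a boolean mask over the pixels, and a pixel is selected only if it passes all filters. The sketch below illustrates this for 'minmax' filters in plain Python. It assumes that the bounds are inclusive, which is not confirmed by the settings above; the helper names, the latitude range and the sample values are hypothetical:

```python
def minmax_mask(values, vmin, vmax):
    """Return one boolean per pixel: True if vmin <= value <= vmax."""
    return [vmin <= v <= vmax for v in values]


def combine_masks(*masks):
    """A pixel is selected only if every filter accepts it."""
    return [all(flags) for flags in zip(*masks)]


# hypothetical pixel centers
lons = [-40.0, -30.0, 10.0, 50.0]
lats = [50.0, 52.0, 80.0, 48.0]
selected = combine_masks(
    minmax_mask(lons, -30.0, 45.0),  # range from the settings above
    minmax_mask(lats, 35.0, 70.0),   # hypothetical latitude range
)
print(selected)
```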

Variable specification

The target file is created as a CSO_S5p_File object. Its AddSelection method is called with the input object as argument, and this will copy the selected pixels for the variables specified in the settings.

The variable specification starts with a list with variable names to be created in the target file:

cso.s5p.so2-cobra.convert.output.vars    : longitude longitude_bounds \
                                           latitude latitude_bounds \
                                           track_longitude track_longitude_bounds \
                                           track_latitude  track_latitude_bounds \
                                           time \
                                           pressure kernel qa_value \
                                           vcd vcd_errvar \
                                           detection_flag cloud_fraction solar_zenith_angle ground_pixel

For each variable settings should be specified that describe the shape of the variable and how it should be filled from the input. See the AddSelection description for all options, here we show some examples.

The longitude and latitude variables are copied almost directly from the source files; the only change applied is the selection of pixels. All original attributes are copied, except for the bounds attribute, since that would give warnings from the CF-compliance checker:

cso.s5p.so2-cobra.convert.output.var.longitude.dims                   :   pixel
cso.s5p.so2-cobra.convert.output.var.longitude.from                   :   PRODUCT/longitude
cso.s5p.so2-cobra.convert.output.var.longitude.attrs                  :   { 'bounds' : None }

cso.s5p.so2-cobra.convert.output.var.latitude.dims                    :   pixel
cso.s5p.so2-cobra.convert.output.var.latitude.from                    :   PRODUCT/latitude
cso.s5p.so2-cobra.convert.output.var.latitude.attrs                   :   { 'bounds' : None }

Also the locations of the pixels in the original track are copied, since these are useful when creating plots. These cannot be copied directly but require special processing:

cso.s5p.so2-cobra.convert.output.var.track_longitude.dims             :   track_scan track_pixel
cso.s5p.so2-cobra.convert.output.var.track_longitude.special          :   track_longitude
cso.s5p.so2-cobra.convert.output.var.track_longitude.from             :   PRODUCT/longitude
cso.s5p.so2-cobra.convert.output.var.track_longitude.attrs            :   { 'bounds' : None }

cso.s5p.so2-cobra.convert.output.var.track_latitude.dims              :   track_scan track_pixel
cso.s5p.so2-cobra.convert.output.var.track_latitude.special           :   track_latitude
cso.s5p.so2-cobra.convert.output.var.track_latitude.from              :   PRODUCT/latitude
cso.s5p.so2-cobra.convert.output.var.track_latitude.attrs             :   { 'bounds' : None }

The observation times are constructed from time steps relative to a reference time; this requires special processing too:

cso.s5p.so2-cobra.convert.output.var.time.dims                        :   pixel
cso.s5p.so2-cobra.convert.output.var.time.special                     :   time-delta
cso.s5p.so2-cobra.convert.output.var.time.tref                        :   PRODUCT/time
cso.s5p.so2-cobra.convert.output.var.time.dt                          :   PRODUCT/delta_time
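
The construction of observation times can be sketched as follows. The sketch assumes that the time steps are offsets in milliseconds and that the reference time has already been decoded into a datetime object; both are assumptions that should be checked against the units attributes in the file. The function name and the values are hypothetical:

```python
from datetime import datetime, timedelta


def observation_times(tref, delta_ms):
    """Add per-scanline offsets (assumed milliseconds) to the reference time."""
    return [tref + timedelta(milliseconds=dt) for dt in delta_ms]


# hypothetical reference time and offsets
tref = datetime(2022, 7, 19)
times = observation_times(tref, [44272000, 44272840])
for t in times:
    print(t.isoformat())
```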

The observed vertical column density could be copied directly. The target shape is (pixel,retr) where retr is the number of layers in the retrieval product (1 in this case):

! vertical column density:
cso.s5p.so2-cobra.convert.output.var.vcd.dims                         :   pixel retr
cso.s5p.so2-cobra.convert.output.var.vcd.from                         :   PRODUCT/sulfurdioxide_total_vertical_column

In the converted files, the retrieval error is always expressed as a (co)variance matrix, to facilitate (future) conversion of profile products. In this example, it is filled from the square of the error standard deviation:

! error variance in vertical column density (after application of kernel),
! fill with single element 'covariance matrix', from square of standard error:
! use dims with different names to avoid that cf-checker complains:
cso.s5p.so2-cobra.convert.output.var.vcd_errvar.dims                  :   pixel retr retr0
cso.s5p.so2-cobra.convert.output.var.vcd_errvar.special               :   square
cso.s5p.so2-cobra.convert.output.var.vcd_errvar.from                  :   PRODUCT/sulfurdioxide_total_vertical_column_precision
!~ skip standard name, modifier "standard_error" is not valid anymore:
cso.s5p.so2-cobra.convert.output.var.vcd_errvar.attrs                 :   { 'standard_name' : None }
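
What the 'square' processing amounts to for a single-layer product is shown in this small sketch: each standard error becomes a 1x1 covariance matrix, so that the result has shape (pixel,retr,retr0). The function name and the values are hypothetical:

```python
def error_variance(sigma):
    """Convert per-pixel standard errors into 1x1 covariance matrices."""
    return [[[s * s]] for s in sigma]


# hypothetical standard errors (mol/m2) for two pixels
variances = error_variance([2.0e-4, 0.5e-4])
print(variances)
```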

The averaging kernel is applied on atmospheric layers, defined by pressure levels. In this product the pressure levels are defined using hybrid sigma-pressure coordinates, and this requires special processing:

! Convert from hybrid coefficient bounds in (2,nlev) array to 3D half level pressure:
cso.s5p.so2-cobra.convert.output.var.pressure.dims                    :   pixel layeri
cso.s5p.so2-cobra.convert.output.var.pressure.special                 :   hybounds_to_pressure
cso.s5p.so2-cobra.convert.output.var.pressure.sp                      :   PRODUCT/SUPPORT_DATA/INPUT_DATA/surface_pressure
cso.s5p.so2-cobra.convert.output.var.pressure.hyab                    :   PRODUCT/tm5_constant_a
cso.s5p.so2-cobra.convert.output.var.pressure.hybb                    :   PRODUCT/tm5_constant_b
cso.s5p.so2-cobra.convert.output.var.pressure.units                   :   Pa
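
Half-level pressures follow from the hybrid coefficients as p = a + b * sp, with sp the surface pressure. The sketch below illustrates this for one pixel. It assumes that the coefficients are given per layer as (lower, upper) pairs, which may differ from the array layout in the actual files; the function name and the coefficient values are hypothetical:

```python
def half_level_pressures(sp, hyab, hybb):
    """Half-level pressures (Pa) for one pixel: p = a + b * sp.

    sp   : surface pressure (Pa)
    hyab : per layer a pair (lower, upper) of hybrid 'a' coefficients (Pa)
    hybb : per layer a pair (lower, upper) of hybrid 'b' coefficients (1)
    """
    # interfaces: lower bound of each layer, then upper bound of the top layer
    a = [pair[0] for pair in hyab] + [hyab[-1][1]]
    b = [pair[0] for pair in hybb] + [hybb[-1][1]]
    return [ai + bi * sp for ai, bi in zip(a, b)]


# hypothetical coefficients for 3 layers, surface layer first
hyab = [(0.0, 5000.0), (5000.0, 10000.0), (10000.0, 0.0)]
hybb = [(1.0, 0.75), (0.75, 0.25), (0.25, 0.0)]
p = half_level_pressures(100000.0, hyab, hybb)
print(p)
```

The number of half levels is one more than the number of layers, which is why the pressure variable uses a separate interface dimension.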

Averaging kernels are converted to matrices with shape (layer,retr):

! description:
cso.s5p.so2-cobra.convert.output.var.kernel.dims                      :   pixel layer retr
cso.s5p.so2-cobra.convert.output.var.kernel.from                      :   PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/averaging_kernel

Other variables can be copied directly:

! quality flag:
cso.s5p.so2-cobra.convert.output.var.qa_value.dims                   :   pixel
cso.s5p.so2-cobra.convert.output.var.qa_value.from                   :   PRODUCT/qa_value
!~ skip some attributes, cf-checker complains ...
cso.s5p.so2-cobra.convert.output.var.qa_value.attrs                  :   { 'valid_min' : None, 'valid_max' : None }

! cloud property:
cso.s5p.so2-cobra.convert.output.var.cloud_fraction.dims             :   pixel
cso.s5p.so2-cobra.convert.output.var.cloud_fraction.from             :   PRODUCT/SUPPORT_DATA/INPUT_DATA/cloud_fraction_crb
cso.s5p.so2-cobra.convert.output.var.cloud_fraction.attrs            :   { 'coordinates' : None, 'source' : None }

! detection flag, for observations near known source locations:
cso.s5p.so2-cobra.convert.output.var.detection_flag.dims             :   pixel
cso.s5p.so2-cobra.convert.output.var.detection_flag.from             :   PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/sulfurdioxide_detection_flag
cso.s5p.so2-cobra.convert.output.var.detection_flag.attrs            :   { 'coordinates' : None }
cso.s5p.so2-cobra.convert.output.var.detection_flag.dtype            :   i1

Output files

The name of the target files should be specified with a directory and filename; the latter could include a template for the orbit number:

! output directory and filename:
! - times are taken from mid of selection, rounded to hours
! - use '%{orbit}' for orbit number
cso.s5p.so2-cobra.convert.output.filename     :  /Scratch/CSO-data/Europe/S5p/SO2-COBRA/C03/%Y/%m/S5p_SO2-COBRA_%{orbit}.nc
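
How such a template expands into a path can be sketched in a few lines of Python. The function name, the orbit number and the time are hypothetical, and the rounding of the mid time to hours is left out:

```python
from datetime import datetime


def output_filename(template, time, orbit):
    """Expand '%{orbit}' and the strftime codes in a filename template."""
    # fill the orbit template first, so that only strftime codes remain
    return time.strftime(template.replace("%{orbit}", orbit))


template = "/Scratch/CSO-data/Europe/S5p/SO2-COBRA/C03/%Y/%m/S5p_SO2-COBRA_%{orbit}.nc"
name = output_filename(template, datetime(2022, 7, 19, 13), "24688")
print(name)
```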

A flag is read to decide if existing files should be renewed or kept:

cso.s5p.so2-cobra.convert.renew                  :  True

The target file is created as a CSO_S5p_File object. Its AddSelection method is called with the input object as argument to copy the selected pixels for the variables specified in the settings; the Write method then creates the file.

Global attributes for the target file should be specified with:

! global attributes:
cso.s5p.so2-cobra.convert.output.attrs               :  format Conventions author institution email
!
cso.s5p.so2-cobra.convert.output.attr.format         :  1.0
cso.s5p.so2-cobra.convert.output.attr.Conventions    :  CF-1.7
cso.s5p.so2-cobra.convert.output.attr.author         :  Your Name
cso.s5p.so2-cobra.convert.output.attr.institution    :  CSO
cso.s5p.so2-cobra.convert.output.attr.email          :  Your.Name@cso.org

Listing file

A listing file contains the names of the converted orbit files, and the time range of pixels in the file:

filename                     ;start_time                   ;end_time                     ;orbit
2018/06/S5p_RPRO_SO2_03272.nc;2018-06-01T01:32:46.673000000;2018-06-01T01:36:12.948000000;03272
2018/06/S5p_RPRO_SO2_03273.nc;2018-06-01T03:12:53.649000000;2018-06-01T03:17:43.082000000;03273
2018/06/S5p_RPRO_SO2_03274.nc;2018-06-01T04:52:43.586000000;2018-06-01T04:59:12.377000000;03274
:

This file will be used by the observation operator to select orbits with pixels valid for a desired time range.
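
A selection of orbits for a time range could work along the following lines. This is a sketch using only the Python standard library, not the code of the observation operator; the function names are hypothetical. Time stamps are truncated to microseconds because the standard datetime type cannot hold nanoseconds:

```python
import csv
import io
from datetime import datetime


def parse_time(text):
    """Parse a listing time stamp, keeping microsecond resolution only."""
    return datetime.fromisoformat(text.strip()[:26])


def orbits_in_range(stream, t1, t2):
    """Return filenames of orbits with pixels overlapping [t1, t2]."""
    reader = csv.reader(stream, delimiter=";")
    header = [name.strip() for name in next(reader)]
    selected = []
    for values in reader:
        row = dict(zip(header, (v.strip() for v in values)))
        # keep the file if its pixel time range overlaps the requested range
        if parse_time(row["start_time"]) <= t2 and parse_time(row["end_time"]) >= t1:
            selected.append(row["filename"])
    return selected


listing = io.StringIO(
    "filename                     ;start_time                   ;end_time                     ;orbit\n"
    "2018/06/S5p_RPRO_SO2_03272.nc;2018-06-01T01:32:46.673000000;2018-06-01T01:36:12.948000000;03272\n"
    "2018/06/S5p_RPRO_SO2_03273.nc;2018-06-01T03:12:53.649000000;2018-06-01T03:17:43.082000000;03273\n"
    "2018/06/S5p_RPRO_SO2_03274.nc;2018-06-01T04:52:43.586000000;2018-06-01T04:59:12.377000000;03274\n"
)
files = orbits_in_range(listing, datetime(2018, 6, 1, 3, 0), datetime(2018, 6, 1, 4, 0))
print(files)
```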

A listing file is for example created using the CSO_S5p_Listing class. In the settings passed to the class, define the name of the file to be created:

! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
<rcbase>.file        :   /Scratch/CSO-data/Europe/S5p/SO2-COBRA/C03__listing.csv

An existing listing file is not replaced, unless the following flag is set:

! renew table?
<rcbase>.renew           :  True

Orbit files are searched within a timerange:

<rcbase>.timerange.start        :  2018-06-01 00:00
<rcbase>.timerange.end          :  2018-06-03 23:59

Specify filename filters to search for orbit files; the patterns are relative to the basedir of the listing file, and might contain templates for the time values. Multiple patterns could be defined; if for a certain orbit number more than one file is found, the first match is used. This could be exploited to create a listing that combines reprocessed data with near-real-time data:

<rcbase>.patterns            :  C03/%Y/%m/S5p_*.nc

Catalogue

The CSO_Catalogue class could be used to create a catalogue of images for the converted files. Configuration could look like:

! catalogue creation task:
cso.s5p.so2-cobra.catalogue.task.figs.class  :  cso.CSO_Catalogue
cso.s5p.so2-cobra.catalogue.task.figs.args   :  '${PWD}/config/Copernicus/cso-s5p-so2.rc', \
                                                 rcbase='cso.s5p.so2-cobra.catalogue'

The configuration describes where to find a listing file with orbits, which variables should be plotted, the colorbar properties, etc. See the CSO_Catalogue class description for what the settings look like in general.

The class creates figures for a list of variables:

! variables to be plotted:
cso.s5p.so2-cobra.catalogue.vars                    :  vcd vcd_errvar qa_value \
                                                       cloud_fraction cloud_radiance_fraction

By default the catalogue creator simply creates a map with the value of a variable on the track. Optionally, settings could be used to specify a different unit, or the value range for the colorbar:

! convert units:
cso.s5p.so2-cobra.catalogue.var.vcd.units          :  umol/m2
! style:
cso.s5p.so2-cobra.catalogue.var.vcd.vmin           :   0.0
cso.s5p.so2-cobra.catalogue.var.vcd.vmax           : 100.0

Figures are saved to files with the basename of the original orbit file and the plotted variable:

/Scratch/CSO/catalogue/2018/06/01/S5p_RPRO_SO2_03278__vcd.png
                                  S5p_RPRO_SO2_03278__qa_value.png
                                  :

S5p SO2 columns

To search for interesting features in the data, the Indexer class could be used to create index pages. Configuration could look like:

! index creation task:
cso.s5p.so2-cobra.catalogue.task.index.class     :  utopya.Indexer
cso.s5p.so2-cobra.catalogue.task.index.args      :  '${PWD}/config/Copernicus/cso-s5p-so2.rc', \
                                                      rcbase='cso.s5p.so2-cobra.catalogue-index'

When successful, the index creator displays a URL that can be opened in a browser:

Browse to:
  file:///Scratch/CSO/catalogue/index.html

Index for S5p SO2 columns