cso_s5p module
The cso_s5p module provides classes to convert S5p data into a CSO format.
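Class hierarchy
The classes are defined according to the following hierarchy:
- object
  - S5p_File
- cso_file.CSO_File
  - CSO_S5p_File
- utopya_rc.UtopyaRc
  - CSO_S5p_Convert
  - CSO_S5p_Listing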
Classes
- class cso_s5p.S5p_File(filename)
Bases: object
Base class to access data in an S5p file.
Example of variables:
PRODUCT/
  dimensions:
    scanline = 3245 ;
    ground_pixel = 450 ;
    corner = 4 ;
    time = 1 ;
    layer = 34 ;
  variables:
    int time(time) ;
    float latitude(time, scanline, ground_pixel) ;
      units = "degrees_north" ;
    float longitude(time, scanline, ground_pixel) ;
      units = "degrees_east" ;
    float nitrogendioxide_tropospheric_column(time, scanline, ground_pixel) ;
      units = "mol m-2" ;
      multiplication_factor_to_convert_to_molecules_percm2 = 6.02214e+19f ;
    float nitrogendioxide_tropospheric_column_precision(time, scanline, ground_pixel) ;
      units = "mol m-2" ;
      multiplication_factor_to_convert_to_molecules_percm2 = 6.022141e+19f ;
    float averaging_kernel(time, scanline, ground_pixel, layer) ;
      units = "1" ;
    ubyte qa_value(time, scanline, ground_pixel) ;
      units = "1" ;
      long_name = "data quality value" ;
      comment = "A continuous quality descriptor, varying between 0 (no data) and 1 (full quality data). Recommend to ignore data with qa_value < 0.5" ;
  SUPPORT_DATA/
    GEOLOCATIONS/
      variables:
        float solar_zenith_angle(time, scanline, ground_pixel) ;
          units = "degree" ;
        float latitude_bounds(time, scanline, ground_pixel, corner) ;
          units = "degrees_north" ;
        float longitude_bounds(time, scanline, ground_pixel, corner) ;
          units = "degrees_east" ;
    INPUT_DATA/
      variables:
        float surface_altitude(time, scanline, ground_pixel) ;
          units = "m" ;
        float surface_pressure(time, scanline, ground_pixel) ;
          units = "Pa" ;
        float surface_albedo(time, scanline, ground_pixel) ;
          units = "1" ;
        float cloud_fraction_crb(time, scanline, ground_pixel) ;
          units = "1" ;
Arguments:
- filename: name of input file
- GetData(path, units=None)
Extract a data array from the input file and apply some ad hoc fixes:
- Insert a units attribute if not present yet; the value is then taken from the units argument.
- Convert the data to the target units if necessary.
- Add a long_name attribute if not present yet.
Arguments:
- path: variable path in the input file
Optional arguments:
- units: target units
Return values:
- da: xarray.DataArray object
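The three fixes can be illustrated on a plain list and dictionary standing in for the data and its attributes. This is only a sketch under assumptions, not the module's implementation: the function `fix_variable`, the `CONVERSIONS` table, the target units name "molecules cm-2", and the fallback long name are all hypothetical. The conversion factor is the `multiplication_factor_to_convert_to_molecules_percm2` value from the example header.

```python
# Hypothetical conversion table: (source units, target units) -> factor.
# The factor is taken from the example header of the S5p file.
CONVERSIONS = {("mol m-2", "molecules cm-2"): 6.02214e19}


def fix_variable(values, attrs, path, units=None):
    """Return (values, attrs) after the three fixes described for GetData."""
    attrs = dict(attrs)  # work on a copy of the attributes
    if "units" not in attrs:
        # 1. no units in the file: take them from the argument
        if units is None:
            raise ValueError(f"variable '{path}' has no units and none were provided")
        attrs["units"] = units
    elif units is not None and attrs["units"] != units:
        # 2. units differ from the target: convert the data
        factor = CONVERSIONS[(attrs["units"], units)]
        values = [v * factor for v in values]
        attrs["units"] = units
    # 3. assumed fallback: use the variable name as long_name
    attrs.setdefault("long_name", path.split("/")[-1])
    return values, attrs


values, attrs = fix_variable(
    [1.0e-5, 2.0e-5],
    {"units": "mol m-2"},
    "PRODUCT/nitrogendioxide_tropospheric_column",
    units="molecules cm-2",
)
print(attrs["units"], attrs["long_name"])
```

The real method returns an `xarray.DataArray`, which carries values and attributes together; the split into two return values here only serves to keep the sketch free of dependencies.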
- SelectPixels(rcf, rckey, indent='')
Apply the filters specified in the rcfile.
Arguments:
- rcf: RcFile object with settings
- rckey: base name for rcfile keys, e.g. "s5p" for the example below
Return values:
- selected: boolean array with the shape of the track, (time,scan,pixel), which is True if a pixel passed all checks
- history: list of character strings describing the selections
Example configuration:
! Specify a list of filter names.
! For each name, specify the variable which values are used for testing,
! the type test, and eventually some thresholds or other settings for this type.
! The units of the thresholds should match with the units in the variables,
! the expected units have to be defined too.
! The examples below show possible types and their settings.
!
! filters:
s5p.filters               : lons lats albedo valid
! select range of values:
s5p.filter.lons.var       : PRODUCT/longitude
s5p.filter.lons.type      : minmax
s5p.filter.lons.minmax    : -15.0 35.0
s5p.filter.lons.units     : degrees_east
! select above a minimum:
s5p.filter.lats.var       : PRODUCT/latitude
s5p.filter.lats.type      : minmax
s5p.filter.lats.minmax    : 35.0 70.0
s5p.filter.lats.units     : degrees_north
! select below a maximum:
s5p.filter.albedo.var     : Data Fields/SurfaceAlbedo
s5p.filter.albedo.type    : max
s5p.filter.albedo.max     : 0.3
s5p.filter.albedo.units   : 1
! select only values with data (no "_FillValue"):
s5p.filter.valid.var      : Data Fields/NO2RetrievalTroposphericVerticalColumn
s5p.filter.valid.type     : valid
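How such filter settings combine into a selection mask and a history list might look roughly as follows. This is a sketch under assumptions: the settings are held in a plain dictionary instead of an `RcFile` object, pixels are flattened into lists with `None` marking fill values, the thresholds are treated as inclusive, and the function name `select_pixels` as well as the short variable paths are hypothetical.

```python
def select_pixels(settings, rckey, data):
    """Return (selected, history) for the filters listed under '<rckey>.filters'."""
    names = settings[f"{rckey}.filters"].split()
    npix = len(next(iter(data.values())))
    selected = [True] * npix
    history = []
    for name in names:
        base = f"{rckey}.filter.{name}"
        values = data[settings[f"{base}.var"]]
        ftype = settings[f"{base}.type"]
        if ftype == "minmax":
            vmin, vmax = (float(x) for x in settings[f"{base}.minmax"].split())
            passed = [v is not None and vmin <= v <= vmax for v in values]
            history.append(f"selected '{name}' in [{vmin},{vmax}]")
        elif ftype == "max":
            vmax = float(settings[f"{base}.max"])
            passed = [v is not None and v <= vmax for v in values]
            history.append(f"selected '{name}' <= {vmax}")
        elif ftype == "valid":
            passed = [v is not None for v in values]
            history.append(f"selected valid '{name}'")
        else:
            raise ValueError(f"unsupported filter type '{ftype}'")
        # a pixel is kept only if it passes every filter
        selected = [s and p for s, p in zip(selected, passed)]
    return selected, history


settings = {
    "s5p.filters": "lons albedo valid",
    "s5p.filter.lons.var": "PRODUCT/longitude",
    "s5p.filter.lons.type": "minmax",
    "s5p.filter.lons.minmax": "-15.0 35.0",
    "s5p.filter.albedo.var": "albedo",
    "s5p.filter.albedo.type": "max",
    "s5p.filter.albedo.max": "0.3",
    "s5p.filter.valid.var": "vcd",
    "s5p.filter.valid.type": "valid",
}
data = {
    "PRODUCT/longitude": [-20.0, 5.0, 10.0, 30.0],
    "albedo": [0.1, 0.2, 0.5, 0.1],
    "vcd": [1e-5, 2e-5, 3e-5, None],
}
selected, history = select_pixels(settings, "s5p", data)
print(selected)  # [False, True, False, False]
```

Only the second pixel passes: the first lies outside the longitude range, the third exceeds the albedo maximum, and the fourth has no valid column.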
- class cso_s5p.CSO_S5p_File(filename=None)
Bases: cso_file.CSO_File
Storage for CSO satellite file filled from S5p data.
- AddSelection(sfile, selected, rcf, rcbase, indent='')
Add the selected S5p pixels to the satellite extract file.
Arguments:
- sfile: S5p_File object
- selected: boolean array as provided by the S5p_File.SelectPixels() method
- rcf: RcFile instance with settings
- rcbase: base of rcfile keys
The first setting that is read is a list of variable names to be created in the target file:
<rcbase>.output.vars : longitude corner_longitudes \
                       latitude corner_latitudes \
                       vcd ...
For each variable, a series of settings should be specified that describe what the variable should look like and how to create it.
The first setting is a list of dimension names that define the shape of the variable. Supported dimensions are:
- pixel: selected pixels
- corner: number of footprint bounds (probably 4)
- layer: number of layers in atmospheric profile (layers in kernel)
- layeri: number of layer interfaces in atmospheric profile (layer+1)
- retr: number of layers in retrieval product (1 for columns)
- retr0: same as retr, used for matrix dimensions (retr,retr0) to avoid repeated dimensions, about which the cf-checker complains
- track_scan: original scan index in 2D track
- track_pixel: original ground pixel in 2D track
For a 1D variable with values per pixel the dimension setting is therefore:
<rcbase>.output.var.longitude.dims : pixel
For most variables it is sufficient to provide only the name of the original variable from which the data should be read:
<rcbase>.output.var.longitude.from : Geolocation Fields/Longitude
For some variables special processing is needed. For these variables a key 'special' is used to enable the correct conversion. The following specials are currently implemented:
- track_longitude: longitudes at centers of original 2D track; requires a .from setting
- track_latitude: latitudes at centers of original 2D track; requires a .from setting
- track_longitude_bounds: longitude bounds at centers of original 2D track; requires a .from setting
- track_latitude_bounds: latitude bounds at centers of original 2D track; requires a .from setting
- ground_pixel: index of ground pixel in original 2D track; requires a .from setting
- sum: create a variable as the sum over layers; requires a .from setting
- square: create a variable as the square of the input; requires a .from setting
- time: create time stamps per pixel from a reference time and a time delta; requires settings:
    <rcbase>.output.var.time.tref : PRODUCT/time
    <rcbase>.output.var.time.dt   : PRODUCT/delta_time
- hybounds_to_pressure: form pressure from a hybrid sigma-pressure coordinate, where the available hybrid coefficients have shape ('layer',2); requires settings:
    <rcbase>.output.var.pressure.sp    : PRODUCT/SUPPORT_DATA/INPUT_DATA/surface_pressure
    <rcbase>.output.var.pressure.hyab  : PRODUCT/tm5_constant_a
    <rcbase>.output.var.pressure.hybb  : PRODUCT/tm5_constant_b
    <rcbase>.output.var.pressure.units : Pa
- hymid_to_pressure: form pressure from a hybrid sigma-pressure coordinate, where the available hybrid coefficients are valid for the middle of the layers and therefore have shape ('layer'); requires settings:
    <rcbase>.output.var.pressure.sp    : PRODUCT/SUPPORT_DATA/INPUT_DATA/surface_pressure
    <rcbase>.output.var.pressure.hyam  : PRODUCT/SUPPORT_DATA/INPUT_DATA/hyam
    <rcbase>.output.var.pressure.hybm  : PRODUCT/SUPPORT_DATA/INPUT_DATA/hybm
    <rcbase>.output.var.pressure.units : Pa
- sp_dp_to_pressure: form pressure from surface pressure and a constant pressure step per layer (top at zero); requires settings:
    <rcbase>.output.var.pressure.sp    : PRODUCT/SUPPORT_DATA/INPUT_DATA/surface_pressure
    <rcbase>.output.var.pressure.dp    : PRODUCT/SUPPORT_DATA/INPUT_DATA/pressure_interval
    <rcbase>.output.var.pressure.units : Pa
- kernel_trop: create the averaging kernel for the tropospheric column as the original kernel times the ratio between total air mass factor and tropospheric air mass factor:
    \[K_{tropo} ~=~ K~ AMF_{total}~/~AMF_{tropo}\]
  Required settings:
    <rcbase>.output.var.kernel.avk       : PRODUCT/averaging_kernel
    <rcbase>.output.var.kernel.amf       : PRODUCT/air_mass_factor_total
    <rcbase>.output.var.kernel.amft      : PRODUCT/air_mass_factor_troposphere
    <rcbase>.output.var.kernel.troplayer : PRODUCT/tm5_tropopause_layer_index
  The variable specified by troplayer is used to reset the higher layers of the kernel (stratosphere) to zero.
- kernel_by_dh: convert a kernel in m (per layer) to a unitless kernel by division by a constant layer height:
    \[A ~=~ A_m~/~dh\]
  The layer thickness dh is taken from a layers coordinate that defines the middle of a layer for a regular grid; it is checked that the coordinate defines a regular spacing. Required settings:
    <rcbase>.output.var.kernel.avk   : PRODUCT/averaging_kernel
    <rcbase>.output.var.kernel.layer : PRODUCT/layer
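The two kernel specials follow directly from their formulas and can be sketched for a single pixel. The function names mirror the specials but the code is hypothetical, not the module's implementation. It is assumed here that layers are ordered from the surface upward, that the tropopause index is zero-based and marks the highest tropospheric layer, and that a small relative tolerance is acceptable for the regular-spacing check.

```python
def kernel_trop(avk, amf, amft, troplayer):
    """Tropospheric kernel: K * AMF_total / AMF_tropo, zero above the tropopause.

    avk       : kernel values per layer, surface first (assumed ordering)
    troplayer : zero-based index of the highest tropospheric layer (assumed)
    """
    ratio = amf / amft
    return [k * ratio if i <= troplayer else 0.0 for i, k in enumerate(avk)]


def kernel_by_dh(avk_m, layer_mid, rtol=1.0e-6):
    """Unitless kernel: A_m / dh, with dh the constant spacing of the layer mids."""
    steps = [b - a for a, b in zip(layer_mid, layer_mid[1:])]
    dh = steps[0]  # requires at least two layers
    if any(abs(s - dh) > rtol * abs(dh) for s in steps):
        raise ValueError("layer coordinate is not regularly spaced")
    return [k / dh for k in avk_m]


print(kernel_trop([0.5, 1.0, 1.5, 2.0], amf=2.0, amft=1.0, troplayer=1))
# [1.0, 2.0, 0.0, 0.0]
print(kernel_by_dh([500.0, 1000.0], [250.0, 750.0]))
# [1.0, 2.0]
```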
Optionally provide target units too. In the (unlikely) case that the original variable has no units attribute, this setting is required to define the (assumed) units. If the provided units differ from the units in the original variable, the data is converted to the provided units:
<rcbase>.output.var.longitude.units : degrees_east
Optionally provide a dictionary with attributes to be added. If an attribute value is None, the attribute is removed if present in the input; this is sometimes needed if the CF compliance checker complains:
!~ skip some attributes, cf-checker complains ...
<rcbase>.output.var.qa_value.attrs : { 'valid_min' : None, 'valid_max' : None }
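The effect of such an attrs setting can be sketched with the standard library. The helper `apply_attrs` is hypothetical; it assumes that the setting value is a Python dictionary literal, as in the example, and parses it with `ast.literal_eval`.

```python
import ast


def apply_attrs(attrs, setting):
    """Return a copy of 'attrs' updated from an rcfile value holding a dict literal."""
    updates = ast.literal_eval(setting)
    result = dict(attrs)
    for key, value in updates.items():
        if value is None:
            # None means: remove the attribute if the input has it
            result.pop(key, None)
        else:
            result[key] = value
    return result


original = {"units": "1", "valid_min": 0, "valid_max": 100}
setting = "{ 'valid_min' : None, 'valid_max' : None }"
print(apply_attrs(original, setting))  # {'units': '1'}
```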
- class cso_s5p.CSO_S5p_Convert(rcfile, rcbase='', env={}, indent='')
Bases: utopya_rc.UtopyaRc
Convert raw S5p observations to CSO format. During conversion, a variable and pixel selection can be applied.
This version also downloads source files if they are not available yet, and (optionally) removes them after conversion. This is useful when storage capacity is limited and the entire archive of source files cannot be mirrored permanently.
Arguments:
- rcfile, rcbase, env: settings file, prefix for keys, and environment dictionary
A time range is read to select the files to be converted:
<rcbase>.timerange.start : 2018-06-01 00:00
<rcbase>.timerange.end   : 2018-06-03 23:59
The input files are looked up in a table created by an inquire class, for example CSO_SciHub_Inquire or CSO_PAL_Inquire. These classes have scanned the archives to determine which processings and versions are available, and have stored the result in a csv file. Specify the name of the csv file; this might contain templates for a date that is taken from another key:
! listing of available source files,
! created by 'inquire-s5phub' job:
<rcbase>.inquire.file : /data/Copernicus/S5p/Copernicus_S5P_NO2_%Y-%m-%d.csv
!! date used in filename, leave empty for today:
!<rcbase>.inquire.filedate : 2022-01-28
Specify the directory where the input files are to be searched, or where to download them to if not present yet. A flag is used to decide whether downloaded files should be removed immediately after conversion:
! target dir for downloads:
<rcbase>.input.dir : /data/Copernicus/S5P/%{processing}/NO2/%Y/%m
! remove downloaded input files after convert?
<rcbase>.downloads.cleanup : False
The input files keep the same name as used in the SciHub archive, for example:
/data/Copernicus/S5P/OFFL/NO2/2018/07/S5P_OFFL_L2__NO2____20180701T005930_20180701T024100_03698_01_010002_20180707T022838.nc
The name includes the start time (20180701T005930), the end time (20180701T024100), and the orbit number (03698).
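The fields in such a filename can be extracted with a few lines of standard-library code. The helper `parse_s5p_filename` is hypothetical and assumes the fixed layout of the example names in this section: a 20-character product prefix followed by underscore-separated fields. Reading the fifth field as the processor version is an assumption based on the 020301 value that appears both in the processor version setting and in the PAL filename used in this section.

```python
import os
from datetime import datetime


def parse_s5p_filename(path):
    """Extract processing, time range, orbit, and processor version from a name."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # skip the 20-character prefix, e.g. 'S5P_OFFL_L2__NO2____'
    fields = stem[20:].split("_")
    fmt = "%Y%m%dT%H%M%S"
    return {
        "processing": stem[4:8].rstrip("_"),
        "start_time": datetime.strptime(fields[0], fmt),
        "end_time": datetime.strptime(fields[1], fmt),
        "orbit": fields[2],
        "processor_version": fields[4],  # assumed meaning of this field
    }


info = parse_s5p_filename(
    "/data/Copernicus/S5P/OFFL/NO2/2018/07/"
    "S5P_OFFL_L2__NO2____20180701T005930_20180701T024100_03698_01_010002_20180707T022838.nc"
)
print(info["processing"], info["orbit"], info["start_time"])
```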
Specify which processings should be converted; other processings listed in the inquired table are ignored:
! list of processings:
<rcbase>.processings : OFFL RPRO
Similarly, specify a list of processor versions:
! list of processor versions, empty for all:
<rcbase>.processor_versions : 020301
Sometimes a file cannot be converted, for example because it is corrupted or could not be downloaded at all. Specify an (optional) blacklist:
! skip some input files:
<rcbase>.blacklist : S5P_PAL__L2__NO2____20190806T022006_20190806T040136_09388_01_020301_20211110T020511.nc
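Taken together, the processing list, the processor versions, and the blacklist act as three successive tests on each record of the inquire table. A minimal sketch, assuming records are dictionaries with the hypothetical keys `filename`, `processing`, and `processor_version` (the actual column names of the csv file are not given here):

```python
def keep_record(record, processings, versions, blacklist):
    """Return True if a record of the inquire table should be converted."""
    if record["processing"] not in processings:
        return False
    # an empty version list means: accept all versions
    if versions and record["processor_version"] not in versions:
        return False
    return record["filename"] not in blacklist


# hypothetical records with shortened file names
records = [
    {"filename": "a.nc", "processing": "OFFL", "processor_version": "020301"},
    {"filename": "b.nc", "processing": "NRTI", "processor_version": "020301"},
    {"filename": "c.nc", "processing": "RPRO", "processor_version": "010202"},
    {"filename": "d.nc", "processing": "RPRO", "processor_version": "020301"},
]
kept = [r["filename"] for r in records
        if keep_record(r, ["OFFL", "RPRO"], ["020301"], ["d.nc"])]
print(kept)  # ['a.nc']
```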
If an input file should be converted, it is read into an S5p_File object. The SelectPixels method is called to select pixels based on criteria defined in the settings; see its documentation for how to configure the pixel selection. This method also returns a history line that describes the selection, which will be added as an attribute to the output file.
The output file is created as a CSO_S5p_File object. Its AddSelection method is called with the input object as argument, and this copies the selected pixels for the variables specified in the settings. The Write method creates the file.
The name of the output files should be specified with a directory and filename; the latter could include a template for the orbit number:
! output filename:
! - times are taken from mid of selection, rounded to hours
! - replace templates with column values of listing, for example:
!   %{orbit}, %{processing}, ...
<rcbase>.output.filename : /Scratch/CSO/S5p/RPRO/NO2/Europe/%Y/%m/S5p_RPRO_NO2_%{orbit}.nc
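Expanding such a filename template involves two kinds of placeholders: `%{...}` keys filled from the listing columns, and `strftime` codes filled from the mid time of the selection. The helper below is a hypothetical sketch; rounding to the nearest hour is an assumption, since the settings only state that times are rounded to hours.

```python
import re
from datetime import datetime, timedelta


def expand_filename(template, tmid, record):
    """Fill '%{key}' templates from 'record', then time templates from 'tmid'."""
    # round the mid time to the nearest full hour (assumed rounding rule)
    t = (tmid + timedelta(minutes=30)).replace(minute=0, second=0, microsecond=0)
    # replace '%{key}' first, since strftime does not know this syntax
    name = re.sub(r"%\{(\w+)\}", lambda m: str(record[m.group(1)]), template)
    return t.strftime(name)


template = "/Scratch/CSO/S5p/RPRO/NO2/Europe/%Y/%m/S5p_RPRO_NO2_%{orbit}.nc"
print(expand_filename(template, datetime(2018, 6, 1, 1, 34, 30), {"orbit": "03272"}))
# /Scratch/CSO/S5p/RPRO/NO2/Europe/2018/06/S5p_RPRO_NO2_03272.nc
```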
A flag is read to decide if existing output files should be renewed or kept:
<rcbase>.renew : True
Global attributes for the target file should be specified with:
! global attributes:
<rcbase>.output.attrs : format Conventions author institution email
!
<rcbase>.output.attr.format      : 1.0
<rcbase>.output.attr.Conventions : CF-1.7
<rcbase>.output.attr.author      : T. Emplate
<rcbase>.output.attr.institution : CSO
<rcbase>.output.attr.email       : t.emplate@cso.org
For testing, an (optional) whitelist can be provided with output filenames (no path); if defined, only the listed files will be created:
! testing: create only these files:
<rcbase>.whitelist : S5p_RPRO_NO2_123456.nc
- class cso_s5p.CSO_S5p_Listing(rcfile, rcbase='', env={}, indent='')
Bases: utopya_rc.UtopyaRc
Create a listing file for converted orbit files.
A listing file contains the names of the converted orbit files, and the time range of the pixels in each file:
filename                     ;start_time                   ;end_time                     ;orbit
2018/06/S5p_RPRO_NO2_03272.nc;2018-06-01T01:32:46.673000000;2018-06-01T01:36:12.948000000;03272
2018/06/S5p_RPRO_NO2_03273.nc;2018-06-01T03:12:53.649000000;2018-06-01T03:17:43.082000000;03273
2018/06/S5p_RPRO_NO2_03274.nc;2018-06-01T04:52:43.586000000;2018-06-01T04:59:12.377000000;03274
:
This file is used by the observation operator to select orbits with pixels valid for a desired time range.
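How a reader of the listing could select orbits for a time range is sketched below with the standard library only. The functions `read_listing` and `select_orbits` are hypothetical and do not represent the observation operator; time stamps are truncated to microseconds because `datetime` does not support nanoseconds.

```python
import csv
import io
from datetime import datetime

LISTING = """filename;start_time;end_time;orbit
2018/06/S5p_RPRO_NO2_03272.nc;2018-06-01T01:32:46.673000000;2018-06-01T01:36:12.948000000;03272
2018/06/S5p_RPRO_NO2_03273.nc;2018-06-01T03:12:53.649000000;2018-06-01T03:17:43.082000000;03273
2018/06/S5p_RPRO_NO2_03274.nc;2018-06-01T04:52:43.586000000;2018-06-01T04:59:12.377000000;03274
"""


def parse_time(text):
    """Parse a time stamp, keeping only the first 6 fractional digits."""
    return datetime.strptime(text.strip()[:26], "%Y-%m-%dT%H:%M:%S.%f")


def read_listing(text):
    """Read the ';'-separated listing into a list of dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = [h.strip() for h in next(reader)]
    records = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        rec = dict(zip(header, (v.strip() for v in row)))
        rec["start_time"] = parse_time(rec["start_time"])
        rec["end_time"] = parse_time(rec["end_time"])
        records.append(rec)
    return records


def select_orbits(records, t1, t2):
    """Return filenames whose pixel time range overlaps with [t1,t2]."""
    return [r["filename"] for r in records
            if r["start_time"] <= t2 and r["end_time"] >= t1]


records = read_listing(LISTING)
print(select_orbits(records, datetime(2018, 6, 1, 3, 0), datetime(2018, 6, 1, 4, 0)))
# ['2018/06/S5p_RPRO_NO2_03273.nc']
```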
In the settings, define the name of the file to be created:
! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
<rcbase>.file : /Scratch/CSO/S5p/RPRO/NO2/Europe/listing.csv
An existing listing file is not replaced, unless the following flag is set:
! renew table?
<rcbase>.renew : True
Orbit files are searched within a time range:
<rcbase>.timerange.start : 2018-06-01 00:00
<rcbase>.timerange.end   : 2018-06-03 23:59
Specify filename filters to search for orbit files; the patterns are relative to the base directory of the listing file, and might contain templates for the time values. Multiple patterns can be defined; if more than one file is found for a certain orbit number, the first match is used. This can be used to create a listing that combines reprocessed data with near-real-time data:
<rcbase>.patterns : RPRO/NO2/Europe/%Y/%m/S5p_*.nc OFFL/NO2/Europe/%Y/%m/S5p_*.nc
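The first-match rule can be sketched on a list of candidate names, without touching the file system. The helper `first_match_per_orbit` is hypothetical, and for simplicity it takes the orbit number from the end of the filename rather than from the global attributes of the file.

```python
import fnmatch
import os
from datetime import datetime


def first_match_per_orbit(patterns, filenames, t):
    """Map orbit number to the file found by the earliest matching pattern."""
    chosen = {}
    for pattern in patterns:
        # fill the time templates, keep the '*' wildcard for matching
        expanded = t.strftime(pattern)
        for fname in sorted(fnmatch.filter(filenames, expanded)):
            orbit = os.path.splitext(fname)[0].rsplit("_", 1)[-1]
            # keep the file from the first pattern that provided this orbit
            chosen.setdefault(orbit, fname)
    return chosen


patterns = ["RPRO/NO2/Europe/%Y/%m/S5p_*.nc", "OFFL/NO2/Europe/%Y/%m/S5p_*.nc"]
files = [
    "OFFL/NO2/Europe/2018/06/S5p_OFFL_NO2_03272.nc",
    "OFFL/NO2/Europe/2018/06/S5p_OFFL_NO2_03273.nc",
    "RPRO/NO2/Europe/2018/06/S5p_RPRO_NO2_03272.nc",
]
result = first_match_per_orbit(patterns, files, datetime(2018, 6, 1))
print(result["03272"])  # RPRO/NO2/Europe/2018/06/S5p_RPRO_NO2_03272.nc
print(result["03273"])  # OFFL/NO2/Europe/2018/06/S5p_OFFL_NO2_03273.nc
```

Orbit 03272 is available from both patterns, so the reprocessed file wins because its pattern is listed first; orbit 03273 falls back to the second pattern.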
Usually the time range is read from the file, but if the file does not have a time coordinate, the following flag can be used to force the use of the time that matches the filename:
! adhoc: use the time for which file is valid as timerange;
! this is used for the synthetic S4 data that have no time record ...
<rcbase>.use_t : True
The orbit column in the listing is an extra column that is read from the global attributes; the list of extra columns is defined with:
! extra columns to be added, read from global attributes:
<rcbase>.xcolumns : orbit