cso_gridded module

The cso_gridded module provides classes to average pixels over a regular grid.

Class hierarchy

The classes are defined according to the following hierarchy:

  • UtopyaRc

    • CSO_GriddedAverage

    • CSO_GriddedAverageMeans

Classes

class cso_gridded.CSO_GriddedAverage(rcfile, rcbase='', env={}, indent='')

Bases: UtopyaRc

Average pixels over regular grid using rcfile settings.

The value assigned to a cell is a weighted average of the pixel values, where the weight of a pixel is the area of the part of its footprint that overlaps with the cell. Since a cell might be only partly covered by pixels, the normalization uses the total overlapping area rather than the cell area:

\[x(i_k,j_k) ~=~ \left(\ \sum\limits_{p\in P_k} y_p\ w_{p,k}\ \right)\ /\ \sum\limits_{p\in P_k}\ w_{p,k}\]

where:

  • \(i_k,j_k\) are the indices of grid cell \(k\);

  • \(P_k\) is the set of pixels that overlap with cell \(k\);

  • \(y_p\) is the data value of pixel \(p\);

  • \(w_p\) is the footprint area [m2] of pixel \(p\);

  • \(w_{p,k}\) is the area [m2] of pixel \(p\) that overlaps with the cell \(k\).

The overlapping area is computed using the LonLatPolygonCentroids method. This method splits a footprint into a large number of triangles and returns the centroid and the area of each triangle. The centroids are collected per grid cell, and the sum of the associated triangle areas is then used as a measure of the overlapping area.
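
The accumulation described above could be sketched as follows. This is a minimal illustration in NumPy, not the module's implementation: the tiny 2 x 2 grid, the centroid arrays, and all variable names are hypothetical, and the centroids are assumed to lie inside the grid.

```python
import numpy as np

# Hypothetical small grid (not taken from the settings): 2 x 2 cells of 1 degree.
west, south, dlon, dlat = 0.0, 0.0, 1.0, 1.0
nlon, nlat = 2, 2

# One entry per triangle: centroid location, triangle area [m2],
# and the data value y_p of the pixel that the triangle belongs to.
clon = np.array([0.25, 0.75, 1.25, 1.25])
clat = np.array([0.50, 0.50, 0.50, 1.50])
area = np.array([1.0, 1.0, 2.0, 4.0])
yval = np.array([10.0, 20.0, 20.0, 30.0])

# Cell indices of each centroid (assumes all centroids are inside the grid).
i = np.floor((clon - west) / dlon).astype(int)
j = np.floor((clat - south) / dlat).astype(int)

# Accumulate sum(y_p * w_pk) and sum(w_pk) per cell; np.add.at handles
# repeated indices correctly, unlike plain fancy-index assignment.
sum_yw = np.zeros((nlat, nlon))
sum_w = np.zeros((nlat, nlon))
np.add.at(sum_yw, (j, i), yval * area)
np.add.at(sum_w, (j, i), area)

# Normalize by the total overlapping area; cells without centroids become NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    x = np.where(sum_w > 0, sum_yw / sum_w, np.nan)
```

In this sketch the first cell receives two triangles with values 10 and 20 and equal areas, so its average is 15; the cell without any centroid remains undefined.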

The target grid is regular longitude/latitude and defined using the following settings:

! domain definition:
<rcbase>.west              :  40.0
<rcbase>.east              :  60.0
<rcbase>.south             :  20.0
<rcbase>.north             :  35.0
! grid resolution:
<rcbase>.dlon              :  0.1
<rcbase>.dlat              :  0.1
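
As an illustration of what these settings imply, the grid dimensions and cell centers could be derived as below. This is a sketch only; the variable names are hypothetical, and whether the module stores cell centers or cell edges is not stated here.

```python
import numpy as np

# Values from the example settings above.
west, east, south, north = 40.0, 60.0, 20.0, 35.0
dlon, dlat = 0.1, 0.1

# Number of cells; round() guards against floating point noise in the division.
nlon = int(round((east - west) / dlon))
nlat = int(round((north - south) / dlat))

# Cell centers (assumption: cells are addressed by their mid points).
lons = west + (np.arange(nlon) + 0.5) * dlon
lats = south + (np.arange(nlat) + 0.5) * dlat
```

With the example settings this gives a grid of 200 x 150 cells.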

The number of levels used for the recursive splitting of a footprint into triangles, and thereby the number of centroids that are assigned to grid cells, is defined with:

! for 4-corner footprints, number of centroids is:
!  1 (levels=0), 4 (1), 8 (2), 16 (3), 64 (5), 256 (7)
<rcbase>.mapping.levels         :  7
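
One splitting scheme that reproduces the centroid counts listed in the comment above is sketched below. It is an assumption for illustration, not the LonLatPolygonCentroids implementation: it works in planar coordinates, assumes a convex footprint, uses the mean of the corners as center point, and bisects each triangle along its longest edge. The function names are hypothetical.

```python
import math

def polygon_area(corners):
    """Planar area of a simple polygon (shoelace formula)."""
    n = len(corners)
    s = 0.0
    for k in range(n):
        x0, y0 = corners[k]
        x1, y1 = corners[(k + 1) % n]
        s += x0 * y1 - x1 * y0
    return 0.5 * abs(s)

def split_longest_edge(tri):
    """Bisect a triangle at the midpoint of its longest edge."""
    a, b, c = tri
    # candidate (edge start, edge end, opposite vertex) triples
    candidates = [(a, b, c), (b, c, a), (c, a, b)]
    p, q, r = max(candidates, key=lambda e: math.dist(e[0], e[1]))
    m = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    return [(p, m, r), (m, q, r)]

def polygon_centroids(corners, levels):
    """Return (centroids, areas) for a convex polygon in planar coordinates."""
    corners = [tuple(map(float, c)) for c in corners]
    n = len(corners)
    # mean of the corners; equals the true centroid only for e.g. parallelograms
    center = (sum(x for x, _ in corners) / n, sum(y for _, y in corners) / n)
    if levels == 0:
        return [center], [polygon_area(corners)]
    # level 1: fan of triangles around the center point
    tris = [(center, corners[k], corners[(k + 1) % n]) for k in range(n)]
    # higher levels: bisect every triangle once per level
    for _ in range(levels - 1):
        tris = [half for tri in tris for half in split_longest_edge(tri)]
    centroids = [tuple(sum(v[d] for v in tri) / 3 for d in (0, 1)) for tri in tris]
    areas = [polygon_area(tri) for tri in tris]
    return centroids, areas

# Unit square as a hypothetical 4-corner footprint.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
centroids, areas = polygon_centroids(square, 3)
```

Because every bisection preserves the total area, the triangle areas always sum to the footprint area, whatever the number of levels.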

A gridded average is created for each orbit within a time range. Specify the time range in the settings using:

<rcbase>.timerange.start        :  2018-06-01 00:00
<rcbase>.timerange.end          :  2018-06-03 23:59

Also specify the time step frequency that is used within the range for a temporal mean:

! step is one of: hour | day | month
<rcbase>.timerange.step   :  hour

Input data can originate from different sources, for example a data file with footprints and retrievals, and a state file with simulations by a model. Specify a list of keywords describing the sources; the first source should be a file that contains the footprints:

! keywords for source files:
<rcbase>.sources                :  data state

For each source, specify the input file(s) that should be used to create an output file. There are two options supported:

  • Specify the name of a listing file:

    ! listing file with input files:
    <rcbase>.source.data.listing  :  /scratch/CSO-data/S5p/listing-NO2-CAMS.csv
    

    The primary filenames in the listing are used for the first source (here “data”); other sources are expected to be defined in columns whose name includes the source name (“filename_state”).

    At each time step, the orbits are selected that have their ‘middle’ time (halfway between the start_time and end_time values of a record) within one of the following intervals, depending on the time step:

    • the past hour (the output file will have the time stamp of the end of this interval)

    • current day

    • current month

  • Specify a filename pattern for each source:

    ! filename pattern for source "data":
    <rcbase>.source.data.filenames   :  /scratch/CSO-data/S5p/RPRO/NO2/CAMS/%Y/%m/S5p_RPRO_NO2_%Y%m%d_%H%M_data.nc
    <rcbase>.source.state.filenames  :  /scratch/CSO-data/S5p/RPRO/NO2/CAMS/%Y/%m/S5p_RPRO_NO2_%Y%m%d_%H%M_state.nc
    

    This is typically used when the files have time stamps in their names.
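
The selection of orbits on their ‘middle’ time, as described for the listing file option, could look like the sketch below. The records, the function name, and the treatment of the interval boundaries (start included, end excluded) are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical listing records: (filename, start_time, end_time).
records = [
    ("orbit_a.nc", datetime(2018, 6, 1, 9, 40), datetime(2018, 6, 1, 10, 30)),
    ("orbit_b.nc", datetime(2018, 6, 1, 10, 50), datetime(2018, 6, 1, 11, 40)),
]

def select_orbits(records, t, step):
    """Select files whose middle time falls in the interval that belongs to t."""
    if step == "hour":
        # the past hour; t is the end of the interval
        t0, t1 = t - timedelta(hours=1), t
    elif step == "day":
        # current day
        t0 = datetime(t.year, t.month, t.day)
        t1 = t0 + timedelta(days=1)
    elif step == "month":
        # current month; December rolls over to January of the next year
        t0 = datetime(t.year, t.month, 1)
        t1 = datetime(t.year + t.month // 12, t.month % 12 + 1, 1)
    else:
        raise ValueError(f"unsupported step: {step}")
    selected = []
    for fname, start, end in records:
        tmid = start + (end - start) / 2
        # assumption: interval start is included, interval end is excluded
        if t0 <= tmid < t1:
            selected.append(fname)
    return selected

hourly = select_orbits(records, datetime(2018, 6, 1, 11, 0), "hour")
daily = select_orbits(records, datetime(2018, 6, 1, 0, 0), "day")
```

Here the first orbit has its middle time at 10:05 and is therefore assigned to the hourly step ending at 11:00, while the daily step selects both orbits.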

The input can be filtered by applying selections, for example on a quality value; see the CSO_File.SelectPixels() method for the supported filter types:

! keywords for filters:
<rcbase>.filters                :  quality

! minimum quality value required:
<rcbase>.filter.quality.var       :  qa_value
<rcbase>.filter.quality.type      :  min
<rcbase>.filter.quality.min       :  0.8
<rcbase>.filter.quality.units     :  1
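
The effect of a filter of type “min” could be pictured as a boolean mask over the pixels, as in the sketch below. This is not the CSO_File.SelectPixels() implementation; the pixel values are hypothetical, and whether the threshold itself is included is an assumption.

```python
import numpy as np

# Hypothetical quality values of five pixels; the name follows the settings above.
qa_value = np.array([0.3, 0.8, 0.95, 0.79, 1.0])
minimum = 0.8

# "min" filter: keep pixels with a value at or above the minimum
# (assumption: the threshold itself is accepted).
selected = qa_value >= minimum

# Indices of the pixels that pass the filter.
pixel_indices = np.nonzero(selected)[0]
```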

Specify the output collection frequency, i.e. the time range that should be included in a target file; note that this should match the time templates in the output filename:

! output frequency:
!   hourly (default) | daily | monthly | yearly
<rcbase>.output.freq            :   hourly

The class can create one or more output files. First specify a list of keywords to denote the output files, for example separate files for observations and for simulations:

! keywords for output files:
<rcbase>.output.files            :  obs sim

For each output keyword, specify the name of the target file; it can be useful to include a description of the target grid and the applied filters in the path:

! target file, might contain templates:
!   %Y,%m,etc     : time values
<rcbase>.output.obs.file        :   /scratch/CSO-data-gridded/S5p/RPRO/NO2/CAMS_r01x01_qa08/%Y/%m/S5p_RPRO_NO2_%Y%m%d_%H%M_gridded.nc
<rcbase>.output.sim.file        :   /scratch/CSO-data-gridded/S5p/RPRO/NO2/CAMS_r01x01_qa08/%Y/%m/MODEL_RPRO_NO2_%Y%m%d_%H%M_gridded.nc
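
The time templates in these filenames have the form of strftime codes. Assuming they are expanded in that way (the module's own expansion routine is not shown here), a target filename for a given time could be evaluated as:

```python
from datetime import datetime

# Template from the example settings above.
template = (
    "/scratch/CSO-data-gridded/S5p/RPRO/NO2/CAMS_r01x01_qa08/"
    "%Y/%m/S5p_RPRO_NO2_%Y%m%d_%H%M_gridded.nc"
)

# Hypothetical time step within the example time range.
t = datetime(2018, 6, 1, 11, 0)

# Replace the %-codes by the time values.
filename = t.strftime(template)
```

With an hourly output frequency, every time step then maps to its own file; a daily frequency would instead call for a template without %H%M.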

Specify a list of the variables to be created, and for each of them the source type and source variable:

! data variables to be created:
<rcbase>.output.obs.vars        :  xvmr
<rcbase>.output.sim.vars        :  xvmr

! input variables:
!   data:vcd    : from data file
!   state:hx    : from state file
<rcbase>.output.obs.xvmr.source     :  data:xvmr
<rcbase>.output.sim.xvmr.source     :  state:xvmr

Retrieval variables by default have a dimension “retr”, which allows for profiles but often has length 1. If this extra dimension has length 1, it can be removed from the output using the following flag:

! remove last dimension ("retr") if equal to 1?
<rcbase>.output.squeeze           :  True
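
In array terms, the squeeze flag corresponds to dropping a trailing dimension of length 1, as in this small sketch with a hypothetical (lat, lon, retr) array:

```python
import numpy as np

# Hypothetical gridded variable with dimensions (lat, lon, retr).
values = np.zeros((150, 200, 1))

# Remove the last dimension only if it has length 1.
if values.shape[-1] == 1:
    values = values[..., 0]
```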

To reduce file size, by default all variables are written to file as short integers (2 bytes) accompanied by add_offset and scale_factor attributes. A flag is available to disable packing. In addition, zlib compression is enabled. The default compression level is 1 (out of 9); set the following flag to a higher level for stronger compression (at the expense of computation time), or set it to 0 to disable compression:

! pack floats as shorts:
<rcbase>.output.packed              :  True
! zlib compression level (default 1, 0 for no compression):
<rcbase>.output.complevel           :  1
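
The packing into short integers follows the usual add_offset/scale_factor convention, in which the unpacked value is packed * scale_factor + add_offset. The sketch below shows one common way to choose these attributes; the function name and the choice to leave -32768 free for a fill value are assumptions, not necessarily what the module does.

```python
import numpy as np

def pack_int16(values):
    """Pack finite float values as int16 with scale_factor and add_offset."""
    vmin = float(np.min(values))
    vmax = float(np.max(values))
    # Map [vmin, vmax] onto [-32767, 32767]; -32768 stays free for a fill value.
    scale_factor = (vmax - vmin) / (2 * 32767) if vmax > vmin else 1.0
    add_offset = 0.5 * (vmin + vmax)
    packed = np.round((values - add_offset) / scale_factor).astype(np.int16)
    return packed, scale_factor, add_offset

# Hypothetical data values.
values = np.array([0.0, 1.5e-5, 3.0e-5, 1.2e-4])
packed, scale_factor, add_offset = pack_int16(values)

# Unpacking as a reader of the file would do it.
unpacked = packed * scale_factor + add_offset
```

The price of packing is a rounding error of at most half a scale_factor per value, which is why a flag to disable it can be useful for variables with a large dynamic range.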

Existing files are replaced if the following flag is set:

! renew existing files?
<rcbase>.renew                  :  True

class cso_gridded.CSO_GriddedAverageMeans(rcfile, rcbase='', env={}, indent='')

Bases: UtopyaRc

Create files with temporal means over gridded averages.

The gridded averages are typically created using the CSO_GriddedAverage class, and stored in files:

2020/06/CSO_gridded_20200601_1100_S5p-no2.nc
        CSO_gridded_20200601_1300_S5p-no2.nc
        :

Temporal means are computed per grid cell over the time series, weighted using the pixel_area value that holds the area of a grid cell covered by the original pixel footprints.
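
The area-weighted temporal mean could be sketched as follows. The arrays are hypothetical stand-ins for the gridded averages and the pixel_area values read from the input files, and the handling of uncovered cells (zero weight, NaN result if never covered) is an assumption.

```python
import numpy as np

# Hypothetical stack of 3 time records on a grid of 1 x 2 cells.
x = np.array([[[1.0, 4.0]], [[3.0, np.nan]], [[5.0, 8.0]]])        # gridded averages
pixel_area = np.array([[[1.0, 2.0]], [[3.0, 0.0]], [[0.0, 2.0]]])  # covered area [m2]

# Records that do not cover a cell get zero weight.
w = np.where(np.isfinite(x), pixel_area, 0.0)

# Weighted sums over the time axis; undefined values are masked out first.
sum_xw = np.sum(np.where(w > 0, x, 0.0) * w, axis=0)
sum_w = np.sum(w, axis=0)

# Cells that are never covered remain undefined.
with np.errstate(invalid="ignore", divide="ignore"):
    mean = np.where(sum_w > 0, sum_xw / sum_w, np.nan)
```

Weighting by the covered area means that a record in which a cell is only marginally covered contributes less to the mean than a record with full coverage.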

In the settings, first specify a time range over which temporal means should be computed:

<rcbase>.timerange.start        :  2020-01-01 00:00
<rcbase>.timerange.end          :  2020-12-31 23:59

Temporal means can be computed over different resolutions. Specify a list of resolutions to be used:

! time resolutions: daily, monthly, yearly
<rcbase>.resolutions                     :  daily monthly yearly

For each of these resolutions, specify a filename pattern for the input files and a template for the output file. Both should contain time templates that are evaluated for the target time of a temporal mean, for example %Y%m%d for a specific day. The input pattern should use * or ? wildcards to match the files at the finer time resolution. For example, for the daily resolution, specify an input pattern that matches all hours in a day, and an output template for a daily file:

! daily means from hourly files:
<rcbase>.resolution.daily.input.file     :  /work/gridded/europe/%Y/%m/CSO_gridded_%Y%m%d_????_S5p-glyox.nc
<rcbase>.resolution.daily.output.file    :  /work/gridded/europe/daily/CSO_gridded_%Y%m%d_S5p-glyox.nc

Similar specifications should be provided for the monthly and yearly resolutions. To speed up calculations, the monthly means could use the daily means as input:

! monthly means from daily files:
<rcbase>.resolution.monthly.input.file   :  /work/gridded/europe/daily/CSO_gridded_%Y%m??_S5p-glyox.nc
<rcbase>.resolution.monthly.output.file  :  /work/gridded/europe/monthly/CSO_gridded_%Y%m_S5p-glyox.nc

Similarly, the yearly means could use the monthly means as input:

! yearly means from monthly files:
<rcbase>.resolution.yearly.input.file    :  /work/gridded/europe/monthly/CSO_gridded_%Y??_S5p-glyox.nc
<rcbase>.resolution.yearly.output.file   :  /work/gridded/europe/yearly/CSO_gridded_%Y_S5p-glyox.nc
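
How an input pattern combines time templates with wildcards can be illustrated with the daily pattern from the settings above. The sketch assumes strftime-style expansion followed by shell-style matching; the list of available files is hypothetical, and fnmatch is used instead of a directory search so that the example runs without any files on disk.

```python
from datetime import datetime
import fnmatch

# Daily input pattern from the example settings above.
pattern = "/work/gridded/europe/%Y/%m/CSO_gridded_%Y%m%d_????_S5p-glyox.nc"

# Fill in the time values for the target day; the ? wildcards are left untouched.
day = datetime(2020, 6, 1)
day_pattern = day.strftime(pattern)

# Hypothetical hourly files.
available = [
    "/work/gridded/europe/2020/06/CSO_gridded_20200601_1100_S5p-glyox.nc",
    "/work/gridded/europe/2020/06/CSO_gridded_20200601_1300_S5p-glyox.nc",
    "/work/gridded/europe/2020/06/CSO_gridded_20200602_1100_S5p-glyox.nc",
]

# Files that contribute to the daily mean of the target day.
matches = fnmatch.filter(available, day_pattern)
```

On an actual file system, glob.glob(day_pattern) from the standard library would return the matching files directly.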

To reduce file size, by default all variables are written to file as short integers (2 bytes) accompanied by add_offset and scale_factor attributes. A flag is available to disable packing. In addition, zlib compression is enabled. The default compression level is 1 (out of 9); set the following flag to a higher level for stronger compression (at the expense of computation time), or set it to 0 to disable compression:

! pack floats as shorts:
<rcbase>.output.packed              :  True
! zlib compression level (default 1, 0 for no compression):
<rcbase>.output.complevel           :  1

Existing files are replaced if the following flag is set:

! renew existing files?
<rcbase>.renew                  :  True