......@@ -87,6 +87,10 @@ A summary of the versions and changes.
| Reformatted using 'black'.
| *(2023-08)*
* | *v2.5*
| Support new Copernicus *DataSpace* portal to download Sentinel data.
| *(2023-11)*
To be included
==============
......
.. Documentation for module.
.. Import documentation from ".py" file:
.. automodule:: cso_scihub
.. automodule:: cso_dataspace
......@@ -27,8 +27,8 @@ Classes used for specific tasks are implemented in the ``cso_*`` modules.
:maxdepth: 1
pymod-cso_inquire
pymod-cso_dataspace
pymod-cso_pal
pymod-cso_scihub
pymod-cso_s5p
pymod-cso_file
pymod-cso_gridded
......
......@@ -104,6 +104,36 @@ Note that the official S5p filename formatting rules require exactly 10 characte
in the current product a 12-character key ``L2__CHOCHO__`` is used.
CSO processing
==============
*(See* :ref:`tutorial` *chapter for introduction to CSO scripts and configuration)*
An example configuration of the CSO processing of the S5p/CHOCHO data is available via
the following settings:
* `config/Copernicus/cso.rc <../../../config/Copernicus/cso.rc>`_
Top-level settings that configure the job-tree with various sub-tasks.
This is a generic file that could be used for multiple S5 products;
edit it to select the CHOCHO processing.
* `config/Copernicus/cso-user-settings.rc <../../../config/Copernicus/cso-user-settings.rc>`_
User-specific settings such as the work directory.
* `config/Copernicus/cso-s5p-chocho.rc <../../../config/Copernicus/cso-s5p-chocho.rc>`_
Specific settings for CHOCHO product.
Start the job-tree using::
./bin/cso config/Copernicus/cso.rc
Selected sub-steps in the processing are described below.
Inquire archives
================
......
......@@ -60,14 +60,46 @@ Notes:
The recommended minimum is 0.5; this excludes cloudy scenes and other problematic retrievals.
.. Label between '.. _' and ':' ; use :ref:`text <label>` for reference
CSO processing
==============
*(See* :ref:`tutorial` *chapter for introduction to CSO scripts and configuration)*
An example configuration of the CSO processing of the S5p/CO data is available via
the following settings:
* `config/Copernicus/cso.rc <../../../config/Copernicus/cso.rc>`_
Top-level settings that configure the job-tree with various sub-tasks.
This is a generic file that could be used for multiple S5 products;
edit it to select the CO processing.
* `config/Copernicus/cso-user-settings.rc <../../../config/Copernicus/cso-user-settings.rc>`_
User-specific settings such as the work directory.
* `config/Copernicus/cso-s5p-co.rc <../../../config/Copernicus/cso-s5p-co.rc>`_
Specific settings for CO product.
Start the job-tree using::
./bin/cso config/Copernicus/cso.rc
Selected sub-steps in the processing are described below.
.. _s5p-co-inquire:
Inquire Sentinel-5p/CO archive
================================
S5p/CO observations are available from the
`Copernicus Open Access Hub <https://scihub.copernicus.eu/>`_;
see the :ref:`cso-scihub` module for a detailed description.
`Copernicus DataSpace <https://dataspace.copernicus.eu/>`_;
see the :ref:`cso-dataspace` module for a detailed description.
Data is available for different processing streams, each identified by a 4-character key:
......@@ -80,8 +112,8 @@ but with different processor versions.
It is therefore necessary to first inquire both archives to see which data is available where,
and what the version numbers are.
The :py:class:`cso_scihub.CSO_SciHub_Inquire` class is available to inquire the
*Copernicus Open Access Hub*. The settings used by this class allow selection
The :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>` class is available to inquire the
*Copernicus DataSpace*. The settings used by this class allow selection
on, for example, time range and intersection area.
The result is a csv file with columns for keywords such as orbit number and processor version,
as well as the filename of the data and the url that should be used to actually download the data::
......@@ -105,17 +137,19 @@ To visualize what is available from the various portals, the
The jobtree configuration to inquire the portals and create the overview figure could look like::
! single step:
cso.s5p.co.inquire-scihub.class : utopya.UtopyaJobStep
cso.s5p.co.inquire.class : utopya.UtopyaJobStep
! two tasks:
cso.s5p.co.inquire-scihub.tasks : table
!~ inquire available files:
cso.s5p.co.inquire-scihub.table.class : cso.CSO_SciHub_Inquire
cso.s5p.co.inquire-scihub.table.args : '${PWD}/config/ESA-S5p/cso-s5p-co.rc', \
rcbase='cso.s5p.co.inquire-s5phub-table'
cso.s5p.co.inquire.tasks : table-dataspace plot
!~ inquire files available on DataSpace:
cso.s5p.co.inquire.table-dataspace.class : cso.CSO_DataSpace_Inquire
cso.s5p.co.inquire.table-dataspace.args : '${PWD}/config/Copernicus/cso-s5p-co.rc', \
rcbase='cso.s5p.co.inquire-table-dataspace'
!~ create plot of available versions:
cso.s5p.co.inquire-scihub.plot.class : cso.CSO_SciHub_InquirePlot
cso.s5p.co.inquire-scihub.plot.args : '${PWD}/config/ESA-S5p/cso-s5p-co.rc', \
rcbase='cso.s5p.co.inquire-s5phub-plot'
cso.s5p.co.inquire.plot.class : cso.CSO_Inquire_Plot
cso.s5p.co.inquire.plot.args : '${PWD}/config/Copernicus/cso-s5p-co.rc', \
rcbase='cso.s5p.co.inquire-plot'
......
......@@ -65,6 +65,36 @@ References
| Atmos. Meas. Tech., 13, 3751-3767, `<https://doi.org/10.5194/amt-13-3751-2020>`_, 2020.
CSO processing
==============
*(See* :ref:`tutorial` *chapter for introduction to CSO scripts and configuration)*
An example configuration of the CSO processing of the S5p/HCHO data is available via
the following settings:
* `config/Copernicus/cso.rc <../../../config/Copernicus/cso.rc>`_
Top-level settings that configure the job-tree with various sub-tasks.
This is a generic file that could be used for multiple S5 products;
edit it to select the HCHO processing.
* `config/Copernicus/cso-user-settings.rc <../../../config/Copernicus/cso-user-settings.rc>`_
User-specific settings such as the work directory.
* `config/Copernicus/cso-s5p-hcho.rc <../../../config/Copernicus/cso-s5p-hcho.rc>`_
Specific settings for HCHO product.
Start the job-tree using::
./bin/cso config/Copernicus/cso.rc
Selected sub-steps in the processing are described below.
.. Label between '.. _' and ':' ; use :ref:`text <label>` for reference
.. _s5p-hcho-inquire:
......@@ -72,8 +102,8 @@ Inquire Sentinel-5p/HCHO archive
================================
S5p/HCHO observations are available from the
`Copernicus Open Access Hub <https://scihub.copernicus.eu/>`_;
see the :ref:`cso-scihub` module for a detailed description.
`Copernicus DataSpace <https://dataspace.copernicus.eu/>`_;
see the :ref:`cso-dataspace` module for a detailed description.
Data is available for different processing streams, each identified by a 4-character key:
......@@ -86,8 +116,8 @@ but with different processor versions.
It is therefore necessary to first inquire both archives to see which data is available where,
and what the version numbers are.
The :py:class:`cso_scihub.CSO_SciHub_Inquire` class is available to inquire the
*Copernicus Open Access Hub*. The settings used by this class allow selection
The :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>` class is available to inquire the
*Copernicus DataSpace*. The settings used by this class allow selection
on, for example, time range and intersection area.
The result is a csv file with columns for keywords such as orbit number and processor version,
as well as the filename of the data and the url that should be used to actually download the data::
......@@ -111,17 +141,19 @@ To visualize what is available from the various portals, the
The jobtree configuration to inquire the portals and create the overview figure could look like::
! single step:
cso.s5p.hcho.inquire-scihub.class : utopya.UtopyaJobStep
cso.s5p.hcho.inquire.class : utopya.UtopyaJobStep
! two tasks:
cso.s5p.hcho.inquire-scihub.tasks : table
!~ inquire available files:
cso.s5p.hcho.inquire-scihub.table.class : cso.CSO_SciHub_Inquire
cso.s5p.hcho.inquire-scihub.table.args : '${PWD}/config/ESA-S5p/cso-s5p-hcho.rc', \
rcbase='cso.s5p.hcho.inquire-s5phub-table'
cso.s5p.hcho.inquire.tasks : table-dataspace plot
!~ inquire files available on DataSpace:
cso.s5p.hcho.inquire.table-dataspace.class : cso.CSO_DataSpace_Inquire
cso.s5p.hcho.inquire.table-dataspace.args : '${PWD}/config/Copernicus/cso-s5p-hcho.rc', \
rcbase='cso.s5p.hcho.inquire-table-dataspace'
!~ create plot of available versions:
cso.s5p.hcho.inquire-scihub.plot.class : cso.CSO_SciHub_InquirePlot
cso.s5p.hcho.inquire-scihub.plot.args : '${PWD}/config/ESA-S5p/cso-s5p-hcho.rc', \
rcbase='cso.s5p.hcho.inquire-s5phub-plot'
cso.s5p.hcho.inquire.plot.class : cso.CSO_Inquire_Plot
cso.s5p.hcho.inquire.plot.args : '${PWD}/config/Copernicus/cso-s5p-hcho.rc', \
rcbase='cso.s5p.hcho.inquire-plot'
......
......@@ -175,6 +175,36 @@ and simulations.
CSO processing
==============
*(See* :ref:`tutorial` *chapter for introduction to CSO scripts and configuration)*
An example configuration of the CSO processing of the S5p/NO2 data is available via
the following settings:
* `config/Copernicus/cso.rc <../../../config/Copernicus/cso.rc>`_
Top-level settings that configure the job-tree with various sub-tasks.
This is a generic file that could be used for multiple S5 products;
edit it to select the NO2 processing.
* `config/Copernicus/cso-user-settings.rc <../../../config/Copernicus/cso-user-settings.rc>`_
User-specific settings such as the work directory.
* `config/Copernicus/cso-s5p-no2.rc <../../../config/Copernicus/cso-s5p-no2.rc>`_
Specific settings for NO2 product.
Start the job-tree using::
./bin/cso config/Copernicus/cso.rc
Selected sub-steps in the processing are described below.
.. Label between '.. _' and ':' ; use :ref:`text <label>` for reference
.. _s5p-no2-inquire:
......@@ -183,8 +213,8 @@ Inquire Sentinel-5p/NO2 archives
S5p/NO2 observations from KNMI have been available from at least these sources:
* `Copernicus Open Access Hub <https://scihub.copernicus.eu/>`_;
see the :ref:`cso-scihub` module for a detailed description.
* `Copernicus DataSpace <https://dataspace.copernicus.eu/>`_;
see the :ref:`cso-dataspace` module for a detailed description.
*This is the operational version.*
......@@ -206,8 +236,8 @@ The portals provide data files created with the same retrieval algorithm, but mo
It is therefore necessary to first inquire both archives to see which data is available where,
and what the version numbers are.
The :py:class:`CSO_SciHub_Inquire <cso_scihub.CSO_SciHub_Inquire>` class is available to inquire the
*Copernicus Open Access Hub*. The settings used by this class allow selection
The :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>` class is available to inquire the
*Copernicus DataSpace*. The settings used by this class allow selection
on, for example, time range and intersection area.
The result is a csv file with columns for keywords such as orbit number and processor version,
as well as the filename of the data and the url that should be used to actually download the data::
......@@ -220,7 +250,7 @@ as well as the filename of the data and the url that should be used to actually
See the section on *File name convention* in the *Product User Manual* for the meaning of all
parts of the filename.
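As a rough illustration of the file name convention, the Python sketch below splits an S5p file name into its parts. The field layout is an assumption based on the usual S5p Level-2 naming pattern (with a 10-character product key), the sample name is made up, and ``parse_s5p_filename`` is a hypothetical helper that is not part of CSO; the *Product User Manual* remains the authoritative description.

```python
import re

# Assumed layout of an S5p Level-2 file name:
#   S5P_<stream>_<product, 10 chars>_<start>_<end>_<orbit>_<collection>_<processor>_<production>.nc
S5P_PATTERN = re.compile(
    r"^S5P_(?P<stream>[A-Z]{4})_(?P<product>.{10})_"
    r"(?P<start>\d{8}T\d{6})_(?P<end>\d{8}T\d{6})_"
    r"(?P<orbit>\d{5})_(?P<collection>\d{2})_(?P<processor>\d{6})_"
    r"(?P<production>\d{8}T\d{6})\.nc$"
)


def parse_s5p_filename(name):
    """Return a dictionary with the parts of an S5p file name (hypothetical helper)."""
    match = S5P_PATTERN.match(name)
    if match is None:
        raise ValueError("unexpected S5p file name: %s" % name)
    parts = match.groupdict()
    # orbit number as integer:
    parts["orbit"] = int(parts["orbit"])
    # processor version '010202' becomes '1.2.2':
    p = parts["processor"]
    parts["processor_version"] = "%i.%i.%i" % (int(p[0:2]), int(p[2:4]), int(p[4:6]))
    return parts


# made-up sample name, for illustration only:
sample = "S5P_RPRO_L2__NO2____20180601T000000_20180601T014000_03272_01_010202_20190101T000000.nc"
parts = parse_s5p_filename(sample)
print(parts["stream"], parts["orbit"], parts["collection"], parts["processor_version"])
```

Products that do not follow the 10-character rule for the product key would need a wider pattern.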
A similar class :py:class:`CSO_PAL_Inquire <cso_scihub.CSO_PAL_Inquire>` class is available to list the content
A similar :py:class:`CSO_PAL_Inquire <cso_pal.CSO_PAL_Inquire>` class is available to list the content
of the *Product Algorithm Laboratory* portal. Also this will produce a table file.
To visualize what is available from the various portals, the
......@@ -234,24 +264,23 @@ To visualize what is available from the various portals, the
The jobtree configuration to inquire the portals and create the overview figure could look like::
! single step:
cso.s5p.no2.inquire-scihub.class : utopya.UtopyaJobStep
! inquire from portals, plot overview:
cso.s5p.no2.inquire.tasks : scihub pal plot
!~ create table with available files on portal:
cso.s5p.no2.inquire.scihub.class : cso.CSO_SciHub_Inquire
cso.s5p.no2.inquire.scihub.args : '${PWD}/config/EMEP/cso-s5p-no2.rc', \
rcbase='cso.s5p.no2.inquire-s5phub'
!~ create table with available files on portal:
cso.s5p.no2.inquire.pal.class : cso.CSO_PAL_Inquire
cso.s5p.no2.inquire.pal.args : '${PWD}/config/EMEP/cso-s5p-no2.rc', \
rcbase='cso.s5p.no2.inquire-pal'
cso.s5p.no2.inquire.class : utopya.UtopyaJobStep
! two tasks:
cso.s5p.no2.inquire.tasks : table-dataspace plot
!~ inquire files available on DataSpace:
cso.s5p.no2.inquire.table-dataspace.class : cso.CSO_DataSpace_Inquire
cso.s5p.no2.inquire.table-dataspace.args : '${PWD}/config/Copernicus/cso-s5p-no2.rc', \
rcbase='cso.s5p.no2.inquire-table-dataspace'
!~ create plot of available versions:
cso.s5p.no2.inquire.plot.class : cso.CSO_Inquire_Plot
cso.s5p.no2.inquire.plot.args : '${PWD}/config/EMEP/cso-s5p-no2.rc', \
cso.s5p.no2.inquire.plot.args : '${PWD}/config/Copernicus/cso-s5p-no2.rc', \
rcbase='cso.s5p.no2.inquire-plot'
.. Label between '.. _' and ':' ; use :ref:`text <label>` for reference
.. _s5p-no2-convert:
......
......@@ -77,6 +77,36 @@ Acknowledgements
We hereby thank D. Griffin and V. Fioletov for their valuable input.
CSO processing
==============
*(See* :ref:`tutorial` *chapter for introduction to CSO scripts and configuration)*
An example configuration of the CSO processing of the S5p/SO2 data is available via
the following settings:
* `config/Copernicus/cso.rc <../../../config/Copernicus/cso.rc>`_
Top-level settings that configure the job-tree with various sub-tasks.
This is a generic file that could be used for multiple S5 products;
edit it to select the SO2 processing.
* `config/Copernicus/cso-user-settings.rc <../../../config/Copernicus/cso-user-settings.rc>`_
User-specific settings such as the work directory.
* `config/Copernicus/cso-s5p-so2.rc <../../../config/Copernicus/cso-s5p-so2.rc>`_
Specific settings for SO2 product.
Start the job-tree using::
./bin/cso config/Copernicus/cso.rc
Selected sub-steps in the processing are described below.
.. Label between '.. _' and ':' ; use :ref:`text <label>` for reference
.. _s5p-so2-inquire:
......@@ -84,8 +114,8 @@ Inquire Sentinel-5p/SO2 archive
===============================
S5p/SO2 observations are available from the
`Copernicus Open Access Hub <https://scihub.copernicus.eu/>`_;
see the :ref:`cso-scihub` module for a detailed description.
`Copernicus DataSpace <https://dataspace.copernicus.eu/>`_;
see the :ref:`cso-dataspace` module for a detailed description.
Data is available for different processing streams, each identified by a 4-character key:
......@@ -93,13 +123,12 @@ Data is available for different processing streams, each identified by a 4-chara
* ``OFFL`` : `Offline`, available within weeks after observations;
* ``RPRO`` : re-processing of all previously made observations;
The portals provide data files created with the same retrieval algorithm,
but with different processor versions.
The portal provides data files created with different processor versions.
It is therefore necessary to first inquire both archives to see which data is available where,
and what the version numbers are.
The :py:class:`cso_scihub.CSO_SciHub_Inquire` class is available to inquire the
*Copernicus Open Access Hub*. The settings used by this class allow selection
The :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>` class is available to inquire the
*Copernicus DataSpace*. The settings used by this class allow selection
on, for example, time range and intersection area.
The result is a csv file with columns for keywords such as orbit number and processor version,
as well as the filename of the data and the url that should be used to actually download the data::
......@@ -123,17 +152,19 @@ To visualize what is available from the various portals, the
The jobtree configuration to inquire the portals and create the overview figure could look like::
! single step:
cso.s5p.so2.inquire-scihub.class : utopya.UtopyaJobStep
cso.s5p.so2.inquire.class : utopya.UtopyaJobStep
! two tasks:
cso.s5p.so2.inquire-scihub.tasks : table
!~ inquire available files:
cso.s5p.so2.inquire-scihub.table.class : cso.CSO_SciHub_Inquire
cso.s5p.so2.inquire-scihub.table.args : '${PWD}/config/ESA-S5p/cso-s5p-so2.rc', \
rcbase='cso.s5p.so2.inquire-s5phub-table'
cso.s5p.so2.inquire.tasks : table-dataspace plot
!~ inquire files available on DataSpace:
cso.s5p.so2.inquire.table-dataspace.class : cso.CSO_DataSpace_Inquire
cso.s5p.so2.inquire.table-dataspace.args : '${PWD}/config/Copernicus/cso-s5p-so2.rc', \
rcbase='cso.s5p.so2.inquire-table-dataspace'
!~ create plot of available versions:
cso.s5p.so2.inquire-scihub.plot.class : cso.CSO_SciHub_InquirePlot
cso.s5p.so2.inquire-scihub.plot.args : '${PWD}/config/ESA-S5p/cso-s5p-so2.rc', \
rcbase='cso.s5p.so2.inquire-s5phub-plot'
cso.s5p.so2.inquire.plot.class : cso.CSO_Inquire_Plot
cso.s5p.so2.inquire.plot.args : '${PWD}/config/Copernicus/cso-s5p-so2.rc', \
rcbase='cso.s5p.so2.inquire-plot'
......
......@@ -34,7 +34,7 @@ Job tree
The configuration defines a series of jobs to be created and started.
For example, the following jobs might be defined to process Sentinel-5p data::
cso.tutorial.inquire-scihub
cso.tutorial.inquire
cso.tutorial.convert
cso.tutorial.listing
cso.tutorial.catalogue
......@@ -42,14 +42,14 @@ For example, the following jobs might be defined to process Sentinel-5p data::
This list is actually defined as a tree, using lists in which the elements could be lists too::
cso # tree with element "tutorial"
.tutorial # tree with elements "inquire-scihub", "convert", "listing", and "catalogue"
.inquire-scihub # step
.tutorial # tree with elements "inquire", "convert", "listing", and "catalogue"
.inquire # step
.convert # step
.listing # step
.catalogue # step
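The dotted job names follow from walking this tree. A minimal Python sketch, assuming the tree is held in a nested dictionary (a hypothetical representation chosen for illustration, not the way ``utopya`` stores it):

```python
def job_names(name, tree):
    """Yield the dotted job name of an element and of all its sub-elements."""
    yield name
    for child, subtree in tree.items():
        yield from job_names(name + "." + child, subtree)


# nested dictionary; an empty dictionary marks a step without sub-elements:
tree = {"tutorial": {"inquire": {}, "convert": {}, "listing": {}, "catalogue": {}}}

names = list(job_names("cso", tree))
for jobname in names:
    print(jobname)
```

The steps are listed in the order in which they are defined, which is also the order in which the jobs are submitted one after the other.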
For each element in the tree, the configuration file should specify
the name of a python class that takes care of the job creating..
the name of a python class that takes care of the job creation.
If the element is a *tree*, use the :py:class:`utopya.UtopyaJobTree <utopya_jobtree.UtopyaJobTree>` class,
and add a line to specify the names of the sub-elements.
For the main ``cso`` job, this looks like::
......@@ -87,7 +87,7 @@ it will only print a message::
! dummy task:
cso.tutorial.inquire.task.class : utopya.UtopyaJobTask
cso.tutorial.inquire.task.args : msg='Inquire SciHub ...'
cso.tutorial.inquire.task.args : msg='Inquire archive ...'
Running a single job step
......@@ -103,7 +103,7 @@ Job files
The job tree defined above will create a series of job files in a work directory,
with each job in a separate sub directory::
/work/yourname/CSO-Tests/cso/cso.jb
/work/yourname/CSO-Tutorial/cso/cso.jb
cso/tutorial/cso.s5p.jb
cso/tutorial/inquire/cso.tutorial.inquire.jb
......@@ -111,7 +111,7 @@ The location of the work directories is specified in the settings using::
! work directory for jobs;
! here path including subdirectories for job name elements:
*.workdir : /work/yourname/CSO-Tests/__NAME2PATH__
*.workdir : /work/yourname/CSO-Tutorial/__NAME2PATH__
The default settings create simple job files that will run in foreground,
perform a specified task, and submit the next job in the tree.
......@@ -130,15 +130,10 @@ Step 1 - Inquire S5p archive
*(See* :ref:`s5p-no2-inquire` *section for full description of S5p/NO2 inquiries)*
The ``cso.tutorial.inquire`` job is configured in ``rc/tutorial.rc`` to crawl
the Copernicus SciHub archive for available Sentinel-5p NO2 observations.
.. IMPORTANT::
Access to the `Copernicus Open Access Hub <https://scihub.copernicus.eu/>`_
requires a (not personal) login and password; see also the :ref:`SciHub-OpenHub` section.
Add the following login/password setting to your ``~/.netrc`` file::
machine s5phub.copernicus.eu login s5pguest password s5pguest
The ``cso.tutorial.inquire`` job is configured in
`config/tutorial/tutorial.rc <../../../config/tutorial/tutorial.rc>`_
to crawl the `Copernicus DataSpace <https://dataspace.copernicus.eu/>`_ archive
for available Sentinel-5p NO2 observations.
To run the inquire job only, limit the element list of the ``cso.tutorial`` job to::
......@@ -150,28 +145,35 @@ observations::
! single step:
cso.tutorial.inquire.class : utopya.UtopyaJobStep
! two tasks:
cso.tutorial.inquire.tasks : table-scihub plot
!~ inquire available files:
cso.tutorial.inquire.table-scihub.class : cso.CSO_SciHub_Inquire
cso.tutorial.inquire.table-scihub.args : '${PWD}/config/tutorial/tutorial.rc', \
rcbase='cso.tutorial.inquire-table-scihub'
!~ create plot of available versions:
cso.tutorial.inquire.tasks : table-dataspace plot
!~ task: inquire available files:
cso.tutorial.inquire.table-dataspace.class : cso.CSO_DataSpace_Inquire
cso.tutorial.inquire.table-dataspace.args : '${__filename__}', \
rcbase='cso.tutorial.inquire-table-dataspace'
!~ task: create plot of available versions:
cso.tutorial.inquire.plot.class : cso.CSO_Inquire_Plot
cso.tutorial.inquire.plot.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.inquire.plot.args : '${__filename__}', \
rcbase='cso.tutorial.inquire-plot'
The first setting defines that this is a single job step that should do some work.
The ``tasks`` list defines keywords for the two tasks to be performed.
* For the ``table-scihub`` task, define that the :py:class:`CSO_SciHub_Inquire <cso_scihub.CSO_SciHub_Inquire>` class
* For the ``table-dataspace`` task, define that the :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>` class
should be used to do the work; the class is accessible from the :py:mod:`cso` module
(implemented in ``py/cso.py``).
The arguments that initialize the class specify the name of an rcfile with settings
(``tutorial.rc``) and that the settings start with keywords ``'cso.tutorial.inquire-table-scihub'``.
The first argument that initializes the class specifies the name of an rcfile with settings;
in this case these settings are in the same file (``tutorial.rc``) that defines the job-tree,
and therefore the special keyword ``${__filename__}`` could be used.
The second argument ``rcbase`` is optional and specifies that the settings start with keywords
``'cso.tutorial.inquire-table-dataspace'``.
* Similar for the ``plot`` task, the settings define that the
:py:class:`CSO_Inquire_Plot <cso_scihub.CSO_Inquire_Plot>` class should be used to do the work.
:py:class:`CSO_Inquire_Plot <cso_inquire.CSO_Inquire_Plot>` class should be used to do the work.
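The role of the ``rcbase`` argument can be pictured with a small Python sketch. It assumes a simplified rcfile syntax (``key : value`` lines, comments starting with ``!``, no continuation lines); ``read_rc`` and ``get_setting`` are hypothetical helpers and the sample values are made up, so this is not the actual rcfile reader used by CSO:

```python
def read_rc(text):
    """Parse 'key : value' lines; empty lines and '!' comments are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        # split at the first colon only, values might contain colons too:
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def get_setting(settings, rcbase, name):
    """Return the value of the setting '<rcbase>.<name>'."""
    return settings[rcbase + "." + name]


# made-up sample settings for two tasks:
sample = """
! output files:
cso.tutorial.inquire-table-dataspace.output.file : /work/yourname/table.csv
cso.tutorial.inquire-plot.output.file            : /work/yourname/table.png
"""

settings = read_rc(sample)
print(get_setting(settings, "cso.tutorial.inquire-plot", "output.file"))
```

Each task thus reads only the settings below its own ``rcbase``, which allows a single rcfile to hold the configuration of all tasks.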
The tutorial settings will inquire the time range 2018-2023.
......@@ -180,10 +182,10 @@ the work directories.
This is specified with a `user defined` keyword, here chosen to start with '``my.``'::
! base location for work directories:
my.work : /work/${USER}/CSO-Tests
my.work : /work/yourname/CSO-Tutorial
This `user defined` key is for example used to specify the work directory of the
``cso.tutorial.inquire-scihub`` (and other) jobs::
``cso.tutorial.inquire`` (and other) jobs::
! work directory for jobs;
! here path including subdirectories for job name elements:
......@@ -192,26 +194,51 @@ This `user defined` key is for example used to specify the work directory of the
This tells that the work directory of the job should include the jobname expanded
as subdirectories. For this example, the full path becomes::
/work/yourname/CSO-Tests/cso/tutorial/inquire/
/work/yourname/CSO-Tutorial/cso/tutorial/inquire/
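The expansion of the job name into subdirectories could be sketched in Python as follows; ``expand_workdir`` is a hypothetical helper for illustration, and the actual substitution is performed by the job-tree classes:

```python
def expand_workdir(template, jobname):
    """Replace '__NAME2PATH__' by the job name with dots turned into subdirectories."""
    return template.replace("__NAME2PATH__", jobname.replace(".", "/"))


workdir = expand_workdir("/work/yourname/CSO-Tutorial/__NAME2PATH__", "cso.tutorial.inquire")
print(workdir)
```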
The base of the work directories is also used to specify where the inquired table file should be stored::
! output table, date of today:
cso.tutorial.inquire-s5phub-table.output.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_scihub__%Y-%m-%d.csv
cso.tutorial.inquire-dataspace-table.output.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_dataspace__%Y-%m-%d.csv
The created table is then for example::
/work/yourname/CSO-Tests/Copernicus/Copernicus_S5p_NO2_scihub__2023-08-09.csv
/work/yourname/CSO-Tutorial/Copernicus/Copernicus_S5p_NO2_dataspace__2023-08-09.csv
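How the file name template leads to this path can be illustrated with a short Python sketch. It assumes that ``${...}`` keys are substituted first and that the time templates are then filled in ``strftime`` style; ``expand_output_file`` is a hypothetical helper, not the CSO implementation:

```python
from datetime import date


def expand_output_file(template, values, day):
    """Substitute '${key}' templates, then expand the '%Y-%m-%d' style time templates."""
    for key, value in values.items():
        template = template.replace("${" + key + "}", value)
    return day.strftime(template)


template = "${my.work}/Copernicus/Copernicus_S5p_NO2_dataspace__%Y-%m-%d.csv"
path = expand_output_file(template, {"my.work": "/work/yourname/CSO-Tutorial"}, date(2023, 8, 9))
print(path)
```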
If not already done, run the ``cso`` script with the tutorial settings::
./bin/cso config/tutorial/tutorial.rc
It could take a long time to inquire the full time period!
Be patient, or limit the time range in the settings ...
When the inquiry is finished, check if indeed the table with available orbits is created.
To visualize what is available, the :py:class:`CSO_Inquire_Plot <cso_inquire.CSO_Inquire_Plot>` class
should have created an overview figure next to the table file::
To visualize what is available from the various portals, the
:py:class:`cso_scihub.CSO_SciHub_InquirePlot` could be used to create an overview figure.
The figure file looks like:
/work/yourname/CSO-Tutorial/Copernicus/Copernicus_S5p_NO2_dataspace__2023-08-09.png
The figure should look like:
.. figure:: figs/NO2/Copernicus_S5p_NO2.png
:scale: 50 %
:align: center
:alt: Overview of available NO2 processings on SciHub.
:alt: Overview of available NO2 processings on DataSpace.
For the same orbit, multiple data files could be available.
A single S5p data file is uniquely identified by:
* processor version ``x.y.z``
* processing:
* ``NRTI`` : *Near Real Time*, processed within hours after observation;
* ``OFFL`` : *Offline* data, processed within a few weeks after observations;
* ``RPRO`` : *Reprocessed* data, processed long after observations using latest processor version.
The collection numbers ``01``, ``02``, etc. are used to identify a single timeseries
of the entire data set.
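Since several files could exist for the same orbit, a selection is needed before the data is used. The Python sketch below keeps, per orbit, the record with the highest processor version; this rule, the record layout, and the sample values are assumptions for illustration only, not the selection implemented in CSO:

```python
def version_key(version):
    """Convert 'x.y.z' into a tuple of integers, so that '1.10.0' sorts after '1.9.0'."""
    return tuple(int(part) for part in version.split("."))


def select_per_orbit(records):
    """Keep for each orbit the record with the highest processor version."""
    best = {}
    for rec in records:
        current = best.get(rec["orbit"])
        if current is None or version_key(rec["processor_version"]) > version_key(
            current["processor_version"]
        ):
            best[rec["orbit"]] = rec
    return [best[orbit] for orbit in sorted(best)]


# made-up records as they might be read from an inquiry table:
records = [
    {"orbit": 3272, "processing": "OFFL", "collection": "01", "processor_version": "1.0.2"},
    {"orbit": 3272, "processing": "RPRO", "collection": "03", "processor_version": "2.4.0"},
    {"orbit": 3273, "processing": "OFFL", "collection": "01", "processor_version": "1.0.2"},
]

selected = select_per_orbit(records)
for rec in selected:
    print(rec["orbit"], rec["processing"], rec["processor_version"])
```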
Step 2 - Convert to CSO format
......@@ -219,9 +246,21 @@ Step 2 - Convert to CSO format
*(See* :ref:`s5p-no2-convert` *section for full description of S5p/NO2 conversion)*
The ``cso.tutorial.convert`` job is configured to convert the downloaded orbit files into a common format.
The ``cso.tutorial.convert`` job is configured to convert the original S5p/NO2 data
into a common format. If the original data is not present yet, it is downloaded.
.. IMPORTANT::
Downloading data from the *Copernicus DataSpace* requires a personal login and password.
Add the login/password setting to your ``~/.netrc`` file::
machine zipper.dataspace.copernicus.eu login Your.Name@institute.org password ***********
See also the :ref:`dataspace-account` section in the description of the
:py:mod:`cso_dataspace` module.
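To check that the entry can be read, the standard Python ``netrc`` module could be used. The sketch below is only an illustration: ``dataspace_credentials`` is a hypothetical helper, and a temporary file with a dummy entry is used here so that the example runs without touching the real ``~/.netrc``:

```python
import netrc
import os
import tempfile


def dataspace_credentials(netrc_file, machine="zipper.dataspace.copernicus.eu"):
    """Return (login, password) for the machine as found in the netrc file."""
    auth = netrc.netrc(netrc_file).authenticators(machine)
    if auth is None:
        raise KeyError("no entry for machine '%s' in %s" % (machine, netrc_file))
    login, _account, password = auth
    return login, password


# demonstration with a temporary file holding a dummy entry:
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "netrc")
    with open(path, "w") as f:
        f.write(
            "machine zipper.dataspace.copernicus.eu"
            " login Your.Name@institute.org password secret\n"
        )
    login, password = dataspace_credentials(path)

print(login)
```

For the real settings, pass ``os.path.expanduser("~/.netrc")`` as the file name.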
The conversion includes filter options to select only pixels within a certain domain, and with
some minimum quality flag, etc; this could strongly limit the data volume.
some minimum quality flag; this will strongly limit the data volume.
It is not necessary to keep the original data; it could be downloaded again when needed.
To run the conversion job only, limit in ``rc/tutorial.rc`` the element list of the ``cso.tutorial`` job to::
......@@ -233,17 +272,18 @@ The conversion job is configured with::
cso.tutorial.convert.class : utopya.UtopyaJobStep
! conversion task:
cso.tutorial.convert.task.class : cso.CSO_S5p_Convert
cso.tutorial.convert.task.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.convert.task.args : '${__filename__}', \
rcbase='cso.tutorial.convert'
The conversion is thus done using the :py:class:`CSO_S5p_Convert <cso_s5p.CSO_S5p_Convert>` class
that can be accessed from the :py:mod:`cso` module.
The arguments that initialize the class specify the name of an rcfile with settings
(``tutorial.rc``) and that the settings start with keywords ``'cso.tutorial.convert'``.
The arguments that initialize the class specify the name of an rcfile with settings
(in this case the ``tutorial.rc`` that holds the job-tree definition)
and that the settings start with keywords ``'cso.tutorial.convert'``.
The result of the conversion is a set of files holding the selected pixels per orbit::
/work/yourname/CSO-Tests/CSO-data/S5p/RPRO/NO2/CAMS/2018/06/S5p_RPRO_NO2_03272.nc
/work/yourname/CSO-Tutorial/CSO-data/S5p/RPRO/NO2/CAMS/2018/06/S5p_RPRO_NO2_03272.nc
S5p_RPRO_NO2_03273.nc
:
......@@ -271,7 +311,7 @@ The following is a demo Python code that creates an S5p/NO2 map::
import cso
# sample file:
filename = '/work/yourname/CSO-Tests/CSO-data/S5p/RPRO/NO2/CAMS/2018/06/S5p_RPRO_NO2_03278.nc'
filename = '/work/yourname/CSO-Tutorial/CSO-data/S5p/RPRO/NO2/CAMS/2018/06/S5p_RPRO_NO2_03278.nc'
# read:
orb = cso.CSO_File( filename=filename )
......@@ -327,7 +367,7 @@ This file will be used by the observation operator to selects orbits with pixels
a desired time range.
Also the *catalogue* creator described below uses the listing file to select orbits.
The configuration of describes the name of the listing file to be created,
The configuration of the job specifies the name of the listing file to be created,
and the directories to be scanned for orbit files.
......@@ -337,10 +377,10 @@ Step 5 - Catalogue of figures
*(See* :ref:`s5p-no2-catalogue` *section for main description)*
For a first impression of what the downloaded and converted satellite data looks like,
the ``cso.tutorial.catalogue`` job can be used. This will figures out of the converted files,
the ``cso.tutorial.catalogue`` job can be used. This will create figures out of the converted files,
in particular maps that show values on the track.
To run the catalogue job only, limit in ``rc/tutorial.rc`` the element list of the ``cso.tutorial`` job to::
To run the catalogue job only, limit in ``tutorial.rc`` the element list of the ``cso.tutorial`` job to::
cso.tutorial.elements : catalogue
......@@ -355,25 +395,26 @@ This is configured using::
! catalogue creation task:
cso.tutorial.catalogue.figs.class : cso.CSO_Catalogue
cso.tutorial.catalogue.figs.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.catalogue.figs.args : '${__filename__}', \
rcbase='cso.tutorial.catalogue'
! indexer task:
cso.tutorial.catalogue.index.class : utopya.Indexer
cso.tutorial.catalogue.index.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.catalogue.index.args : '${__filename__}', \
rcbase='cso.tutorial.catalogue-index'
The ``figs`` task that creates the figures uses the :py:class:`CSO_Catalogue <.cso_catalogue.CSO_Catalogue>` class
that can be accessed from the :py:mod:`cso` module.
The arguments that initialize the class specify the name of an rcfile with settings
(``tutorial.rc``) and that the settings start with keywords ``'cso.tutorial.catalogue'``.
(in this case the ``tutorial.rc`` that holds the job-tree definition)
and that the settings start with keywords ``'cso.tutorial.catalogue'``.
The configuration describes where to find a *listing* file with orbits,
which variables should be plotted, the colorbar properties, etc.
The names of the created figures is composed from the base name of the converted files
The names of the created figures are composed from the base name of the converted files
and the variable that is plotted::
/work/yourname/CSO-Tests/CSO-data-catalogue/2018/06/01/S5p_RPRO_NO2_03278__vcd.png
/work/yourname/CSO-Tutorial/CSO-data-catalogue/2018/06/01/S5p_RPRO_NO2_03278__vcd.png
S5p_RPRO_NO2_03278__qa_value.png
:
......@@ -393,7 +434,7 @@ The arguments that initialize the class specify the name of an rcfile with setti
When successful, the index creator displays an url that could be loaded in a browser::
Browse to:
file:///work/yourname/CSO-Tests/CSO-data-catalogue/index.html
file:///work/yourname/CSO-Tutorial/CSO-data-catalogue/S5p/NO2/CAMS/index.html
.. figure:: figs/NO2/CSO_NO2_catalogue.png
:scale: 50 %
......@@ -430,7 +471,7 @@ For testing, copy the entire sub directory to a work directory;
configuration assumes a location in the work directory of the pre-processor where
the converted orbit files are::
cd /work/yourname/CSO-Tests
cd /work/yourname/CSO-Tutorial
cp -r ~/CSO/oper CSO-oper
cd CSO-oper
......@@ -569,11 +610,11 @@ The job for this is ``cso.tutorial.catalogue``, which is configured using::
cso.tutorial.sim-catalogue.tasks : figs index
! catalogue creation task:
cso.tutorial.sim-catalogue.figs.class : cso.CSO_SimCatalogue
cso.tutorial.sim-catalogue.figs.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.sim-catalogue.figs.args : '${__filename__}', \
rcbase='cso.tutorial.sim-catalogue'
! indexer task:
cso.tutorial.sim-catalogue.index.class : utopya.Indexer
cso.tutorial.sim-catalogue.index.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.sim-catalogue.index.args : '${__filename__}', \
rcbase='cso.tutorial.sim-catalogue-index'
The ``figs`` task that creates the figures is thus done using the
......@@ -604,7 +645,7 @@ For these variables it is necessary to define whether they should be read from a
The names of the created figures are composed from the base name of the converted files
and the variable that is plotted::
/work/yourname/CSO-Tests/CSO-oper/sim-catalogue/2018/06/01/S5p_RPRO_NO2_20180601_1100_yr.png
/work/yourname/CSO-Tutorial/CSO-oper/sim-catalogue/2018/06/01/S5p_RPRO_NO2_20180601_1100_yr.png
S5p_RPRO_NO2_20180601_1100_ys.png
:
......@@ -625,7 +666,7 @@ The arguments that initialize the class specify the name of an rcfile with setti
When successful, the index creator displays a URL that can be loaded in a browser::
Browse to:
file:///work/yourname/CSO-Tests/CSO-oper/sim-catalogue/index.html
file:///work/yourname/CSO-Tutorial/CSO-oper/sim-catalogue/index.html
.. figure:: figs/NO2/CSO_NO2_sim-catalogue.png
:scale: 50 %
......@@ -733,7 +774,7 @@ The configuration of the gridding job is::
cso.tutorial.sim-gridded.class : utopya.UtopyaJobStep
! catalogue creation task:
cso.tutorial.sim-gridded.task.class : cso.CSO_GriddedAverage
cso.tutorial.sim-gridded.task.args : '${PWD}/config/tutorial/tutorial.rc', \
cso.tutorial.sim-gridded.task.args : '${__filename__}', \
rcbase='cso.tutorial.gridded'
It is also possible to create a catalogue of the gridded fields.
......
......@@ -14,6 +14,9 @@
! 2023-08, Arjo Segers
! Replaced `where` constructs by loops after memory errors on some systems.
!
! 2023-11, Arjo Segers
! Close files also in 'read' mode ...
!
!###############################################################################
!
#define TRACEBACK write (csol,'("in ",a," (",a,", line",i5,")")') rname, __FILE__, __LINE__; call csoErr
......@@ -2530,7 +2533,6 @@ contains
subroutine NcFile_Done( self, status )
use NetCDF , only : NF90_Close
use CSO_Comm, only : csoc
! --- in/out ---------------------------------
......@@ -2549,12 +2551,13 @@ contains
! switch:
select case ( self%rwmode )
! read, open:
case ( 'r', 'o' )
! open:
case ( 'o' )
! open/close is managed externally,
! nothing to be done
! write:
case ( 'w' )
! read, write:
case ( 'r', 'w' )
! written on root...
if ( csoc%root ) then
......
......@@ -36,7 +36,7 @@ Actual implementations can be found in submodules:
pymod-cso_file
pymod-cso_inquire
pymod-cso_scihub
pymod-cso_dataspace
pymod-cso_pal
pymod-cso_s5p
pymod-cso_s5p_superobs
......@@ -63,7 +63,7 @@ and are defined according to the following hierchy:
* :py:class:`.UtopyaRc`
* :py:class:`.CSO_Inquire_Plot`
* :py:class:`.CSO_SciHub_Inquire`
* :py:class:`.CSO_DataSpace_Inquire`
* :py:class:`.CSO_PAL_Inquire`
* :py:class:`.CSO_S5p_Convert`
* :py:class:`.CSO_S5p_Listing`
......@@ -112,7 +112,7 @@ and are defined according to the following hierchy:
from cso_file import *
from cso_inquire import *
from cso_scihub import *
from cso_dataspace import *
from cso_pal import *
from cso_s5p import *
from cso_s5p_superobs import *
......
......@@ -14,6 +14,9 @@
# 2023-08, Arjo Segers
# Reformatted using 'black'.
#
# 2023-11, Arjo Segers
# Added "CheckDir" method.
#
########################################################################
###
......@@ -53,6 +56,34 @@ import logging
########################################################################
def CheckDir(filename):
"""
Check if ``filename`` has a directory path;
if so, create that directory if it does not exist yet.
"""
# modules:
import os
# directory name, could be empty:
dname = os.path.dirname(filename)
# directory defined?
if len(dname) > 0:
# not present yet?
if not os.path.isdir(dname):
# create including subdirs:
os.makedirs(dname)
# endif # dname present
# endif # dname defined
# enddef CheckDir
# *
def Pack_DataArray(da, dtype="i2"):
"""
......
......@@ -532,8 +532,10 @@ class CSO_GriddedAverage(utopya.UtopyaRc):
datafiles.sort()
# info ..
logging.info(indent + " found %i file(s) matching: %s"
% (len(datafiles), infile_curr) )
logging.info(
indent
+ " found %i file(s) matching: %s" % (len(datafiles), infile_curr)
)
# endif # listing or filenames
......
......@@ -67,28 +67,29 @@ import utopya
class CSO_Inquire_Plot(utopya.UtopyaRc):
"""
Create plot of processing version versus time to indicate the available orbits in the SciHub archive.
Create plot of data version versus time to indicate the available orbits.
The information on orbits is taken from a csv table created by :py:class:`CSO_SciHub_Inquire` class.
The information on orbits is taken from a csv table created by, for example,
the :py:class:`CSO_DataSpace_Inquire` class.
Specify the name of the table file in the settings::
! listing file:
cso.tutorial.inquire-s5phub-plot.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_s5phub_%Y-%m-%d.csv
<rcbase>.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_%Y-%m-%d.csv
The date templates are by default filled for the current day.
Alternatively, specify an explicit date::
!~ specify dates ("yyyy-mm-dd") to use historic table:
cso.tutorial.inquire-s5phub-plot.filedate : 2022-01-28
<rcbase>.filedate : 2022-01-28
The plot could also be created by combining multiple tables;
use a semi-colon to separate the file names (and, if needed, the dates)::
! listing files:
cso.tutorial.inquire-s5phub-plot.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_s5phub_%Y-%m-%d.csv ; \\
<rcbase>.file : ${my.work}/Copernicus/Copernicus_S5p_NO2_%Y-%m-%d.csv ; \\
${my.work}/Copernicus/Copernicus_S5p_NO2_pal_%Y-%m-%d.csv
!~ specify dates ("yyyy-mm-dd") to use historic tables:
!cso.tutorial.inquire-s5phub-plot.filedate : 2022-01-28 ; 2022-01-28
!<rcbase>.filedate : 2022-01-28 ; 2022-01-28
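How such semi-colon separated settings could be paired up is sketched below. This is a minimal illustration under stated assumptions, not the actual implementation of the class: the helper ``expand_tables`` and its arguments are hypothetical, and an empty date setting is assumed to mean "use the current day for every table".

```python
import datetime


def expand_tables(file_setting, filedate_setting="", today=None):
    """Hypothetical helper: pair semi-colon separated file templates
    with dates and fill in the time templates."""
    today = today or datetime.date.today()
    # file name templates, stripped from surrounding white space:
    templates = [part.strip() for part in file_setting.split(";")]
    # explicit dates, if any; note that "".split(";") gives [""],
    # so empty elements are filtered out here:
    dates = [part.strip() for part in filedate_setting.split(";") if part.strip()]
    if not dates:
        # no dates specified, use the current day for every table:
        days = [today] * len(templates)
    elif len(dates) == len(templates):
        days = [datetime.datetime.strptime(d, "%Y-%m-%d").date() for d in dates]
    else:
        raise ValueError(
            "found %i dates for %i file templates" % (len(dates), len(templates))
        )
    # fill time templates:
    return [day.strftime(template) for day, template in zip(days, templates)]


# two tables, both evaluated for a historic date:
print(
    expand_tables(
        "Copernicus_S5p_NO2_%Y-%m-%d.csv ; Copernicus_S5p_NO2_pal_%Y-%m-%d.csv",
        "2022-01-28 ; 2022-01-28",
    )
)
```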
The created plot shows a time line with the processor versions on the vertical axis;
a bar indicates when a certain version was used to process orbits:
......@@ -108,7 +109,7 @@ class CSO_Inquire_Plot(utopya.UtopyaRc):
The following flag is used to ensure that the plot is renewed::
! renew existing plots?
cso.tutorial.inquire-s5phub-plot.renew : True
<rcbase>.renew : True
"""
......
......@@ -107,7 +107,7 @@ class CSO_PAL_Inquire(utopya.UtopyaRc):
Name of output csv file::
! output table, date of today:
cso.s5p.no2.inquire-s5phub.output.file : ${my.work}/PAL_S5P_NO2_%Y-%m-%d.csv
<rcbase>.output.file : ${my.work}/PAL_S5P_NO2_%Y-%m-%d.csv
Example records::
......@@ -238,8 +238,14 @@ class CSO_PAL_Inquire(utopya.UtopyaRc):
platform_name, rest = bname.split("_", 1)
processing = rest[0:4]
product_type = rest[5:15]
(start_time,end_time,orbit,collection,\
processor_version,production_time) = rest[16:].split("_")
(
start_time,
end_time,
orbit,
collection,
processor_version,
production_time,
) = rest[16:].split("_")
# convert:
tfmt = "%Y%m%dT%H%M%S"
......
......@@ -754,8 +754,9 @@ class ColorbarFigure(Figure):
# endif
# get red/green/blue arrays for extensions:
(red_under, green_under, blue_under) = \
matplotlib.colors.colorConverter.to_rgb(color_under)
(red_under, green_under, blue_under) = matplotlib.colors.colorConverter.to_rgb(
color_under
)
red_over, green_over, blue_over = matplotlib.colors.colorConverter.to_rgb(color_over)
# initialise color dictionary:
......@@ -1701,8 +1702,9 @@ def mid2corners(xx):
# *
def GetGrid( shp, xx=None, yy=None, x=None, y=None,
xm=None, ym=None, xxm=None, yym=None, domain=None):
def GetGrid(
shp, xx=None, yy=None, x=None, y=None, xm=None, ym=None, xxm=None, yym=None, domain=None
):
"""
Return 2D grid arrays with corner points.
......
......@@ -2118,8 +2118,10 @@ class CSO_Catalogue_RegionsTimeSeries(cso_catalogue.CSO_CatalogueBase):
# store:
if (len(reg_used) == 0) or (reg_code not in reg_used["code"].values):
reg_used = pandas.concat(
[ reg_used,
pandas.DataFrame({"code": reg_code, "name": reg_name}), ],
[
reg_used,
pandas.DataFrame({"code": reg_code, "name": reg_name}),
],
ignore_index=True,
)
# endif
......@@ -2463,15 +2465,20 @@ class CSO_Statistics_RegionsTables(utopya.UtopyaRc):
rbias_label = "(sim-obs)/obs"
# add record:
df = pandas.concat(
[ df,
pandas.DataFrame( {
[
df,
pandas.DataFrame(
{
"iso2": [reg_code2],
"iso3": [reg_code],
"name": [reg_name],
"time": [tlab],
obs_label: [obs],
sim_label: [sim],
rbias_label: [rbias], } ), ],
rbias_label: [rbias],
}
),
],
ignore_index=True,
)
......
......@@ -18,6 +18,10 @@
# 2023-08, Arjo Segers
# Reformatted using 'black'.
#
# 2023-09, Arjo Segers
# Fixed bug in definition of listing file dates from rcfile settings.
#
#
########################################################################
###
......@@ -692,6 +696,15 @@ class CSO_S5p_File(cso_file.CSO_File):
* ``square`` : create a variable as the square of the input; requires a ``.from`` setting.
Optionally swap layers, for example to have profiles in upward direction
(surface to top) rather than downward (top to bottom)::
<rcbase>.output.var.longitude.swap_layers : True
Optionally provide a target data type; by default the original data type in the input file is used::
<rcbase>.output.var.longitude.dtype : f4
Optionally provide target units too.
In the (unlikely) case that the original variable has no ``units`` attribute,
this setting is required to define the (assumed) units.
......@@ -2030,7 +2043,7 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
<rcbase>.timerange.end : 2018-06-03 23:59
The input files are searched in a table created by an *inquire* class,
for example :py:class:`CSO_SciHub_Inquire <cso_scihub.CSO_SciHub_Inquire>`
for example :py:class:`CSO_DataSpace_Inquire <cso_dataspace.CSO_DataSpace_Inquire>`
or :py:class:`CSO_PAL_Inquire <cso_pal.CSO_PAL_Inquire>`.
These have scanned the archives to examine which processings and versions are available,
and stored the result in a csv file.
......@@ -2038,7 +2051,7 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
that is taken from another key::
! listing of available source files,
! created by 'inquire-s5phub' job:
! created by, for example, an 'inquire' job:
<rcbase>.inquire.file : /data/Copernicus/S5p/Copernicus_S5P_NO2_%Y-%m-%d.csv
!! date used in filename, leave empty for today:
!<rcbase>.inquire.filedate : 2022-01-28
......@@ -2053,7 +2066,7 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
! remove downloaded input files after convert?
<rcbase>.downloads.cleanup : False
The input files keep the same name as used in the SciHub archive, for example::
The input files keep the same name as used in the *DataSpace* archive, for example::
/data/Copernicus/S5P/OFFL/NO2/2018/07/S5P_OFFL_L2__NO2____20180701T005930_20180701T024100_03698_01_010002_20180707T022838.nc
start_time end_time orbit
......@@ -2141,10 +2154,11 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
import datetime
import fnmatch
import pandas
import numpy
# tools:
import cso_file
import cso_scihub
import cso_dataspace
import utopya
# info ...
......@@ -2169,7 +2183,7 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
# inquire tables:
filename__templates = self.GetSetting("inquire.file").split(";")
# time stamp in file?
filedates = self.GetSetting("inquire.filedate", default="")
filedates = self.GetSetting("inquire.filedate", default="").split(";")
# no dates specified? note that "".split(";") returns [""]:
if (len(filedates) == 1) and (len(filedates[0].strip()) == 0):
filedates = [""] * len(filename__templates)
elif len(filedates) != len(filename__templates):
......@@ -2454,8 +2468,13 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
if not os.path.isfile(input_file):
# info ..
logging.info(" not present yet, download ...")
# check ..
# use 'pandas.isna', since 'numpy.isnan' raises a TypeError for string values:
if ("href" not in rec.keys()) or pandas.isna(rec["href"]):
logging.error("cannot download, no 'href' element in record ...")
raise Exception
# endif
# download ...
cso_scihub.CSO_SciHub_DownloadFile(rec["href"], input_file)
cso_dataspace.CSO_DataSpace_DownloadFile(rec["href"], input_file)
# store name:
downloads.append(input_file)
# endif
......@@ -2532,7 +2551,7 @@ class CSO_S5p_Convert(utopya.UtopyaRc):
for key in ["orbit", "processing", "processor_version", "collection"]:
attrs[key] = rec[key]
# endfor
attrs["orbit_file"] = input_file
attrs["orbit_file"] = os.path.basename(input_file)
# write:
csf.Write(
filename=output_filename,
......@@ -2813,6 +2832,265 @@ class CSO_S5p_Listing(utopya.UtopyaRc):
# endclass CSO_S5p_Listing
########################################################################
###
### create listing file for downloaded S5P files
###
########################################################################
class CSO_S5p_Download_Listing(utopya.UtopyaRc):
"""
Create *listing* file for files downloaded from S5P data portals.
A *listing* file contains the names of the downloaded orbit files,
the time range of pixels in the file, and other information extracted from the filenames:
filename ;mission;processing;product_id;start_time ;end_time ;orbit;collection;processor_version;processing_time
RPRO/CH4/2018/04/S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc;S5P ;RPRO ;L2__CH4___;2018-04-30T00:18:51;2018-04-30T02:02:19;02818;01 ;010301 ;2019-05-13T14:11:33
RPRO/CH4/2018/04/S5P_RPRO_L2__CH4____20180430T020021_20180430T034349_02819_01_010301_20190513T135953.nc;S5P ;RPRO ;L2__CH4___;2018-04-30T02:00:21;2018-04-30T03:43:49;02819;01 ;010301 ;2019-05-13T13:59:53
:
This file could be used to scan for available versions and how they were produced.
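The way the columns follow from a filename can be sketched as below. This is a simplified illustration and not the code of this class: the helper ``parse_s5p_filename`` is hypothetical, and it assumes that the last six underscore-separated parts of the name are always the start time, end time, orbit, collection, processor version, and production time.

```python
import datetime
import os


def parse_s5p_filename(filename):
    """Hypothetical helper: split an S5P orbit filename into listing fields."""
    # base name without directory and extension:
    bname, _ext = os.path.splitext(os.path.basename(filename))
    # first two parts, the rest starts with the product id:
    mission, processing, rest = bname.split("_", 2)
    # official product id has 10 characters, longer keys are truncated:
    product_id = rest[0:10]
    # the last six parts hold times, orbit, and version info:
    (start_time, end_time, orbit, collection, processor_version, prod_time) = rest.split(
        "_"
    )[-6:]
    # some products have only a date as production time:
    tfmt = "%Y%m%dT%H%M%S"
    pfmt = "%Y%m%d" if len(prod_time) == 8 else tfmt
    return {
        "mission": mission,
        "processing": processing,
        "product_id": product_id,
        "start_time": datetime.datetime.strptime(start_time, tfmt),
        "end_time": datetime.datetime.strptime(end_time, tfmt),
        "orbit": orbit,
        "collection": collection,
        "processor_version": processor_version,
        "processing_time": datetime.datetime.strptime(prod_time, pfmt),
    }


rec = parse_s5p_filename(
    "RPRO/CH4/2018/04/S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc"
)
print(rec["product_id"], rec["orbit"], rec["start_time"].isoformat())
```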
In the settings, define the name of the file to be created::
! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
! time templates are replaced with today's date
<rcbase>.file : /Scratch/Copernicus/S5p/listing-CH4__%Y-%m-%d.csv
An existing listing file is not replaced,
unless the following flag is set::
! renew table?
<rcbase>.renew : True
Orbit files are searched within a timerange::
<rcbase>.timerange.start : 2018-06-01 00:00
<rcbase>.timerange.end : 2018-06-03 23:59
Specify filename filters to search for orbit files;
the patterns are relative to the basedir of the listing file,
and might contain templates for the time values.
Multiple patterns could be defined; if for a certain orbit number more than one
file is found, the first match is used.
This could be exploited to create a listing that combines reprocessed data
with near-real-time data::
<rcbase>.patterns : RPRO/CH4/%Y/%m/S5p_*.nc \
OFFL/CH4/%Y/%m/S5p_*.nc
"""
def __init__(self, rcfile, rcbase="", env={}, indent=""):
"""
Create listing file.
"""
# modules:
import os
import datetime
import glob
import collections
# tools:
import cso_file
# info ...
logging.info(indent + "")
logging.info(indent + "** create listing file")
logging.info(indent + "")
# init base object:
utopya.UtopyaRc.__init__(self, rcfile=rcfile, rcbase=rcbase, env=env)
# renew output?
renew = self.GetSetting("renew", totype="bool")
# table file to be written:
lst_file = self.GetSetting("file")
# evaluate current time:
lst_file = datetime.datetime.now().strftime(lst_file)
# create?
if (not os.path.isfile(lst_file)) or renew:
# info ..
logging.info(indent + "create %s ..." % lst_file)
# time range:
t1 = self.GetSetting("timerange.start", totype="datetime")
t2 = self.GetSetting("timerange.end", totype="datetime")
# info ...
tfmt = "%Y-%m-%d %H:%M"
logging.info(indent + " timerange: [%s,%s]" % (t1.strftime(tfmt), t2.strftime(tfmt)))
# base directory:
bdir = os.path.dirname(lst_file)
# create?
if len(bdir) > 0:
if not os.path.isdir(bdir):
os.makedirs(bdir)
# endif
# current directory?
if len(bdir) == 0:
bdir = "."
# info ...
logging.info(indent + " base directory: %s ..." % bdir)
# initialize for (re)creation:
listing = cso_file.CSO_Listing(lst_file, indent=indent + " ")
# info ...
logging.info(indent + " cleanup records if necessary ...")
# remove entries that do not exist anymore:
listing.Cleanup(indent=indent + " ")
# filename pattern templates:
pattern_templates = self.GetSetting("patterns").split()
# collection of scanned patterns:
patterns = []
# loop over days:
t = t1
while t <= t2:
# loop over patterns:
for pattern_template in pattern_templates:
# expand time values:
pattern = t.strftime(pattern_template)
# skip if already scanned ...
if pattern in patterns:
continue
# store:
patterns.append(pattern)
# info ...
logging.info(indent + "scan %s ..." % pattern)
# list relative to basedir:
cwd = os.getcwd()
os.chdir(bdir)
fnames = glob.glob(pattern)
os.chdir(cwd)
# empty ?
if len(fnames) == 0:
logging.info(indent + " empty ..")
continue
# endif
# sort in place:
fnames.sort()
# loop over files:
for fname in fnames:
# absolute path:
filename = os.path.join(bdir, fname)
# already in table?
if fname in listing:
# info ...
logging.info(indent + " keep entry %s ..." % fname)
else:
# info ...
logging.info(indent + " add entry %s ..." % fname)
# Example filename:
# S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc
#
# Some products have incorrect product id (should be 10 characters):
# S5P_OFFL_L2__CHOCHO___20200101T005246_20200101T023416_11487_01_010000_20210128.nc
# The extracted product id is then truncated to 10 characters.
#
# basename:
bname, ext = os.path.splitext(os.path.basename(filename))
# extract:
try:
mission, processing, rest = bname.split("_", 2)
if rest.startswith("L2__CHOCHO__"):
product_id = rest[0:10]
(
start_time,
end_time,
orbit,
collection,
processor_version,
prod_time,
) = rest[13:].split("_")
else:
product_id = rest[0:10]
(
start_time,
end_time,
orbit,
collection,
processor_version,
prod_time,
) = rest[11:].split("_")
# endif
except:
logging.error("could not extract filename parts; expected format:")
logging.error(
" S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133"
)
logging.error("found:")
logging.error(" %s" % bname)
raise
# endif
# fill data record:
data = collections.OrderedDict()
tfmt = "%Y%m%dT%H%M%S"
data["start_time"] = datetime.datetime.strptime(start_time, tfmt)
data["end_time"] = datetime.datetime.strptime(end_time, tfmt)
data["mission"] = mission
data["processing"] = processing
data["product_id"] = product_id
data["orbit"] = orbit
data["collection"] = collection
data["processor_version"] = processor_version
if len(prod_time) == 8:
data["processing_time"] = datetime.datetime.strptime(
prod_time, "%Y%m%d"
)
else:
data["processing_time"] = datetime.datetime.strptime(
prod_time, tfmt
)
# endif
# update record:
listing.UpdateRecord(fname, data, indent=indent + " ")
# endif # new record?
# endfor # filenames
# endfor # patterns
## testing ...
# break
# next
t = t + datetime.timedelta(0, 3600)
# endwhile
# save:
listing.Close(indent=indent + " ")
else:
# info ..
logging.info(indent + "keep %s ..." % lst_file)
# endif
# info ...
logging.info(indent + "")
logging.info(indent + "** end listing")
logging.info(indent + "")
# enddef __init__
# endclass CSO_S5p_Download_Listing
########################################################################
###
### end
......