cso_colhub module

The cso_colhub module provides classes for accessing data from the Norwegian ColHub archive. This is a (partial) mirror of the Copernicus Open Access Hub with all orbits covering Norway and surrounding areas.

The data can be accessed through a web interface, and Met Norway users can also read it directly from the storage. This module provides the CSO_ColHubMirror_Inquire class to scan the storage and list what is already available. In addition, the CSO_ColHubMirror_Missing class creates a table of files that are in the DataSpace but not yet in the mirror; this table can be used to make additional downloads.

Class hierarchy

The classes are defined according to the following hierarchy:

UtopyaRc
    CSO_ColHubMirror_Inquire
    CSO_ColHubMirror_Missing

Classes

class cso_colhub.CSO_ColHubMirror_Inquire(rcfile, rcbase='', env={}, indent='')

Bases: UtopyaRc

Create listing file for files available in file archive.

The format is similar to the output of inquiry classes, with per line a filename, the time range of pixels in the file, and other information extracted from the filenames:

filename                                                                              ;processing;start_time         ;end_time           ;orbit;collection;processor_version;href
S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc;RPRO      ;2018-04-30T00:18:51;2018-04-30T02:02:19;02818;01        ;010301           ;/archive/mirror/S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc
S5P_RPRO_L2__CH4____20180430T020021_20180430T034349_02819_01_010301_20190513T135953.nc;RPRO      ;2018-04-30T02:00:21;2018-04-30T03:43:49;02819;01        ;010301           ;/archive/mirror/S5P_RPRO_L2__CH4____20180430T020021_20180430T034349_02819_01_010301_20190513T135953.nc
:

This file could be used to scan for available versions and how they were produced.
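
The columns of the listing can be derived from the filename alone. The following is a minimal sketch, assuming the field layout seen in the example filenames above; the helper name parse_s5p_filename and the regular expression are illustrative and not part of cso_colhub:

```python
import re
from datetime import datetime

# Hypothetical helper, not part of cso_colhub; the field layout is inferred
# from the example filenames in the listing.
S5P_PATTERN = re.compile(
    r"^S5P_(?P<processing>[A-Z]{4})_"
    r"(?P<product>L2_.{7})_"
    r"(?P<start_time>\d{8}T\d{6})_(?P<end_time>\d{8}T\d{6})_"
    r"(?P<orbit>\d{5})_(?P<collection>\d{2})_(?P<processor_version>\d{6})_"
    r"(?P<production_time>\d{8}T\d{6})\.nc$"
)


def parse_s5p_filename(filename):
    """Split an S5P orbit filename into the fields used in the listing."""
    match = S5P_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename}")
    fields = match.groupdict()
    # convert the time range of the pixels to datetime objects
    for key in ("start_time", "end_time"):
        fields[key] = datetime.strptime(fields[key], "%Y%m%dT%H%M%S")
    return fields


record = parse_s5p_filename(
    "S5P_RPRO_L2__CH4____20180430T001851_20180430T020219"
    "_02818_01_010301_20190513T141133.nc"
)
print(record["processing"], record["orbit"], record["start_time"].isoformat())
```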

In the settings, define one or more base directories of the archive:

<rcbase>.dir           :  /archive/mirror  /scratch/mirror2

The directories are scanned recursively using the os.walk function for files that match a filename pattern:

! search S5P CH4 files:
<rcbase>.pattern       :  S5P_*_L2__CH4____*.nc
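
A recursive scan of this kind could look as follows. This is a sketch only: the function name find_files is hypothetical, and matching with the fnmatch module is an assumption about how the pattern is interpreted, not a statement about the cso_colhub implementation.

```python
import fnmatch
import os


def find_files(basedirs, pattern):
    """Yield full paths of files below the base directories that match pattern."""
    for basedir in basedirs:
        # os.walk visits the base directory and all of its subdirectories
        for dirpath, dirnames, filenames in os.walk(basedir):
            for filename in sorted(fnmatch.filter(filenames, pattern)):
                yield os.path.join(dirpath, filename)


if __name__ == "__main__":
    # base directories as in the example settings
    for path in find_files(["/archive/mirror", "/scratch/mirror2"],
                           "S5P_*_L2__CH4____*.nc"):
        print(path)
```

Note that the example filenames contain a double underscore after "L2" and four underscores after "CH4", so the pattern has to include them to match.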

Specify the output file:

! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
! time templates are replaced with today's date
<rcbase>.file        :  /Scratch/Copernicus/S5p/listing-CH4__%Y-%m-%d.csv
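
The replacement of the time templates can be illustrated with strftime. This is a minimal sketch; the helper name expand_time_templates is hypothetical and not part of cso_colhub:

```python
from datetime import date


def expand_time_templates(template, day=None):
    """Replace strftime-style templates such as %Y-%m-%d; default is today."""
    day = day or date.today()
    return day.strftime(template)


print(expand_time_templates("/Scratch/Copernicus/S5p/listing-CH4__%Y-%m-%d.csv"))
```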

Optionally define a creation mode for the (parent) directories:

! directory creation mode:
<rcbase>.dmode                         :  0o775

An existing listing file is not replaced, unless the following flag is set:

! renew table?
<rcbase>.renew           :  True
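
The combined effect of the creation mode and the renew flag could be sketched as below. The function name prepare_output is hypothetical, and the exact behaviour of the class may differ:

```python
import os


def prepare_output(filename, renew=False, dmode="0o775"):
    """Return True if the listing file should be (re)written."""
    # keep an existing listing unless renewal is requested
    if os.path.isfile(filename) and not renew:
        return False
    dirname = os.path.dirname(filename)
    if dirname:
        # int() with base 8 accepts the '0o' prefix; the resulting
        # permissions are still subject to the umask of the process
        os.makedirs(dirname, mode=int(dmode, 8), exist_ok=True)
    return True
```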

class cso_colhub.CSO_ColHubMirror_Missing(rcfile, rcbase='', env={}, indent='')

Bases: UtopyaRc

Create listing file for files that are in one inquiry table but not in another one. This could be used to complete a mirror archive.

The format is similar to the output of inquiry classes, with per line a filename, the time range of pixels in the file, and other information extracted from the filenames:

filename                                                                              ;start_time         ;end_time           ;orbit;processing;collection;processor_version;href
S5P_RPRO_L2__NO2____20180504T073130_20180504T091300_02879_03_020400_20221208T160012.nc;2018-05-04 07:31:30;2018-05-04 09:13:00;02879 ;RPRO     ;03        ;020400           ;https://zipper.dataspace.copernicus.eu/odata/v1/Products(ae43b35c-0569-4e1f-b8cb-0afc49790716)/$value
:

In the settings, define the listing file with all available data, for example the result of an inquiry step; optionally add a date to replace the time templates in the filename:

<rcbase>.all.file           :  /work/inquire/Copernicus_S5p_NO2_dataspace__%Y-%m-%d.csv
!<rcbase>.all.filedate       :  2025-01-24

Similarly, specify the name (or ";" separated list of names) of the file(s) listing the current mirror(s), typically the output of the CSO_ColHubMirror_Inquire class:

<rcbase>.curr.file           :  /work/inquire/Copernicus_S5p_NO2_colhub-mirror__%Y-%m-%d.csv ; \
                                /work/inquire/Copernicus_S5p_NO2_colhub-mirror2__%Y-%m-%d.csv
!<rcbase>.curr.filedate       :  2025-01-24
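
The comparison between the two tables can be sketched as follows. This is an illustration rather than the cso_colhub implementation: the helper names read_listing and missing_records are hypothetical, and matching records on the filename column is an assumption.

```python
import csv


def read_listing(path):
    """Read a ';'-separated listing; padding around headers and values is stripped."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=";")
        header = [name.strip() for name in next(reader)]
        return [dict(zip(header, (value.strip() for value in row)))
                for row in reader if row]


def missing_records(all_file, curr_files):
    """Return records of all_file whose filename is in none of curr_files."""
    present = set()
    for path in curr_files:
        present.update(rec["filename"] for rec in read_listing(path))
    return [rec for rec in read_listing(all_file)
            if rec["filename"] not in present]
```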

Specify a selection filter; this defines which of the orbit files are actually needed:

! Provide a ';' separated list of expressions to decide if a particular orbit file should be processed.
! If more than one file is available for a particular orbit (from "OFFL" and "RPRO" processing),
! the file with the first match will be used.
! The expressions should include templates '%{header}' for the column values.
!
<rcbase>.selection                     :  (%{collection} == '03') and (%{processing} == 'RPRO') ; \
                                          (%{collection} == '03') and (%{processing} == 'OFFL')
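
One possible reading of this selection rule is sketched below: the expressions are tried in order, and the first one that matches any of the candidate records for an orbit decides which file is used. The function name select_record is hypothetical, and both the template substitution and the use of eval are assumptions about the mechanism; eval should only be applied to trusted settings.

```python
def select_record(records, selection):
    """Return the record matched by the earliest expression, or None."""
    for expr in (part.strip() for part in selection.split(";")):
        for rec in records:
            # fill in the '%{header}' templates with quoted column values
            filled = expr
            for key, value in rec.items():
                filled = filled.replace("%{" + key + "}", repr(value))
            # evaluate without builtins; the settings are assumed to be trusted
            if eval(filled, {"__builtins__": {}}, {}):
                return rec
    return None


selection = ("(%{collection} == '03') and (%{processing} == 'RPRO') ; "
             "(%{collection} == '03') and (%{processing} == 'OFFL')")
candidates = [
    {"processing": "OFFL", "collection": "03"},
    {"processing": "RPRO", "collection": "03"},
]
# the RPRO expression comes first, so the RPRO record is preferred
print(select_record(candidates, selection))
```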

Specify the output file:

! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
! time templates are replaced with today's date
<rcbase>.file        :  /work/inquire/Copernicus_S5p_NO2_colhub-mirror-missing__%Y-%m-%d.csv

Optionally define a creation mode for the (parent) directories:

! directory creation mode:
<rcbase>.dmode                         :  0o775

An existing listing file is not replaced, unless the following flag is set:

! renew table?
<rcbase>.renew           :  True