cso_scihub module

The cso_scihub module provides classes for accessing data from the Copernicus Open Access Hub. The data hub itself is described first, followed by how the CSO pre-processor can be used for batch download of a selection.

Copernicus Open Access Hub

The Copernicus Open Access Hub is the official portal for Copernicus satellite data.

Figure: Home page of Copernicus Open Access Hub (https://scihub.copernicus.eu/)

Different hubs are provided for different data sets. Below we describe:

  • Open Hub: provides access to Sentinel-1,2,3 data.

  • S-5P Pre-Ops: provides access to (pre-operational) Sentinel-5P data.

Open Hub

The Open Hub provides access to Sentinel-1,2,3 data. Since these are the oldest data sets, most examples in the User Guide refer to this hub.

To download data it is necessary to register. On the main page, select:

The user name and password should be stored in your home directory in the ~/.netrc file (ensure that it has read/write permissions only for you):

# access Copernicus Open Access Hub
machine scihub.copernicus.eu    login **********   password *********
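
As an illustration of how such an entry can be read from Python, the sketch below uses the standard library `netrc` module. The helper name `read_hub_credentials` is hypothetical and not part of the CSO package; how the CSO classes read the file internally is not described here.

```python
import netrc


def read_hub_credentials(host, netrc_path=None):
    """Return (login, password) for `host`, or None if there is no entry.

    Hypothetical helper for illustration; netrc_path=None makes the
    standard library read ~/.netrc.
    """
    auth = netrc.netrc(netrc_path).authenticators(host)
    if auth is None:
        return None
    # authenticators() returns a (login, account, password) tuple
    login, _account, password = auth
    return login, password
```

A call such as `read_hub_credentials("scihub.copernicus.eu")` then returns the stored login and password, or `None` if the machine is not listed.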

S-5P Pre-Ops Hub

On the main page, select ‘S-5P Pre-Ops’ for the Sentinel-5P Pre-Operations Data Hub.

Access is open; the login and password are shown in the login pop-up. The user name and password should be stored in your home directory in the ~/.netrc file (ensure that it has read/write permissions only for you):

# access Copernicus Open Access Hub
machine s5phub.copernicus.eu    login s5pguest     password s5pguest

Data can be selected and downloaded interactively. In the search bar, open the ‘Advanced Search’ menu and specify a selection. The figure below shows an example for Level2 NO2 data.

Figure: Advanced Search menu at S-5P Hub.

The result of a search is a list of product files. Each product file contains data from one orbit. The name of a product file contains the scan period, the orbit number, and a processing time:

S5P_OFFL_L2__NO2____20190701T102357_20190701T120527_08882_01_010302_20190707T120219
|   |    \_prodid_/ \_start-time__/ \__end-time___/ orbit |  |      \__processed__/
|   stream                                                |  processor
mission                                                   collection
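
The layout above can be decoded with a regular expression. The following is a minimal sketch based only on the example name shown; the function name `parse_product_name` and the field names are assumptions chosen for illustration, not part of the CSO package.

```python
import re
from datetime import datetime

# Field widths follow the example file name above (assumed to be fixed).
_PRODUCT_RE = re.compile(
    r"^(?P<mission>[A-Z0-9]{3})_(?P<stream>[A-Z]{4})_(?P<product_id>.{10})_"
    r"(?P<start_time>\d{8}T\d{6})_(?P<end_time>\d{8}T\d{6})_"
    r"(?P<orbit>\d{5})_(?P<collection>\d{2})_(?P<processor>\d{6})_"
    r"(?P<processed>\d{8}T\d{6})"
)


def parse_product_name(name):
    """Split a product file name into its components (hypothetical helper)."""
    match = _PRODUCT_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected product name: {name!r}")
    fields = match.groupdict()
    # convert the three time stamps to datetime objects
    for key in ("start_time", "end_time", "processed"):
        fields[key] = datetime.strptime(fields[key], "%Y%m%dT%H%M%S")
    return fields
```

The pattern is not anchored at the end, so a name with a trailing extension such as `.nc` is accepted as well.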

The search query is shown above the product list, and contains the time and product selections; the latter can be useful for batch download:

( beginPosition:[2019-07-01T00:00:00.000Z TO 2019-07-01T23:59:59.999Z] AND
    endPosition:[2019-07-01T00:00:00.000Z TO 2019-07-01T23:59:59.999Z]     )
AND ( (    platformname:Sentinel-5  AND 
            producttype:L2__NO2___  AND 
        processinglevel:L2          AND 
         processingmode:Offline         ) )
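
A query of this form could be composed programmatically. The sketch below rebuilds the same structure for a single day; the function name `build_query` is hypothetical, and the keywords are simply passed through as `key:value` terms.

```python
from datetime import date


def build_query(day, **filters):
    """Compose a search query for one day (hypothetical helper).

    `filters` are emitted as 'key:value' terms joined by AND, in the
    order given.
    """
    t0 = f"{day.isoformat()}T00:00:00.000Z"
    t1 = f"{day.isoformat()}T23:59:59.999Z"
    period = f"[{t0} TO {t1}]"
    time_part = f"( beginPosition:{period} AND endPosition:{period} )"
    product_part = " AND ".join(f"{key}:{value}" for key, value in filters.items())
    return f"{time_part} AND ( ( {product_part} ) )"
```

For example, `build_query(date(2019, 7, 1), platformname="Sentinel-5", producttype="L2__NO2___")` gives the time selection shown above followed by the two product terms.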

OpenSearch API

For batch processing one can use the OpenSearch API. First a search query is sent to the server; the result then contains links that can be used to download the selected files.
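
As a rough sketch of the first step, the request URL for a search could be assembled as follows. The `/search` path and the `q`, `start` and `rows` parameters are assumptions about the hub's OpenSearch interface and should be checked against the User Guide; the function name `search_url` is hypothetical. The sketch only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode


def search_url(hub_url, query, start=0, rows=100):
    """Build an OpenSearch request URL (assumed endpoint and parameters)."""
    # urlencode takes care of escaping spaces, colons, brackets, etc.
    params = urlencode({"q": query, "start": start, "rows": rows})
    return f"{hub_url.rstrip('/')}/search?{params}"
```

The resulting URL could then be requested with the login and password from ~/.netrc, and the response scanned for the download links.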

Batch download

An alternative for batch processing is to use the API Hub. This page also contains a link to the section of the User Guide with instructions for batch processing.

From the User Guide one can download the ‘dhusget.sh’ script. This will take care of searching the archive, downloading files, checking if the download is complete, etc. The call to download all NO2 data for a single day that overlaps with a longitude/latitude box over Europe looks like:

./dhusget.sh \
    -d 'https://s5phub.copernicus.eu/dhus' \
    -S '2018-07-01T00:00:00.000Z' \
    -E '2018-07-02T00:00:00.000Z' \
    -c '-30,30:45,76' \
    -F 'platformname:Sentinel-5 AND producttype:L2__NO2___ AND processinglevel:L2 AND processingmode:Reprocessing' \
    -o product \
    -O /work/Sentinel-5P/TROPOMI/NO2/ \
    -D
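
When the script is driven from Python, the same call can be written as an argument list, which avoids shell quoting problems with the query. This is a minimal sketch: the function name `dhusget_command` is hypothetical, and the flags are copied verbatim from the call above.

```python
def dhusget_command(script, url, start, end, area, query, output_dir):
    """Return the dhusget.sh call above as an argument list (illustration only)."""
    return [
        script,
        "-d", url,          # hub url
        "-S", start,        # start of time range
        "-E", end,          # end of time range
        "-c", area,         # longitude/latitude box
        "-F", query,        # search query
        "-o", "product",
        "-O", output_dir,   # target directory
        "-D",
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)` to execute the script.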

The download script from the API Hub is included in the CSO package, with a few minor modifications.

Batch download via CSO

The CSO download tools use the OpenSearch API by default, but can also use the batch download option.

The CSO_SciHub_Download class performs the download using the OpenSearch API. This class is preferred since:

  • existing files can be kept without a new download;

  • error messages are cleaner.

See the documentation of the class for the settings to be used.

Alternatively, the CSO_SciHub_Download_DHuS class can be used, which calls the dhusget.sh script to perform the download. This produces many intermediate log files, and error messages that are not easily interpreted.

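Class hierarchy

The classes are defined according to the following hierarchy:

  • utopya_rc.UtopyaRc: CSO_SciHub_Inquire, CSO_SciHub_Download, CSO_SciHub_Download_DHuS, CSO_SciHub_Listing, CSO_SciHub_ListingPlot

  • object: CSO_SciHub_DownloadFile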

Classes

class cso_scihub.CSO_SciHub_Inquire(rcfile, rcbase='', env={}, indent='')

Bases: utopya_rc.UtopyaRc

Inquire available Sentinel data from the Copernicus Open Access Hub using the OpenSearch API:

A query is sent to search for products that are available for a certain time and overlap with a specified region. The result is a list with orbit files and instructions on how to download them.

In the settings, specify the url of the hub, which is either the Open Hub or the S-5P hub:

! server url; provide login/password in ~/.netrc:
<rcbase>.url              :  https://s5phub.copernicus.eu/dhus

Specify the time range over which files should be downloaded:

<rcbase>.timerange.start  :  2018-07-01 00:00
<rcbase>.timerange.end    :  2018-07-01 23:59

Specify a target area; only orbits with some pixels within the defined box will be downloaded:

! target area, leave empty for globe; format:  west,south:east,north
<rcbase>.area             :  
!<rcbase>.area             :  -30,30:35,76
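
The area format can be decoded as sketched below. The function name `parse_area` is hypothetical and only illustrates the `west,south:east,north` convention stated in the comment above, with an empty value meaning the whole globe.

```python
def parse_area(text):
    """Parse 'west,south:east,north'; return None for an empty value (globe)."""
    text = text.strip()
    if not text:
        return None
    lower_left, upper_right = text.split(":")
    west, south = (float(value) for value in lower_left.split(","))
    east, north = (float(value) for value in upper_right.split(","))
    return west, south, east, north
```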

A query is used to select the required data. The search box on the hub can serve as a guide to the format. Note that the ‘producttype’ should have exactly 10 characters: the first 3 are used for the retrieval level and the last 6 for the product, with an underscore in between; unused positions are filled with underscores:

! search query, obtained from interactive download:
<rcbase>.query            :      platformname:Sentinel-5    AND \
                                  producttype:L2__NO2___    AND \
                              processinglevel:L2
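
The padding rule for ‘producttype’ can be made explicit with a small helper. This is a sketch under the assumption, inferred from examples such as `L2__NO2___`, that the level is padded to 3 characters, followed by one underscore and the product padded to 6 characters; the function name `make_producttype` is hypothetical.

```python
def make_producttype(level, product):
    """Pad level and product with underscores to a 10-character product type."""
    if len(level) > 3 or len(product) > 6:
        raise ValueError("level must fit in 3 and product in 6 characters")
    # 3 characters for the level + 1 separator + 6 characters for the product
    return level.ljust(3, "_") + "_" + product.ljust(6, "_")
```

With this rule, `make_producttype("L2", "NO2")` reproduces the value used in the query above.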

Name of output csv file:

! output table, date of today:
<rcbase>.output.file            :  ${my.work}/Copernicus_S5P_NO2_%Y-%m-%d.csv

class cso_scihub.CSO_SciHub_DownloadFile(href, output_file, maxtry=10, timeout=60, indent='')

Bases: object

Download single file from SciHub.

Arguments:

  • href : download url: https://s5phub.copernicus.eu/dhus/odata/v1/Products('d483baa0-3a61-4985-aa0c-5642a83c9214')/$value

  • output_file : target file

Optional arguments:

  • maxtry : number of times to try again if download fails

  • timeout : delay in seconds between requests
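
The interplay of the two optional arguments can be pictured as a retry loop. The sketch below is not the implementation of the class: it assumes that `maxtry` is the total number of attempts and that `timeout` is the pause between them, and it takes the actual download as a callable `fetch` (hypothetical) so that the loop itself can be shown without a network connection.

```python
import time


def fetch_with_retry(fetch, maxtry=10, timeout=60, sleep=time.sleep):
    """Call fetch() until it succeeds, at most maxtry times (illustration only)."""
    last_error = None
    for attempt in range(1, maxtry + 1):
        try:
            return fetch()
        except OSError as err:
            # network errors from urllib are subclasses of OSError
            last_error = err
            if attempt < maxtry:
                sleep(timeout)  # wait before the next request
    raise RuntimeError(f"download failed after {maxtry} attempts") from last_error
```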

class cso_scihub.CSO_SciHub_Download(rcfile, rcbase='', env={}, indent='')

Bases: utopya_rc.UtopyaRc

Download Sentinel data from the Copernicus Open Access Hub using the OpenSearch API:

To download orbit files, first a query is sent to search for products that are available for a certain time and overlap with a specified region. The result is a list with orbit files and instructions on how to download them.

In the settings, specify the url of the hub, which is either the Open Hub or the S-5P hub:

! server url; provide login/password in ~/.netrc:
<rcbase>.url              :  https://s5phub.copernicus.eu/dhus

Specify the time range over which files should be downloaded:

<rcbase>.timerange.start  :  2018-07-01 00:00
<rcbase>.timerange.end    :  2018-07-01 23:59

Specify a target area; only orbits with some pixels within the defined box will be downloaded:

! target area, leave empty for globe; format:  west,south:east,north
<rcbase>.area             :  
!<rcbase>.area             :  -30,30:35,76

A query is used to select the required data. The search box on the hub can serve as a guide to the format. Note that the ‘producttype’ should have exactly 10 characters: the first 3 are used for the retrieval level and the last 6 for the product, with an underscore in between; unused positions are filled with underscores:

! search query, obtained from interactive download:
<rcbase>.query            :      platformname:Sentinel-5    AND \
                                  producttype:L2__NO2___    AND \
                              processinglevel:L2            AND \
                               processingmode:Offline

The target directory for downloaded files can include templates for time values:

! output archive, store per month:
<rcbase>.output.dir       :  /data/Copernicus/S5P/OFFL/NO2/%Y/%m
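
The time templates look like `strftime` format codes. Assuming that is how they are expanded (the text does not say so explicitly), the directory for a given orbit time could be derived as in this sketch; the function name `expand_time_template` is hypothetical.

```python
from datetime import datetime


def expand_time_template(template, t):
    """Replace %Y, %m, ... in a path template by the values of time t."""
    return t.strftime(template)
```

For an orbit on 1 July 2018, the template above would then expand to a `2018/07` subdirectory.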

Use the following flag to keep files that are already present:

! renew existing files?
<rcbase>.renew            :  False

class cso_scihub.CSO_SciHub_Download_DHuS(rcfile, rcbase='', env={}, indent='')

Bases: utopya_rc.UtopyaRc

Download Sentinel data from the Copernicus Open Access Hub using DHuS (“Data Hub Software”).

A download script dhusget.sh is available from:

A copy of the script is included with CSO, with some minor modifications. The settings should specify the location of the local copy:

! location of script:
<rcbase>.script                 :  ${PWD}/bin/dhusget.sh

The script is called by this class using arguments from the settings. First specify the time range over which files should be downloaded:

<rcbase>.timerange.start  :  2018-07-01 00:00
<rcbase>.timerange.end    :  2018-07-01 23:59

Then specify the url of the hub, which is either the Open Hub or the S-5P hub:

! server url; provide login/password in ~/.netrc:
<rcbase>.url              :  https://s5phub.copernicus.eu/dhus

Specify a target area; only orbits with some pixels within the defined box will be downloaded:

! target area, leave empty for globe; format:  west,south:east,north
<rcbase>.area             :  
!<rcbase>.area             :  -30,30:35,76

A query is used to select the required data. The search box on the hub can serve as a guide to the format. Note that the ‘producttype’ should have exactly 10 characters: the first 3 are used for the retrieval level and the last 6 for the product, with an underscore in between; unused positions are filled with underscores:

! search query, obtained from interactive download:
<rcbase>.query            :      platformname:Sentinel-5    AND \
                                  producttype:L2__NO2___    AND \
                              processinglevel:L2            AND \
                               processingmode:Offline

The target directory for downloaded files can include templates for time values:

! output archive, store per month:
<rcbase>.output.dir       :  /data/Copernicus/S5P/OFFL/NO2/%Y/%m

Also specify a temporary work directory where the dhusget.sh script will actually run and create its log files etc:

! work directory, will contain log files etc:
<rcbase>.work.dir         :  /scratch/tmp.DHuS

class cso_scihub.CSO_SciHub_Listing(rcfile, rcbase='', env={}, indent='')

Bases: utopya_rc.UtopyaRc

Create a listing file for files downloaded from SciHub or other portals that use equivalent filenames.

A listing file contains the names of the converted orbit files, the time range of pixels in the file, and other information extracted from the filenames:

filename ;mission;processing;product_id;start_time ;end_time ;orbit;collection;processor_version;processing_time
RPRO/CH4/2018/04/S5P_RPRO_L2__CH4____20180430T001851_20180430T020219_02818_01_010301_20190513T141133.nc;S5P ;RPRO ;L2__CH4___;2018-04-30T00:18:51;2018-04-30T02:02:19;02818;01 ;010301 ;2019-05-13T14:11:33
RPRO/CH4/2018/04/S5P_RPRO_L2__CH4____20180430T020021_20180430T034349_02819_01_010301_20190513T135953.nc;S5P ;RPRO ;L2__CH4___;2018-04-30T02:00:21;2018-04-30T03:43:49;02819;01 ;010301 ;2019-05-13T13:59:53
:
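
A listing in this layout can be read back with the standard `csv` module. The sketch below assumes a semicolon-separated table with a header line and values padded with spaces, as in the example; the function name `read_listing` is hypothetical.

```python
import csv


def read_listing(stream):
    """Read a listing table into a list of dicts, stripping the padding."""
    reader = csv.reader(stream, delimiter=";")
    header = [name.strip() for name in next(reader)]
    # skip empty lines; strip the spaces used to align the columns
    return [
        dict(zip(header, (value.strip() for value in row)))
        for row in reader
        if row
    ]
```

Each record then gives access to fields such as `orbit` or `processor_version` by name, for example to count the orbits per processor version.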

This file could be used to scan for available versions and how they were produced.

In the settings, define the name of the file to be created:

! csv file that will hold records per file with:
! - timerange of pixels in file
! - orbit number
! time templates are replaced with today's date
<rcbase>.file        :  /Scratch/Copernicus/S5p/listing-CH4__%Y-%m-%d.csv

An existing listing file is not replaced, unless the following flag is set:

! renew table?
<rcbase>.renew           :  True

Orbit files are searched within a timerange:

<rcbase>.timerange.start        :  2018-06-01 00:00
<rcbase>.timerange.end          :  2018-06-03 23:59

Specify filename patterns to search for orbit files; the patterns are relative to the base directory of the listing file, and may contain templates for the time values. Multiple patterns can be defined; if more than one file is found for a certain orbit number, the first match is used. This can be exploited to create a listing that combines reprocessed data with near-real-time data:

<rcbase>.patterns            :  RPRO/CH4/%Y/%m/S5p_*.nc                                         OFFL/CH4/%Y/%m/S5p_*.nc

class cso_scihub.CSO_SciHub_ListingPlot(rcfile, rcbase='', env={}, indent='')

Bases: utopya_rc.UtopyaRc

Create a timeseries plot of the number of orbits per processor version. The information is taken from the listing file created by the CSO_SciHub_Listing class.