Implement cdds_retrieve_data #628

@matthew-mizielinski

Description

We need a new tool that allows the user to:
A. list data in MASS
B. retrieve specified data to a location using the CMIP directory structure.

An example of the command I have in mind:

cdds_retrieve_data <moose base location> CMIP6.CMIP.MOHC.UKESM1-0-LL.piControl.r1i1p1f2 variable_file destination

where the variable_file contains

Amon.tas
Amon.pr

The command will need to retrieve the appropriate datasets and put them under the destination, e.g.

destination/CMIP6/CMIP/MOHC/UKESM1-0-LL/piControl/r1i1p1f2/Amon/
    tas/gn/v20200828/
        tas_Amon_UKESM1-0-LL_piControl_r1i1p1f2_gn_196001-204912.nc
        .
        .
        .
        tas_Amon_UKESM1-0-LL_piControl_r1i1p1f2_gn_375001-383912.nc
    pr/gn/v20200828/
        pr_Amon_UKESM1-0-LL_piControl_r1i1p1f2_gn_196001-204912.nc
        .
        .
        .
        pr_Amon_UKESM1-0-LL_piControl_r1i1p1f2_gn_375001-383912.nc
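
The mapping from the command arguments to the layout above could be sketched as follows. This is only an illustration, not an existing CDDS API: the function names `parse_variable_file` and `destination_dir` are hypothetical, and the sketch assumes the grid label (`gn`) and version (`v20200828`) are discovered from the MASS listing, since they do not appear in the variable file.

```python
from pathlib import Path


def parse_variable_file(text):
    """Return (mip_table, variable) pairs from lines such as 'Amon.tas'."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        mip_table, variable = line.split(".", 1)
        pairs.append((mip_table, variable))
    return pairs


def destination_dir(destination, dataset_id, mip_table, variable, grid, version):
    """Build the CMIP-style directory for one variable.

    Assumption: grid and version come from the MASS listing, not from the
    command line or the variable file.
    """
    facets = dataset_id.split(".")  # e.g. CMIP6, CMIP, MOHC, ...
    return Path(destination).joinpath(*facets, mip_table, variable, grid, version)


if __name__ == "__main__":
    dataset = "CMIP6.CMIP.MOHC.UKESM1-0-LL.piControl.r1i1p1f2"
    for table, var in parse_variable_file("Amon.tas\nAmon.pr\n"):
        print(destination_dir("destination", dataset, table, var, "gn", "v20200828"))
```
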

Start by implementing a dry-run option to check that the correct listing and retrieval commands would be issued.
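
One way the dry-run behaviour might look is a thin wrapper that either reports a command or executes it. This is a sketch under assumptions: `run_command` is a hypothetical helper, and the actual listing and retrieval commands are not specified in this issue, so the command passed in below is a placeholder.

```python
import shlex
import subprocess


def run_command(command, dry_run=True):
    """Run `command` (a list of strings), or only report it in dry-run mode."""
    printable = shlex.join(command)  # quote arguments for readable output
    if dry_run:
        print(f"[dry run] {printable}")
        return None
    # check=True raises CalledProcessError if the command fails
    return subprocess.run(command, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Placeholder command: the real listing command is not defined here.
    run_command(["placeholder-listing-command", "some/moose/location"])
```

Defaulting to `dry_run=True` in the sketch means nothing touches MASS unless the caller opts in, which suits the "start with a dry run" ordering suggested here.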

Data should be retrieved in reasonably sized chunks (i.e. not one file at a time) to a staging directory and then moved to its correct location once the extraction completes successfully. Files that are already on disk in the correct location should not be replaced.
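
A minimal sketch of the chunk, stage and move logic, assuming files are moved after each chunk succeeds (one possible reading of the requirement). All names are hypothetical; in particular `fetch_chunk` stands in for the real MASS retrieval, and the chunk size of 50 is an arbitrary placeholder.

```python
import shutil
from pathlib import Path


def files_to_fetch(filenames, final_dir):
    """Drop files that already exist in their final location."""
    final_dir = Path(final_dir)
    return [name for name in filenames if not (final_dir / name).exists()]


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def retrieve(filenames, staging_dir, final_dir, fetch_chunk, chunk_size=50):
    """Fetch missing files chunk by chunk, moving each chunk once it succeeds.

    fetch_chunk(names, staging_dir) is a hypothetical callable standing in
    for the real retrieval; it is expected to raise on failure.
    """
    staging_dir, final_dir = Path(staging_dir), Path(final_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    final_dir.mkdir(parents=True, exist_ok=True)
    for chunk in chunked(files_to_fetch(filenames, final_dir), chunk_size):
        fetch_chunk(chunk, staging_dir)  # a failure here leaves final_dir untouched
        for name in chunk:
            target = final_dir / name
            if not target.exists():  # never replace a file already in place
                shutil.move(str(staging_dir / name), str(target))
```

Filtering out existing files before chunking means a rerun after a partial failure only requests what is still missing.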

Note that the CDDS inventory will have this information for CMIP6, but we do not have this for every project.
