This repository contains scripts to download, preprocess, standardize, and consolidate the catalogues available in the CDS.
The software environment is the same one used by the c3s-atlas user tools: https://github.com/ecmwf-projects/c3s-atlas/blob/main/environment.yml
| Directory | Contents |
|---|---|
| requests | Contains one CSV file per CDS catalogue, listing the requested variables, temporal resolution, interpolation method, the target save directory, and whether the variable is raw or requires post-processing to be standardized. |
| provenance | Contains one JSON file per catalogue describing the provenance and definitions of each variable. |
| scripts/download | Python scripts to download data from the CDS. |
| scripts/standardization | Python recipes to standardize the variables. |
| scripts/derived | Python recipes to calculate derived products from the variables. |
| scripts/interpolation | Python recipes to interpolate data using reference grids. |
| scripts/catalogue | Python recipes to produce the catalogues of downloaded data. |
| catalogues | CSV catalogues of datasets consolidated in Lustre or GPFS. The catalogues are updated through a nightly CI job. |
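To illustrate how a request file might be consumed, here is a minimal sketch of parsing one CSV from `requests/`. The column names (`variable`, `temporal_resolution`, `interpolation`, `save_dir`, `raw`) are assumptions for illustration only; the files under `requests/` are the authoritative schema.

```python
import csv
import io

# Hypothetical requests CSV -- column names are assumed, not taken from the repo.
sample = """variable,temporal_resolution,interpolation,save_dir,raw
u10,hourly,native,raw/reanalysis-era5-single-levels,true
"""

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    # A "raw" variable is saved as downloaded; otherwise it needs standardization.
    status = "raw" if row["raw"] == "true" else "needs post-processing"
    print(f"{row['variable']} ({row['temporal_resolution']}): {status}")
```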
The repository uses a structured directory path format to organize downloaded, derived, and interpolated data:

```
{base_path}/{product_type}/{dataset}/{temporal_resolution}/{interpolation}/{variable}/
```
Examples:
- Raw ERA5 hourly wind components: `/lustre/.../raw/reanalysis-era5-single-levels/hourly/native/u10/`
Note: interpolated data is stored under the `derived` product type, with the `interpolation` field indicating the target grid (e.g., `gr006`). This distinguishes it from calculated variables, which use `interpolation=native`.
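The path convention above can be sketched as a small helper. The base path `/lustre/data` is a placeholder (the real base path is elided in the example above):

```python
from pathlib import PurePosixPath

def build_data_dir(base_path, product_type, dataset,
                   temporal_resolution, interpolation, variable):
    """Build a data directory following the
    {base_path}/{product_type}/{dataset}/{temporal_resolution}/{interpolation}/{variable}/
    convention."""
    return PurePosixPath(base_path, product_type, dataset,
                         temporal_resolution, interpolation, variable)

# Raw ERA5 hourly u10 on the native grid:
raw_dir = build_data_dir("/lustre/data", "raw",
                         "reanalysis-era5-single-levels",
                         "hourly", "native", "u10")
print(raw_dir)  # /lustre/data/raw/reanalysis-era5-single-levels/hourly/native/u10
```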
Files are named `{var}_{dataset}_{date}.nc`, where `{date}` is:
- `{year}{month}` for large datasets such as CERRA, which are saved month by month (downloading is faster this way).
- `{year}` for all other datasets, which are saved year by year.
Before downloading data, you can create the complete folder structure without downloading or calculating any data:

```shell
# Preview the directories that would be created (dry-run mode)
python scripts/create_folder_structure.py --dry-run

# Create all directories
python scripts/create_folder_structure.py
```

The script reads all CSV files in the `requests/` directory and creates the directory structure according to the format:

```
{base_path}/{product_type}/{dataset}/{temporal_resolution}/{interpolation}/{variable}/
```
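A minimal sketch of that dry-run logic, assuming each request entry carries the five path components (the entry keys and base path are placeholders, not the script's actual implementation):

```python
from pathlib import Path, PurePosixPath

def plan_directories(entries, base_path, dry_run=True):
    """List (dry_run=True) or create (dry_run=False) one directory per entry,
    following the {base_path}/{product_type}/{dataset}/{temporal_resolution}/
    {interpolation}/{variable}/ convention."""
    planned = []
    for e in entries:
        path = PurePosixPath(base_path, e["product_type"], e["dataset"],
                             e["temporal_resolution"], e["interpolation"],
                             e["variable"])
        planned.append(path)
        if dry_run:
            print(f"[dry-run] would create {path}")
        else:
            Path(str(path)).mkdir(parents=True, exist_ok=True)
    return planned

entries = [{"product_type": "raw", "dataset": "reanalysis-era5-single-levels",
            "temporal_resolution": "hourly", "interpolation": "native",
            "variable": "u10"}]
dirs = plan_directories(entries, "/lustre/data")
```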