Enable Swell to use GMAO AWS R2D2 server and S3 data store #734

Open
ftgoktas wants to merge 31 commits into develop from feature/aws-datastore

Conversation

@ftgoktas
Member

@ftgoktas ftgoktas commented Mar 11, 2026

Description

The AWS R2D2 server is ready and configured for GMAO users. Users and API keys can be created, and Swell can use this server to ingest and fetch observations via S3 instead of the JCSDA server and Discover local storage.

See the configuration guide for setup steps, example configs and IngestObs test commands.

Closes #733

Comment thread docs/configuring_aws_server.md Outdated
Comment thread docs/configuring_aws_server.md Outdated
Comment thread docs/configuring_aws_server.md Outdated
Comment thread docs/configuring_aws_server.md Outdated
Comment thread src/swell/utilities/scripts/prod_setup_env.sh Outdated
Comment thread src/swell/utilities/scripts/prod_setup_env.sh Outdated
Comment thread src/swell/utilities/r2d2.py Outdated
@ftgoktas
Member Author

ftgoktas commented Mar 17, 2026

The AWS R2D2 server is up and running from 8:30am EST to 5:30pm EST on weekdays. Lifetime values and model names have been registered in the database with the following configuration:

Lifetimes:

  • debug: 14 days
  • science: 180 days
  • publication: 1825 days (5 years)
  • release: indefinite

Models:

  • geos
  • mom6
  • geos_cf
  • mom6_cice6_UFS
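
As a rough illustration of what the registered lifetimes mean in practice, the sketch below turns a lifetime name into an expiry date. The `LIFETIME_DAYS` table and the `expiry_date` helper are hypothetical names used only for this example; they are not part of R2D2 or Swell, and the server may compute retention differently.

```python
from datetime import date, timedelta
from typing import Optional

# Lifetime values as listed above. None stands for "indefinite".
# This table is illustrative only; the real values live in the server database.
LIFETIME_DAYS = {
    "debug": 14,
    "science": 180,
    "publication": 1825,  # 5 * 365 days
    "release": None,
}


def expiry_date(lifetime: str, created: date) -> Optional[date]:
    """Return the date on which data would expire, or None if it never does."""
    days = LIFETIME_DAYS[lifetime]  # raises KeyError for unregistered names
    if days is None:
        return None
    return created + timedelta(days=days)
```

Under these assumptions, data ingested with the `debug` lifetime on 2026-03-17 would expire on 2026-03-31, while `release` data would have no expiry date.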

Please follow the instructions in the configuration guide to test against the server. Let us know if you run into any issues or have questions.

@ftgoktas ftgoktas marked this pull request as ready for review March 17, 2026 19:37
Comment thread docs/configuring_aws_server.md Outdated
Comment thread src/swell/utilities/r2d2.py Outdated
source venv_client/bin/activate
export R2D2_HOST="discover"
export R2D2_COMPILER="intel"
export R2D2_SERVER_HOST="http://13.217.72.149"
Contributor

Should we replace this hardcoded server IP with a hostname or config variable? Should we also avoid plaintext AWS keys unless that is truly expected for our operations? @shiklomanov-an?

Member Author

The IP is an Elastic IP so it won't change, but agreed a DNS hostname would be cleaner long-term.

The server now supports two data stores: the local one on Discover (geos_cf directory with priority 1) and the S3 bucket (priority 2). For Discover, users only need R2D2_USER and R2D2_API_KEY and no AWS credentials. AWS credentials are only required when accessing the S3 data store.
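
The credential rule described here could be checked up front with something like the sketch below. `R2D2_USER` and `R2D2_API_KEY` come from this comment; the AWS variable names are the standard AWS SDK environment variables and are only an assumption about how Swell would receive them. The `store_type` labels and the `missing_credentials` helper are hypothetical.

```python
import os
from typing import List, Mapping

# Always required, per the comment above.
R2D2_VARS = ("R2D2_USER", "R2D2_API_KEY")
# Standard AWS SDK variable names; assumed here, not confirmed by the PR.
AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(
    store_type: str, env: Mapping[str, str] = os.environ
) -> List[str]:
    """List the variables that are still unset for the given store type.

    store_type is a hypothetical label: "local" for the Discover data
    store, "s3" for the S3 bucket.
    """
    if store_type not in ("local", "s3"):
        raise ValueError(f"Unknown store type: {store_type!r}")
    required = R2D2_VARS + (AWS_VARS if store_type == "s3" else ())
    return [name for name in required if not env.get(name)]
```

Calling `missing_credentials("s3")` before an ingest would report unset AWS variables early, rather than failing partway through a transfer.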

Contributor

@jeromebarre jeromebarre left a comment

Looks good to me but see my minor comments

Collaborator

@mranst mranst left a comment

Sorry it took me a while to review this. I was able to ingest and fetch obs from AWS, but I found that having to swap out the credentials file made it difficult to switch between working on other things. I wonder if we could make use of the dictionary structure in the yaml file to allow for datastores to be selected using an experiment key. Maybe something like:

r2d2_credentials.yaml

jcsda:
  user: ...
  api_key: ...
  ...

gmao_aws:
  user: ...
  api_key: ...
  ...

Swell could then select which of those credentials to use based on a key called r2d2_datastore, or something like that
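
A minimal sketch of that selection step, assuming the YAML file has already been parsed into a nested dictionary shaped like the example above. The `select_credentials` helper and the placeholder values are hypothetical, not existing Swell code.

```python
from collections.abc import Mapping
from typing import Any


def select_credentials(
    all_credentials: Mapping[str, Any], datastore_key: str
) -> Mapping[str, Any]:
    """Pick one credentials block from a mapping keyed by datastore name."""
    try:
        selected = all_credentials[datastore_key]
    except KeyError:
        available = ", ".join(sorted(all_credentials)) or "none"
        raise KeyError(
            f"No credentials for {datastore_key!r}; available: {available}"
        ) from None
    if not isinstance(selected, Mapping):
        raise TypeError(f"Credentials for {datastore_key!r} must be a mapping")
    return selected


# Stand-in for the parsed r2d2_credentials.yaml; all values are placeholders.
credentials = {
    "jcsda": {"user": "alice", "api_key": "jcsda-key"},
    "gmao_aws": {"user": "alice", "api_key": "gmao-key"},
}
print(select_credentials(credentials, "gmao_aws")["api_key"])  # prints gmao-key
```

Listing the available keys in the error message should help when the experiment key is misspelled.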

Comment thread docs/examples/r2d2/ingest_obs.md
@ftgoktas ftgoktas marked this pull request as draft April 20, 2026 18:13
@ftgoktas ftgoktas marked this pull request as ready for review April 21, 2026 19:52
@ftgoktas
Member Author

Tested on Discover with both JCSDA's and GMAO's R2D2 servers. Users can now select which R2D2 server and datastore to use directly from experiment.yaml via r2d2_server and r2d2_datastore fields, without needing to modify credentials or code between runs. It defaults to gmao_server and the r2d2-experiments-prod-us-east-1 S3 datastore. Updated the docs with more details for testing: Configuring different R2D2 servers and datastores with Swell on Discover
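
For readers skimming the thread, the defaulting behaviour described here might look roughly like the sketch below. The field names and default values are taken from this comment; the `resolve_r2d2_target` helper is hypothetical, and applying the two defaults independently of each other is an assumption rather than the PR's confirmed logic.

```python
from typing import Mapping, Optional, Tuple

# Defaults as stated in the comment above.
DEFAULT_SERVER = "gmao_server"
DEFAULT_DATASTORE = "r2d2-experiments-prod-us-east-1"


def resolve_r2d2_target(
    experiment: Mapping[str, Optional[str]]
) -> Tuple[str, str]:
    """Return (server, datastore), falling back to the defaults.

    `experiment` stands in for the parsed experiment.yaml. A missing,
    empty, or null field falls back to its default.
    """
    server = experiment.get("r2d2_server") or DEFAULT_SERVER
    datastore = experiment.get("r2d2_datastore") or DEFAULT_DATASTORE
    return server, datastore
```

An experiment.yaml that sets neither field would resolve to `("gmao_server", "r2d2-experiments-prod-us-east-1")` under this sketch.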

Collaborator

@mranst mranst left a comment

Thanks for accommodating my request; this is looking great. I have a few last points, the largest of which is maintaining continuity once this PR goes in. I'd like to keep the number of changes that users need to make to their environment to run Swell as low as possible. Configuring R2D2 is probably the biggest stumbling point users encounter when trying to run Swell, so if possible I think it's best to maintain compatibility with the current format of the credentials file.

Comment thread src/swell/utilities/question_defaults.py
Comment on lines 172 to +179
  try:
      yaml = YAML(typ='safe')
      with open(yaml_path, 'r') as yaml_file:
-         credentials = yaml.load(yaml_file)
+         credentials_yaml = yaml.load(yaml_file) or {}
  except Exception as e:
      logger.error(f"Error loading R2D2 credentials from {yaml_path}: {e}")
      logger.info("Continuing with existing environment variables...")
-     credentials = {}
+     credentials_yaml = {}
Collaborator

Is there a common type of exception you're expecting here?
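
One way to narrow the handler is to treat file-access failures and parse failures separately, as in the sketch below. To stay independent of any particular YAML library, the parser and its error types are passed in as parameters; the `load_credentials` signature is hypothetical and is not the function in this PR.

```python
import logging
from typing import Callable, Tuple, Type

logger = logging.getLogger(__name__)


def load_credentials(
    yaml_path: str,
    parse: Callable[[str], object],
    parse_errors: Tuple[Type[Exception], ...],
) -> dict:
    """Load a credentials mapping, returning {} on the failures we expect."""
    try:
        with open(yaml_path, "r") as yaml_file:
            text = yaml_file.read()
    except OSError as e:  # missing file, permission denied, path is a directory
        logger.error(f"Cannot read R2D2 credentials file {yaml_path}: {e}")
        return {}
    try:
        data = parse(text)
    except parse_errors as e:  # malformed file contents
        logger.error(f"Cannot parse R2D2 credentials file {yaml_path}: {e}")
        return {}
    if data is None:  # an empty file parses to None
        return {}
    if not isinstance(data, dict):
        logger.error(f"Expected a mapping at the top level of {yaml_path}")
        return {}
    return data
```

With ruamel.yaml, `parse` could presumably be `YAML(typ='safe').load` and `parse_errors` could be `(YAMLError,)` from `ruamel.yaml.error`, though that pairing is an assumption about what the loader raises here. Anything outside those two categories would then propagate instead of being silently swallowed.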

Comment thread src/swell/utilities/r2d2.py Outdated
Comment thread src/swell/utilities/r2d2.py Outdated
Comment thread src/swell/utilities/r2d2.py Outdated
Comment on lines 232 to +250
  # Set host and compiler (YAML config takes precedence over platform detection)
- if 'host' in credentials and 'R2D2_HOST' not in os.environ:
-     os.environ['R2D2_HOST'] = credentials['host']
-     logger.info(f"Using platform host '{r2d2_host}' (overriding YAML '{credentials['host']}')")
-     logger.warning("Using host from YAML file")
+ if 'r2d2_host' in credentials and 'R2D2_HOST' not in os.environ:
+     os.environ['R2D2_HOST'] = credentials['r2d2_host']
+     logger.info(
+         f"YAML r2d2_host ({credentials['r2d2_host']!r}) overrides "
+         f"platform default ({r2d2_host!r})"
+     )

  elif r2d2_host and 'R2D2_HOST' not in os.environ:
      os.environ['R2D2_HOST'] = r2d2_host
      logger.info(f"Set R2D2_HOST={r2d2_host} from platform configuration")

  # Set compiler
- if 'compiler' in credentials and 'R2D2_COMPILER' not in os.environ:
-     os.environ['R2D2_COMPILER'] = credentials['compiler']
-     logger.info(f"Using platform compiler '{r2d2_compiler}' \
- (overriding YAML '{credentials['compiler']}')")
-     logger.warning("Using compiler from YAML file")
+ if 'r2d2_compiler' in credentials and 'R2D2_COMPILER' not in os.environ:
+     os.environ['R2D2_COMPILER'] = credentials['r2d2_compiler']
+     logger.info(
+         f"YAML r2d2_compiler ({credentials['r2d2_compiler']!r}) overrides "
+         f"platform default ({r2d2_compiler!r})"
+     )
Collaborator

I can understand changing host to the more-specific r2d2_host now that we also have r2d2_server_host in the mix. My only concern is that this will require users to change this file once this PR goes in. Maybe we can consider accommodating both keys, keeping host and compiler as functional but deprecated? We can even throw a deprecation warning if the old names are detected.
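
A minimal sketch of that backward-compatible approach, assuming the credentials have already been loaded into a dictionary. The `normalize_credentials` helper and the `DEPRECATED_KEYS` table are hypothetical names; letting the new key win when both are present is a design choice made for this example.

```python
import warnings
from typing import Any, Dict, Mapping

# Old key -> new key, following the renames discussed above.
DEPRECATED_KEYS = {"host": "r2d2_host", "compiler": "r2d2_compiler"}


def normalize_credentials(credentials: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the credentials that uses only the new key names."""
    normalized = dict(credentials)
    for old, new in DEPRECATED_KEYS.items():
        if old not in normalized:
            continue
        value = normalized.pop(old)
        warnings.warn(
            f"Credentials key {old!r} is deprecated; use {new!r} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # If both spellings are present, the new name takes precedence.
        normalized.setdefault(new, value)
    return normalized
```

Running this once right after loading the file would let the rest of the code read only `r2d2_host` and `r2d2_compiler`. Python hides `DeprecationWarning` by default unless it is triggered from `__main__`, so a `logger.warning` call or a `FutureWarning` may be a better fit if end users are meant to see the message.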

Comment thread docs/configuring_aws_server.md Outdated
Comment thread r2d2_credentials.yaml
# Platform configuration
host: discover-gmao
compiler: intel
1. Single server — put credentials at the root level:
Contributor

Some lines in this yaml file should be comments?

@dataclass
class r2d2_server(SuiteQuestion):
    default_value: str = "gmao_server"
    default_value: str | None = None
Contributor

default_value is defined twice here
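
To show why the duplicate matters: Python does not reject a repeated annotated name in a class body, and the later definition silently replaces the earlier one. The sketch below uses a hypothetical stand-in class without the `SuiteQuestion` base, so it only demonstrates the plain dataclass behaviour.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class r2d2_server_demo:  # hypothetical stand-in, not the class in this PR
    default_value: str = "gmao_server"
    default_value: Optional[str] = None  # rebinds the name; this one wins


print([f.name for f in fields(r2d2_server_demo)])  # prints ['default_value']
print(r2d2_server_demo().default_value)  # prints None
```

The class ends up with a single field whose default is `None`, so the `"gmao_server"` line has no effect and can be dropped if `None` is the intended default.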

Development

Successfully merging this pull request may close these issues.

R2D2 GMAO Server

5 participants