Introduce file staging delegation via JSON staging manifest #399

Open
wants to merge 9 commits into
base: master

@kysrpex kysrpex commented Jun 30, 2025

Introduce a JSON transfer action that tells the Pulsar server to create a JSON staging manifest. An external system that can move files in and out of the compute environment can then use this manifest to stage the files itself. This should be the general strategy for collecting input and output files for ARC, DIRAC, AWS Batch, etc.

Before launching jobs, Pulsar clients receive the staging manifest as a dictionary. When the job is complete, the script pulsar-create-output-manifest can be used to create a manifest declaring which files should be staged out by the external system.
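
As a rough illustration of the flow described above, the sketch below shows how an external stager might consume a staging manifest before the job runs and declare outputs afterwards. Everything specific in it is an assumption made for the example: the manifest keys (`inputs`, `outputs`, `url`, `path`) and the helpers `stage_in` and `write_output_manifest` are hypothetical and do not reflect the actual schema or the interface of `pulsar-create-output-manifest` in this PR. Only `file://` URLs are handled, to keep the sketch self-contained.

```python
import json
import shutil
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def stage_in(manifest: dict, job_dir: Path) -> list:
    """Copy each input listed in the manifest into job_dir (file:// URLs only)."""
    staged = []
    for entry in manifest["inputs"]:  # "inputs", "url", "path" are assumed key names
        parsed = urlparse(entry["url"])
        if parsed.scheme != "file":
            raise ValueError(f"unsupported scheme in this sketch: {parsed.scheme!r}")
        source = Path(url2pathname(parsed.path))
        target = job_dir / entry["path"]
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        staged.append(target)
    return staged


def write_output_manifest(job_dir: Path, relative_paths: list, manifest_path: Path) -> dict:
    """Declare which files under job_dir the external system should stage out."""
    manifest = {
        "outputs": [
            {"path": rel, "url": (job_dir / rel).resolve().as_uri()}
            for rel in relative_paths
            if (job_dir / rel).is_file()  # skip outputs the job did not produce
        ]
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        source = root / "dataset_1.dat"
        source.write_text("hello")
        job_dir = root / "job"
        job_dir.mkdir()
        # Stage one input into the job directory, then declare it as an output.
        stage_in({"inputs": [{"url": source.as_uri(), "path": "inputs/dataset_1.dat"}]}, job_dir)
        print(write_output_manifest(job_dir, ["inputs/dataset_1.dat"], root / "out.json"))
```

The point of the split is that Pulsar itself never touches the files in this mode: it only describes what has to move, and the external system (ARC, DIRAC, AWS Batch, ...) performs the transfers with whatever mechanism it already has.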

This PR also updates the coexecutor Dockerfile so that it works again and so that it builds the Pulsar wheel (no need to build it separately anymore).

kysrpex commented Jun 30, 2025

@mvdbeek That's all I needed to build the ARC integration (coming soon as a different PR). You may want to compare it to your branch offline_connector_marius. I stripped out the Pulsar client (that comes in the other PR), the tests (please let me know which tests should make it into the PR) and a couple of things that were not needed to complete the integration.

mvdbeek and others added 6 commits June 30, 2025 17:07
Change base image from `conda/miniconda3` (based off Debian Stretch) to `python:3.12-bookworm`. Miniconda is not required in the base image.

Add the Galaxy Depot repository, which provides SLURM DRMAA packages for Debian Buster and newer releases.

Do not install the package `apt-transport-https`; it is now a dummy package (see https://packages.debian.org/en/bookworm/apt-transport-https). Install the package `slurm` instead of `slurm-llnl`.

Newer versions of the `munge` package include the binary `/usr/sbin/mungekey` instead of `/usr/sbin/create-munge-key`. Nevertheless, the key seems to be created automatically when installing the package, as running `mungekey` yields 'mungekey: Error: Failed to create "/etc/munge/munge.key": File exists'.

Build wheel automatically when building the Docker image. Exclude the source code from the output image through a multistage build.

…oexecutionLaunchMixin` to `BaseRemoteConfiguredJobClient`
@kysrpex kysrpex force-pushed the offline_collector branch from 1e1a3c7 to 7c8371a Compare June 30, 2025 15:07

kysrpex commented Jun 30, 2025

If this one is fine, then #370 should be closed.

@@ -0,0 +1,45 @@
import argparse

@kysrpex kysrpex Jul 3, 2025

@jmchilton commented:

A comment at the top of this file would be appreciated as well.

-FROM conda/miniconda3
+FROM python:3.12-bookworm

@jmchilton commented:

I don't think it is fair at all to say "Miniconda" is not required in the base image, as the commit message suggests. I get that it isn't the modality that you wish to run it in, but it is documented in https://pulsar.readthedocs.io/en/latest/containers.html#co-execution as an option, and it is an option that makes a lot of sense to me. Is the slurm stuff required for your use case?

2 participants