Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
🟢 Good taste - Clear, honest documentation that solves a real problem.
This accurately reflects the current state: Apptainer is in the SDK but not wired into the benchmark CLI. The writing is pragmatic and gives users concrete paths forward on Docker-restricted systems. No bikeshedding, no pretending features exist that don't - just straightforward technical documentation.
Taste Rating: Elegant
Verdict: ✅ Ship it
Following up: I tried the same validation path with a public benchmark dataset instead of GAIA. Command attempted:
This is stronger evidence than the GAIA attempt because it used a public dataset and a published image:
The run reached
So at this point the blocker in this sandbox is no longer dataset access; it is the runtime environment itself. I still cannot complete Apptainer end-to-end validation here because:
The PR remains a draft. There is still insufficient evidence to merge it, since it has not been run end-to-end; end-to-end Apptainer validation is currently blocked on human QA.
Summary
- `--workspace apptainer` support in the shared parser/models and the supported runners
- `create_apptainer_workspace()` helper for pre-built agent-server images, with configurable Apptainer runtime env vars
- `--push`, improve the error message for local-only builds, and reuse cached SIFs from `APPTAINER_CACHE_DIR`
Testing
Evidence
- Command: `uv run swebench-infer .llm_config/example.json --dataset princeton-nlp/SWE-bench_Lite --split test --select <tmpfile containing astropy__astropy-12907> --workspace apptainer --num-workers 1 --max-iterations 1 --max-attempts 1`
- Reached `ApptainerWorkspace` initialization and resolved a published image successfully: `ghcr.io/openhands/eval-agent-server:bde715c-sweb.eval.x86_64.astropy_1776_astropy-12907-source-minimal`
- Error: `[Errno 2] No such file or directory: 'apptainer'`
- `apptainer` is not installed
- `/dev/fuse` is unavailable
- `/var/run/docker.sock` is unavailable, so a local Docker fallback is not possible here either
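The three environment blockers listed above could be detected up front rather than discovered mid-run. The following is a minimal sketch, not code from this PR: the function name `sandbox_blockers` and its injectable `which`/`exists` parameters are hypothetical, and whether each condition is strictly required depends on how Apptainer is set up on the host.

```python
import os
import shutil
from typing import Callable, List, Optional


def sandbox_blockers(
    which: Callable[[str], Optional[str]] = shutil.which,
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Return the reasons an Apptainer (or Docker fallback) run cannot start here.

    Hypothetical helper: it only mirrors the three conditions reported in
    this PR and does not call into the SDK or the benchmark CLI.
    """
    blockers: List[str] = []
    # The run failed with [Errno 2] because the binary was not on PATH.
    if which("apptainer") is None:
        blockers.append("apptainer is not installed")
    # Reported as unavailable in the sandbox.
    if not exists("/dev/fuse"):
        blockers.append("/dev/fuse is unavailable")
    # Without the Docker socket there is no local Docker fallback either.
    if not exists("/var/run/docker.sock"):
        blockers.append("/var/run/docker.sock is unavailable")
    return blockers


if __name__ == "__main__":
    for reason in sandbox_blockers():
        print(f"blocked: {reason}")
```

An empty list means none of the three reported blockers is present; it does not by itself prove that an end-to-end run will succeed.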