ci: increase SWT-Bench default parallelism to 16 workers / 50 batch #534

Closed
simonrosenberg wants to merge 2 commits into main from fix/swtbench-configurable-parallelism

@simonrosenberg
Collaborator

Summary

Raises the default parallelism for the SWT-Bench image build workflow:

  • max-workers: 4 → 16
  • build-batch-size: 15 → 50

These were reduced to 4/15 as a safety measure during the image build regression investigation. With the 24-hour timeout (#528) and sdist caching (#515) now merged, higher parallelism can be restored.

The inputs remain fully configurable — manual workflow_dispatch runs can still override to any value (e.g., max-workers=4 for debugging).
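As a rough illustration of how a configurable default with a manual override resolves, here is a minimal shell sketch. The `resolve_input` helper and the `INPUT_MAX_WORKERS` / `INPUT_BUILD_BATCH_SIZE` variable names are hypothetical and do not come from the workflow file; only the default values 16 and 50 are taken from this PR.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the override ($1) if it is non-empty,
# otherwise print the default ($2).
resolve_input() {
  printf '%s\n' "${1:-$2}"
}

# INPUT_* names are assumptions standing in for the workflow_dispatch inputs.
# The ":-" expansion keeps `set -u` from failing when they are not set.
MAX_WORKERS="$(resolve_input "${INPUT_MAX_WORKERS:-}" 16)"
BUILD_BATCH_SIZE="$(resolve_input "${INPUT_BUILD_BATCH_SIZE:-}" 50)"

echo "max-workers=${MAX_WORKERS} build-batch-size=${BUILD_BATCH_SIZE}"
```

Running the sketch with `INPUT_MAX_WORKERS=4` set in the environment would pick 4 instead of the default, which mirrors the debugging override mentioned above.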

What this PR does NOT change:

  • No SDK submodule bump
  • No cache mode changes (OPENHANDS_BUILDKIT_CACHE_MODE is not set — SDK defaults apply)

Test plan

  • Pre-commit checks pass
  • Trigger a manual SWT-bench build to verify the new defaults work

Related: #531

Raise the default `max-workers` from 4 to 16 and `build-batch-size`
from 15 to 50 for the SWT-Bench image build workflow.

The previous defaults were reduced as a safety measure during the
image build regression investigation. With the 24-hour timeout (#528)
and sdist caching (#515) now in place, higher parallelism can be
restored. The inputs remain fully configurable for manual runs that
need different settings.

Related: #531

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - This is fine. You're changing numbers in a YAML file after fixing the root causes. The defaults are duplicated across inputs/env/fallbacks, but that's how GitHub Actions workflow_dispatch works — over-engineering a "single source of truth" would add more complexity than the duplication costs. Easily reversible if the parallelism causes issues. Ship it.

…ce of truth

The job-level env block duplicated default values already defined in
the workflow inputs. Every step that uses these vars re-assigns them
from inputs with inline fallbacks, so the env defaults were never
actually read.

Remove DATASET, SPLIT, MAX_WORKERS, BUILD_BATCH_SIZE, and N_LIMIT
from the env block. Keep only INSTANCE_IDS and SELECT_FILE which
need initialization for set -euo pipefail (they can be legitimately
empty).
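The reason an empty-but-initialized variable matters under `set -euo pipefail` comes from the `-u` (nounset) part: expanding a variable that was never assigned is an error, while expanding one that is assigned to the empty string is fine. The sketch below demonstrates that difference in isolation. The `expands_under_nounset` helper and the `UNSET_DEMO_VAR` name are hypothetical; `INSTANCE_IDS` is used only as an example name from the commit message, not as a copy of the workflow's actual steps.

```shell
#!/bin/sh

# Exit status 0 if the variable named by $1 can be expanded under nounset,
# non-zero otherwise.
expands_under_nounset() {
  # Run in a subshell so a nounset failure ends only the subshell,
  # not the calling script. The shell's error message is discarded.
  ( set -u; eval ": \"\${$1}\"" ) 2>/dev/null
}

unset UNSET_DEMO_VAR   # hypothetical variable that is never assigned
INSTANCE_IDS=""        # assigned, but legitimately empty

if expands_under_nounset INSTANCE_IDS; then
  echo "INSTANCE_IDS: expands (set but empty)"
fi

if ! expands_under_nounset UNSET_DEMO_VAR; then
  echo "UNSET_DEMO_VAR: unbound under set -u"
fi
```

An inline fallback such as `"${UNSET_DEMO_VAR:-}"` also expands safely under nounset, which is consistent with the commit's observation that steps using inline fallbacks do not need a job-level default.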

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@simonrosenberg
Collaborator Author

Closing — the defaults are already configurable via workflow_dispatch inputs. Before changing defaults, we should do a proper grid search to find the optimal values.
