
feat: Cloud modal mvp #982

Open
int-chaos wants to merge 32 commits into main from cloud-modal-mvp

Conversation

@int-chaos
Collaborator

@int-chaos int-chaos commented Mar 7, 2026

feat: run MassGen as a cloud job (single-agent MVP) on Modal

Address Issue #885

Type of change

  • New feature (feat:) - Non-breaking change which adds functionality

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add test method for this PR.

Test CLI Command

Write down the test bash command. If there are prerequisites, please emphasize them.

Expected Results

Description/screenshots of expected results.

Additional context

Add any other context about the PR here.

Summary by CodeRabbit

  • New Features

    • Run MassGen jobs in the cloud via a new --cloud option (with --cloud-timeout); progress streams to your terminal and results are saved locally under .massgen/cloud_jobs/.../artifacts/ and emitted as machine-readable outputs.
  • Documentation

    • Added a MassGen Cloud user guide with setup, Quick Start, and example cloud run commands.
  • Tests

    • Added tests for cloud launcher behavior, progress streaming, artifact extraction, and context upload handling.
  • Chores

    • Added optional Modal dependency for cloud execution.

@coderabbitai

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds an MVP Modal cloud execution path: CLI flags and runner to launch single-agent MassGen jobs on Modal, plus a Modal app, launcher, utilities for context/artifact handling, tests, and user documentation for cloud runs and local artifact materialization.

Changes

Cohort / File(s) | Summary
Cloud core API
massgen/cloud/__init__.py, massgen/cloud/cloud_job.py
Add cloud job types and launcher interface: CloudJobError, CloudJobRequest, CloudJobResult, CloudJobLauncher and workspace/marker conventions.
Modal app & launcher
massgen/cloud/modal_app.py, massgen/cloud/modal_launcher.py
New Modal app entrypoint and ModalCloudJobLauncher: payload encoding, remote invocation, stream handling, result-marker parsing, artifact tarballing/extraction, and error/timeout handling.
Cloud utilities
massgen/cloud/utils.py
Helpers for base64 tarball create/extract, automation-value parsing, Modal context constants, and process_context_paths to upload/rewrite context entries (preserves metadata).
CLI integration
massgen/cli.py
New --cloud and --cloud-timeout flags and _run_cloud_job flow: rewrites context paths, enforces single-agent for cloud, launches remote job, materializes artifacts locally, and emits automation-mode machine-readable outputs; maps cloud timeout errors to EXIT_TIMEOUT.
Tests
massgen/tests/test_cloud_modal_launcher.py
Unit tests for Modal launcher and utils: artifact extraction, progress streaming, and context path processing (uses mocked Volume/Popen).
Docs & deps
docs/source/user_guide/cloud.rst, requirements.txt
Add user guide for cloud runs and Quick Start; add optional dependency modal>=1.3.4.
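
The "Cloud core API" row above names CloudJobError, CloudJobRequest, CloudJobResult, and CloudJobLauncher without showing their shapes. A minimal sketch of what they might look like; field names beyond those mentioned in this walkthrough (prompt, config_yaml, timeout, artifacts) are assumptions, not the PR's actual definitions:

```python
# Hedged sketch of the cloud core API named in the walkthrough above.
# Field names not mentioned in the summary are illustrative assumptions.
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Optional


class CloudJobError(Exception):
    """Raised when a cloud job fails to launch, run, or return results."""


@dataclass
class CloudJobRequest:
    prompt: str
    config_yaml: str
    timeout_seconds: int = 1800
    env: Dict[str, str] = field(default_factory=dict)


@dataclass
class CloudJobResult:
    final_answer: str
    artifact_dir: Optional[Path] = None


class CloudJobLauncher:
    """Base launcher; concrete backends (e.g. Modal) override launch()."""

    def __init__(self, workspace_root: Optional[Path] = None):
        base = workspace_root or (Path.cwd() / ".massgen" / "cloud_jobs")
        self.workspace_root = base
        self.workspace_root.mkdir(parents=True, exist_ok=True)

    def launch(self, request: CloudJobRequest) -> CloudJobResult:
        raise NotImplementedError
```

The __init__ body here mirrors the snippet quoted later in the review; everything else is a sketch.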

Sequence Diagram(s)

sequenceDiagram
    participant U as User/CLI
    participant C as MassGen CLI
    participant L as ModalCloudJobLauncher
    participant S as Modal Service
    participant R as Remote App
    participant F as Local FS

    U->>C: run massgen --cloud ...
    C->>C: validate single-agent config
    C->>L: launch(CloudJobRequest)
    L->>S: invoke run_massgen_job(payload_b64)
    S->>R: start remote execution
    R->>R: decode payload, write config, upload context
    R->>R: run massgen subprocess, stream stdout/stderr
    R->>R: create artifacts tar.gz (base64) and emit RESULT_MARKER(JSON)
    S-->>L: stream stdout (contains RESULT_MARKER)
    L->>L: parse marker, decode/extract artifacts
    L->>F: write .massgen/cloud_jobs/job_{id}/artifacts/
    L-->>C: return CloudJobResult (final_answer, paths)
    C->>U: print results and artifact locations
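
The "emit RESULT_MARKER(JSON)" step in the diagram can be sketched as follows: the remote app prints one machine-readable line to stdout for the launcher to find. The MARKER string and payload keys here are illustrative placeholders, not the PR's actual RESULT_MARKER constant:

```python
# Hedged sketch of the result-marker step from the sequence diagram above.
# MARKER and the payload keys are hypothetical placeholders.
import json

MARKER = "MASSGEN_RESULT::"  # hypothetical marker prefix

def emit_result(final_answer: str, artifacts_tar_gz_b64: str) -> str:
    payload = {
        "status": "completed",
        "final_answer": final_answer,
        "artifacts_tar_gz_b64": artifacts_tar_gz_b64,
    }
    line = MARKER + json.dumps(payload)
    print(line, flush=True)  # the launcher scans stdout for this line
    return line
```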

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Suggested reviewers

  • ncrispino
  • a5507203
🚥 Pre-merge checks | ✅ 3 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name | Status | Explanation | Resolution

Docstring Coverage: ⚠️ Warning. Docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

Documentation Updated: ⚠️ Warning. The pull request is missing Google-style docstrings for four utility functions and has unresolved critical issues with type annotations and validation. Resolution: add docstrings to parse_automation_value, make_tar_gz_b64, extract_artifacts, and process_context_paths; fix the type annotation and add cloud_job_id validation.

Title check: ❓ Inconclusive. The title 'feat: Cloud modal mvp' references a real part of the changeset (the cloud Modal implementation) but is vague, using the unexplained acronym 'mvp' and failing to convey the main feature (MassGen cloud job execution). Resolution: consider a more descriptive title like 'feat: Add MassGen cloud job execution on Modal' that clearly communicates the feature without unclear acronyms.
✅ Passed checks (3 passed)
Check name | Status | Explanation

Description check: ✅ Passed. The PR description is largely complete, with the correct format, type of change, and most checklist items marked, but the testing sections contain only placeholders with no concrete test commands or expected results.

Capabilities Registry Check: ✅ Passed. This PR introduces cloud execution capabilities through new cloud modules without modifying backend or model definitions, so capabilities registry updates are not required.

Config Parameter Sync: ✅ Passed. The PR does not introduce new YAML parameters; the cloud feature only uses existing parameters like context_paths and ui.display_type, which are already in the exclusion lists.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.

@int-chaos int-chaos marked this pull request as ready for review March 8, 2026 02:04

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (6)
docs/source/user_guide/cloud.rst (1)

22-30: Show users where {job_id} comes from.

Line 30 introduces the artifact path template, but the quick start never shows the progress/completion output that contains that generated id. A short success transcript would make it obvious how to find .massgen/cloud_jobs/job_<id>/artifacts/.

As per coding guidelines, "Documentation for new features must include runnable commands with expected output."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/user_guide/cloud.rst` around lines 22 - 30, Add a short, runnable
success transcript to docs/source/user_guide/cloud.rst that follows the existing
example command (massgen --cloud --config config.yaml "Your question") and shows
the terminal output that contains the generated job id; specifically include the
line from the tool that prints the created job id (so readers can copy that id)
and a final line demonstrating the concrete artifact path
.massgen/cloud_jobs/job_<job_id>/artifacts/ using that printed id. Ensure the
transcript is realistic, minimal, and clearly labels which line is the job id so
users can locate their artifacts.
massgen/cli.py (1)

463-464: Add Google-style docstrings to the new cloud helpers.

Both helpers are new/changed but only have summary lines; the repo standard for changed Python functions is Google-style Args/Returns documentation.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings.

Also applies to: 471-472

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 463 - 464, The new cloud helper functions (e.g.,
_is_single_agent_config(config: Dict[str, Any]) -> bool and the other helper
added in this change) only have summary lines; update each to use Google-style
docstrings including an Args section documenting parameters and types (for
_is_single_agent_config: config: Dict[str, Any]), a Returns section describing
the return type and meaning (bool indicating whether exactly one agent is
defined), and a short description line—apply the same pattern to the other
new/changed cloud helper(s) in this diff so all new/changed Python functions
follow the repo’s Google-style docstring convention.
massgen/cloud/cloud_job.py (1)

59-65: Add Google-style docstrings to __init__ and launch methods.

Per coding guidelines, new functions should include Google-style docstrings. Both __init__ and launch lack documentation.

📝 Proposed docstrings
     def __init__(self, workspace_root: Optional[Path] = None):
+        """Initialize the cloud job launcher.
+
+        Args:
+            workspace_root: Optional root directory for job artifacts.
+                Defaults to .massgen/cloud_jobs in the current directory.
+        """
         base = workspace_root or (Path.cwd() / ".massgen" / "cloud_jobs")
         self.workspace_root = base
         self.workspace_root.mkdir(parents=True, exist_ok=True)

     def launch(self, request: CloudJobRequest) -> CloudJobResult:
+        """Launch a cloud job with the given request.
+
+        Args:
+            request: The cloud job request payload.
+
+        Returns:
+            CloudJobResult containing job output and artifacts.
+
+        Raises:
+            NotImplementedError: Must be implemented by subclasses.
+        """
         raise NotImplementedError

As per coding guidelines: "For new or changed functions, include Google-style docstrings"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/cloud_job.py` around lines 59 - 65, Add Google-style docstrings
for the CloudJob class's __init__ and launch methods: for __init__ document
parameters (workspace_root: Optional[Path]), describe default behavior (using
Path.cwd() / ".massgen" / "cloud_jobs"), side effects (creating the directory
via self.workspace_root.mkdir(parents=True, exist_ok=True)), and set the
instance attribute self.workspace_root; for launch document the parameter type
(request: CloudJobRequest), the return type (CloudJobResult), expected behavior
(abstract / raises NotImplementedError), and any exceptions raised. Place
concise, Google-style docstrings directly above the __init__ and launch
definitions referencing the same names (__init__, launch, CloudJobRequest,
CloudJobResult, workspace_root).
massgen/cloud/modal_app.py (3)

61-63: Expand the docstring for the main Modal function.

The current docstring is minimal. Given this is the primary entrypoint, a more detailed docstring would help maintainability.

📝 Proposed expanded docstring
 `@app.function`(image=image, timeout=60 * 60, volumes={CONTEXT_MOUNT_PATH: context_vol})
 def run_massgen_job(payload_b64: str) -> dict:
-    """Local entrypoint for Modal job."""
+    """Modal entrypoint for running a MassGen cloud job.
+
+    Args:
+        payload_b64: Base64-encoded JSON payload containing prompt,
+            config_yaml, timeout_seconds, output_filename, and env.
+
+    Returns:
+        Dict with status, final_answer, remote_log_dir, and
+        artifacts_tar_gz_b64 (base64-encoded tarball of artifacts).
+    """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 61 - 63, Expand the docstring for
the run_massgen_job function to a full, descriptive summary: state that it is
the Modal app entrypoint invoked in the containerized job (decorated by
`@app.function`), describe the payload_b64 parameter (base64-encoded payload
structure and expected keys), document the return value (dictionary shape and
important keys), list possible exceptions or error conditions raised by the
function, and note side effects/environmental expectations (use of
CONTEXT_MOUNT_PATH, context_vol, image, timeout and any required environment
variables or mounted volumes). Keep it concise but explicit so future readers
know what the function expects, produces, and depends on.

23-35: Add Google-style docstrings to helper functions.

Both parse_automation_value and make_tar_gz_b64 lack docstrings.

📝 Proposed docstrings
 def parse_automation_value(label: str, text: str) -> str | None:
+    """Extract a labeled value from text output.
+
+    Args:
+        label: The label prefix to search for (e.g., "LOG_DIR").
+        text: The text to search within.
+
+    Returns:
+        The extracted value if found, otherwise None.
+    """
     pattern = re.compile(rf"^{re.escape(label)}:\s*(.+)$", re.MULTILINE)
     match = pattern.search(text)
     return match.group(1).strip() if match else None


 def make_tar_gz_b64(source_dir: Path) -> str:
+    """Create a base64-encoded gzipped tarball of a directory.
+
+    Args:
+        source_dir: The directory to archive.
+
+    Returns:
+        Base64-encoded string of the tar.gz archive.
+    """
     buf = io.BytesIO()
     with tarfile.open(fileobj=buf, mode="w:gz") as tar:

As per coding guidelines: "For new or changed functions, include Google-style docstrings"
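
For reference, the helper shown in the diff above behaves like this. The function body is reproduced from the diff so the example is runnable on its own; the sample output text is invented for illustration:

```python
# Self-contained usage check for parse_automation_value, repeating the
# regex logic shown in the proposed-docstring diff above.
import re

def parse_automation_value(label: str, text: str):
    """Extract a labeled value from multiline text, or None if absent."""
    pattern = re.compile(rf"^{re.escape(label)}:\s*(.+)$", re.MULTILINE)
    match = pattern.search(text)
    return match.group(1).strip() if match else None

output = "starting job\nLOG_DIR: /tmp/logs/job_1\ndone"
print(parse_automation_value("LOG_DIR", output))  # → /tmp/logs/job_1
print(parse_automation_value("MISSING", output))  # → None
```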

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 23 - 35, Add Google-style docstrings
to the two helper functions: parse_automation_value and make_tar_gz_b64. For
parse_automation_value, document purpose (extract labeled value from multiline
text), parameters (label: str, text: str), return (str | None) and behavior when
no match is found; for make_tar_gz_b64, document purpose (create gzipped tarball
of a directory and return base64 string), parameters (source_dir: Path), return
(str) and mention that files are added with relative paths and the function
reads into memory (may raise I/O errors). Place the docstrings immediately above
each function using Google style (short summary, blank line, Args:, Returns:,
and optional Raises:).

78-81: Static analysis flagged /tmp/massgen_cloud_job as potentially insecure.

While this is flagged by Ruff (S108), the usage is acceptable here since Modal containers are ephemeral and isolated. Consider adding a brief comment to suppress the lint warning and document this intentional choice.

💡 Suppress lint warning with comment
-    workspace = Path("/tmp/massgen_cloud_job")
+    workspace = Path("/tmp/massgen_cloud_job")  # noqa: S108 - ephemeral Modal container
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 78 - 81, The hard-coded
Path("/tmp/massgen_cloud_job") assignment is flagged by Ruff S108; add a brief
inline comment immediately above or on the same line as the workspace
declaration to suppress and document the lint warning (e.g., a "# nosec: S108 —
Modal containers are ephemeral and isolated, so using /tmp is intentional"
comment). Reference the workspace variable and the Path(...) expression when
making the change so the linter suppression and rationale are colocated with the
code that creates the temporary directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/cli.py`:
- Around line 9166-9176: The cloud branch calls _run_cloud_job() too early so
shared preprocessing (inject_prompt_context_paths() and the effective
config/mode handling in main that updates orchestrator.context_paths and flags
like --single-agent) is skipped; move the if getattr(args, "cloud", False) block
(and the call to _run_cloud_job) to execute after inject_prompt_context_paths()
and the shared mode-handling code that produces the effective config, but keep
it before relocate_filesystem_paths(), and preserve the early return after
_run_cloud_job; ensure references to _run_cloud_job,
inject_prompt_context_paths, relocate_filesystem_paths, main, and
orchestrator.context_paths are updated accordingly.
- Around line 463-471: The type annotations use undefined names Dict and
Optional which break import-time evaluation; update the signatures to use
built-in generics and PEP 604 unions: change _is_single_agent_config(config:
Dict[str, Any]) -> bool to _is_single_agent_config(config: dict[str, Any]) ->
bool, change _run_cloud_job(args: argparse.Namespace, config: Dict[str, Any],
config_path_label: Optional[str]) -> None to use config: dict[str, Any] and
config_path_label: str | None, and also update the other signature that declares
output_path: Path | None to use output_path: Path | None (PEP 604 style) or
ensure it uses the pipe form if currently using Optional; apply these edits to
the function parameter lists for _is_single_agent_config, _run_cloud_job, and
the function that defines output_path so all annotations use built-in
generics/unions instead of Dict/Optional.
- Around line 9939-9944: The exception handler currently does a top-level import
of CloudJobError from massgen.cloud.modal_launcher which can raise ImportError
and mask the original failure from _run_cloud_job(); wrap the import in a
try/except (catch ImportError/Exception) inside the except block so the handler
falls back to the generic error path if the import fails, and only perform
isinstance(e, CloudJobError) and the timeout string check when the import
succeeded (otherwise use the existing generic error handling/exit codes).

In `@massgen/cloud/cloud_job.py`:
- Around line 1-83: Pre-commit hooks (pyupgrade/autoflake) changed this file;
run the project's pre-commit suite and commit the resulting edits so CI passes.
Specifically, run `pre-commit run --all-files` locally, review and stage the
modifications that pre-commit applies to cloud_job.py (changes commonly affect
imports/usages in classes/functions like CloudJobLauncher, CloudJobRequest,
CloudJobResult, _extract_marker_payload, and collect_cloud_env), then commit and
push the updated file.
- Around line 67-73: The _extract_marker_payload method currently calls
json.loads on the extracted raw payload and will propagate a JSONDecodeError for
malformed JSON; modify _extract_marker_payload to catch json.JSONDecodeError
(and optionally ValueError) around json.loads and return None when parsing
fails, preserving the existing behavior used when the marker is not found;
reference the class constant RESULT_MARKER and ensure the exception handling is
limited to the json.loads call so other errors still surface.
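
The defensive parsing asked for above can be sketched as follows: malformed JSON after the marker behaves the same as a missing marker, returning None. The RESULT_MARKER value here is a placeholder, not the PR's actual constant:

```python
# Hedged sketch of _extract_marker_payload with the JSONDecodeError
# handling the comment above requests. RESULT_MARKER is a placeholder.
import json
from typing import Optional

RESULT_MARKER = "MASSGEN_RESULT::"

def extract_marker_payload(stdout_text: str) -> Optional[dict]:
    for line in stdout_text.splitlines():
        if line.startswith(RESULT_MARKER):
            raw = line[len(RESULT_MARKER):]
            try:
                # Only json.loads is guarded, so unrelated errors surface.
                return json.loads(raw)
            except json.JSONDecodeError:
                return None  # malformed payload behaves like "no marker"
    return None
```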

In `@massgen/cloud/modal_app.py`:
- Around line 138-140: The code writes raw stdout_text into events.jsonl which
can produce invalid JSONL; modify the write to either (a) filter stdout_text
into valid JSONL by splitting stdout_text into lines and keeping only lines that
parse with json.loads (or match expected event prefix) then write those to
(artifact_root / "events.jsonl"), or (b) if you intend to keep raw output,
rename the target to a clearly named file like events_raw.log or stdout.jsonl;
update the write call that currently uses stdout_text and the symbols
artifact_root and stdout_text accordingly and ensure invalid JSON lines are
excluded before writing to events.jsonl.
- Line 68: The parsed timeout value int(payload.get("timeout_seconds", 1800)) is
currently unused; either remove that dead extraction line or wire the parsed
value into the function decorator that currently hardcodes timeout=60 * 60 so
the payload controls the timeout. Locate the timeout parsing (timeout_seconds)
and the decorator with timeout=60 * 60 in modal_app.py and either (a) assign the
parsed value to a variable (e.g., timeout_seconds) and replace the decorator's
timeout argument with that variable, or (b) delete the unused int(...) line if
the hardcoded decorator timeout is intended.

In `@massgen/cloud/modal_launcher.py`:
- Around line 26-41: The payload is being passed on argv via payload_b64
(variable payload_b64) to the subprocess cmd calling modal run
massgen/cloud/modal_app.py::run_massgen_job, which exposes secrets and can hit
argv size limits; instead, serialize payload to a secure non-argv transport and
pass a handle: write payload (the dict currently assigned to payload) to a
secure temporary file with restrictive permissions (e.g.,
tempfile.NamedTemporaryFile with delete=False and mode=0o600), pass the file
path to modal run using a new flag like --payload-file (replace payload_b64 in
cmd), and ensure the child (run_massgen_job in modal_app.py) reads and deletes
that temp file; alternatively use Modal's secrets API to store env and have
run_massgen_job read secrets rather than receiving env via argv—update both
modal_launcher.py (replace payload_b64 usage) and modal_app.py (add
file/secret-based loading logic).
- Around line 35-41: The command is using a repository-relative path
"massgen/cloud/modal_app.py::run_massgen_job" which fails when not run from the
repo root; update the entry in the cmd list in modal_launcher.py to reference
the installed package module instead (e.g.
"massgen.cloud.modal_app::run_massgen_job") so Modal can resolve the handler
from the installed package; if Modal requires a filesystem path instead, use
importlib.resources or importlib.util to locate the installed package file and
substitute its absolute path into the cmd element rather than the relative path.
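
The file-based transport suggested in the first comment above could look roughly like this on the launcher side. The --payload-file flag is the comment's proposal for the PR's own entrypoint, not an existing Modal CLI option:

```python
# Hedged sketch: write the payload to an owner-only temp file and pass its
# path instead of a base64 blob on argv, per the comment above.
import json
import os
import tempfile

def write_payload_file(payload: dict) -> str:
    # mkstemp creates the file with mode 0o600 on POSIX, so secrets in
    # `payload` are not world-readable and never appear in argv or `ps`.
    fd, path = tempfile.mkstemp(prefix="massgen_payload_", suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    return path

# The launcher would then build something like (hypothetical flag):
#   cmd = ["modal", "run", "massgen/cloud/modal_app.py::run_massgen_job",
#          "--payload-file", write_payload_file(payload)]
# and run_massgen_job would read and delete the file on the remote side.
```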

In `@massgen/cloud/utils.py`:
- Around line 25-28: The extract_artifacts function currently calls
tarfile.extractall which is unsafe; update extract_artifacts to validate members
from tf.getmembers() before extracting: for each TarInfo ensure the resolved
destination path (out_dir.resolve() joined with member.name) starts with
out_dir.resolve() (to prevent path traversal) and only allow regular files and
directories (skip symlinks, hard links, device nodes—check
TarInfo.isfile()/TarInfo.isdir() and
TarInfo.issym()/TarInfo.islnk()/TarInfo.isdev()). For allowed dirs create
directories and for files use tf.extractfile() to stream and write them to the
validated path; do not call tarfile.extractall. Use the existing symbols
extract_artifacts, tf (the TarFile), and out_dir to locate the change.
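
A sketch of the member validation described above: each destination path is resolved under out_dir, and only regular files and directories are extracted (symlinks, hard links, and device nodes are skipped). This is a simplified illustration, not the PR's implementation:

```python
# Hedged sketch of safe extraction per the comment above: no extractall,
# explicit path-traversal checks, and only plain files/dirs are written.
import tarfile
from pathlib import Path

def safe_extract(tf: tarfile.TarFile, out_dir: Path) -> None:
    root = out_dir.resolve()
    for member in tf.getmembers():
        dest = (root / member.name).resolve()
        if not dest.is_relative_to(root):
            continue  # path traversal attempt, e.g. "../../etc/passwd"
        if member.isdir():
            dest.mkdir(parents=True, exist_ok=True)
        elif member.isfile():
            dest.parent.mkdir(parents=True, exist_ok=True)
            src = tf.extractfile(member)
            if src is not None:
                with open(dest, "wb") as out:
                    out.write(src.read())
        # symlinks, hard links, and device nodes are skipped deliberately
```

On Python 3.12+ the stdlib's `tarfile` filter feature (`tf.extractall(filter="data")`) covers much of this, but the explicit loop makes the policy visible.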

In `@massgen/tests/test_cloud_modal_launcher.py`:
- Line 10: Replace the fragile `@patch`("modal.Volume.from_name") usage with the
same monkeypatch-based sys.modules["modal"] stub pattern used elsewhere in this
test file: create a fake modal module object and assign it to
sys.modules["modal"] in the test setup, implement a stub Volume class with a
from_name classmethod (or a mock attribute named Volume.from_name) that returns
the expected fake value, and remove the patch decorator so the test no longer
requires the real modal package; update the test function that referenced
Volume.from_name to use the stubbed behavior and ensure teardown/monkeypatch
cleanup is used so other tests remain isolated.
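
The sys.modules stub pattern the last comment asks for can be sketched as follows; the helper name and from_name return value are illustrative, and monkeypatch is pytest's fixture (its setitem restores sys.modules after the test, keeping other tests isolated):

```python
# Hedged sketch of stubbing the modal package via sys.modules, so tests
# run without the real modal dependency. Names here are illustrative.
import sys
import types

def install_fake_modal(monkeypatch, fake_volume):
    fake_modal = types.ModuleType("modal")

    class Volume:
        @classmethod
        def from_name(cls, name, **kwargs):
            return fake_volume

    fake_modal.Volume = Volume
    # pytest's monkeypatch.setitem undoes this entry on teardown.
    monkeypatch.setitem(sys.modules, "modal", fake_modal)
    return fake_modal
```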

---

Nitpick comments:
In `@docs/source/user_guide/cloud.rst`:
- Around line 22-30: Add a short, runnable success transcript to
docs/source/user_guide/cloud.rst that follows the existing example command
(massgen --cloud --config config.yaml "Your question") and shows the terminal
output that contains the generated job id; specifically include the line from
the tool that prints the created job id (so readers can copy that id) and a
final line demonstrating the concrete artifact path
.massgen/cloud_jobs/job_<job_id>/artifacts/ using that printed id. Ensure the
transcript is realistic, minimal, and clearly labels which line is the job id so
users can locate their artifacts.

In `@massgen/cli.py`:
- Around line 463-464: The new cloud helper functions (e.g.,
_is_single_agent_config(config: Dict[str, Any]) -> bool and the other helper
added in this change) only have summary lines; update each to use Google-style
docstrings including an Args section documenting parameters and types (for
_is_single_agent_config: config: Dict[str, Any]), a Returns section describing
the return type and meaning (bool indicating whether exactly one agent is
defined), and a short description line—apply the same pattern to the other
new/changed cloud helper(s) in this diff so all new/changed Python functions
follow the repo’s Google-style docstring convention.

In `@massgen/cloud/cloud_job.py`:
- Around line 59-65: Add Google-style docstrings for the CloudJob class's
__init__ and launch methods: for __init__ document parameters (workspace_root:
Optional[Path]), describe default behavior (using Path.cwd() / ".massgen" /
"cloud_jobs"), side effects (creating the directory via
self.workspace_root.mkdir(parents=True, exist_ok=True)), and set the instance
attribute self.workspace_root; for launch document the parameter type (request:
CloudJobRequest), the return type (CloudJobResult), expected behavior (abstract
/ raises NotImplementedError), and any exceptions raised. Place concise,
Google-style docstrings directly above the __init__ and launch definitions
referencing the same names (__init__, launch, CloudJobRequest, CloudJobResult,
workspace_root).

In `@massgen/cloud/modal_app.py`:
- Around line 61-63: Expand the docstring for the run_massgen_job function to a
full, descriptive summary: state that it is the Modal app entrypoint invoked in
the containerized job (decorated by `@app.function`), describe the payload_b64
parameter (base64-encoded payload structure and expected keys), document the
return value (dictionary shape and important keys), list possible exceptions or
error conditions raised by the function, and note side effects/environmental
expectations (use of CONTEXT_MOUNT_PATH, context_vol, image, timeout and any
required environment variables or mounted volumes). Keep it concise but explicit
so future readers know what the function expects, produces, and depends on.
- Around line 23-35: Add Google-style docstrings to the two helper functions:
parse_automation_value and make_tar_gz_b64. For parse_automation_value, document
purpose (extract labeled value from multiline text), parameters (label: str,
text: str), return (str | None) and behavior when no match is found; for
make_tar_gz_b64, document purpose (create gzipped tarball of a directory and
return base64 string), parameters (source_dir: Path), return (str) and mention
that files are added with relative paths and the function reads into memory (may
raise I/O errors). Place the docstrings immediately above each function using
Google style (short summary, blank line, Args:, Returns:, and optional Raises:).
- Around line 78-81: The hard-coded Path("/tmp/massgen_cloud_job") assignment is
flagged by Ruff S108; add a brief inline comment immediately above or on the
same line as the workspace declaration to suppress and document the lint warning
(e.g., a "# nosec: S108 — Modal containers are ephemeral and isolated, so using
/tmp is intentional" comment). Reference the workspace variable and the
Path(...) expression when making the change so the linter suppression and
rationale are colocated with the code that creates the temporary directory.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 30ae5bce-9a6c-4429-a0b1-432bd0653716

📥 Commits

Reviewing files that changed from the base of the PR and between b388ff4 and 4e3d6a1.

📒 Files selected for processing (9)
  • docs/source/user_guide/cloud.rst
  • massgen/cli.py
  • massgen/cloud/__init__.py
  • massgen/cloud/cloud_job.py
  • massgen/cloud/modal_app.py
  • massgen/cloud/modal_launcher.py
  • massgen/cloud/utils.py
  • massgen/tests/test_cloud_modal_launcher.py
  • requirements.txt

Comment on lines +9939 to +9944
# Keep cloud-specific timeout mapping distinct from generic execution failures.
from .cloud.modal_launcher import CloudJobError

if isinstance(e, CloudJobError) and "timeout" in str(e).lower():
    print(f"❌ Timeout error: {e}", flush=True)
    sys.exit(EXIT_TIMEOUT)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and its size
wc -l massgen/cli.py

Repository: massgen/MassGen

Length of output: 79


🏁 Script executed:

# Look at the specific lines mentioned in the review
sed -n '9930,9950p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 931


🏁 Script executed:

# Get broader context - look at the exception handler that contains these lines
sed -n '9900,9960p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 2800


🏁 Script executed:

# Find the function containing this exception handler
sed -n '9700,9945p' massgen/cli.py | head -100

Repository: massgen/MassGen

Length of output: 4695


🏁 Script executed:

# Search for other imports of CloudJobError or modal_launcher in the file
rg -n "from .cloud.modal_launcher|CloudJobError" massgen/cli.py

Repository: massgen/MassGen

Length of output: 261


🏁 Script executed:

# Check if there are any conditional imports or try/except blocks around modal_launcher imports
rg -n "modal_launcher|ModalCloud" massgen/cli.py -B 2 -A 2

Repository: massgen/MassGen

Length of output: 922


🏁 Script executed:

# Find the function containing the exception handler with line 9939-9944
sed -n '9800,9945p' massgen/cli.py | grep -n "^async def\|^def " | tail -1

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Get more context to find the function name
sed -n '9850,9870p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 1065


🏁 Script executed:

# Check the structure of cloud.modal_launcher module
head -50 massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 1566


🏁 Script executed:

# Search for CloudJobError definition and its dependencies
rg -n "class CloudJobError|^CloudJobError" massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Check if CloudJobError depends on Modal or has conditional imports
sed -n '1,100p' massgen/cloud/modal_launcher.py | rg -n "import|CloudJobError"

Repository: massgen/MassGen

Length of output: 699


🏁 Script executed:

# Find function name containing lines 9939-9944
sed -n '9800,9945p' massgen/cli.py | grep -E "^async def|^def " | tail -5

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Get better context by looking around line 9900
sed -n '9895,9950p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 2586


🏁 Script executed:

# Check what cloud_job.py exports/defines
head -100 massgen/cloud/cloud_job.py

Repository: massgen/MassGen

Length of output: 2324


🏁 Script executed:

# Verify if CloudJobError is actually re-exported from modal_launcher or just imported
rg -n "CloudJobError" massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 526


🏁 Script executed:

# Find the function name containing the exception handler
grep -n "^async def\|^def " massgen/cli.py | grep -B5 "9939"

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Look for _run_cloud_job function
rg -n "def _run_cloud_job" massgen/cli.py

Repository: massgen/MassGen

Length of output: 174


🏁 Script executed:

# Check what happens around the _run_cloud_job call
sed -n '470,530p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 2590


🏁 Script executed:

# Verify the full exception handler context to ensure the review comment is accurate
sed -n '9935,9950p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 739


🏁 Script executed:

# Find where _run_cloud_job is called and verify it's in the try block
rg -n "_run_cloud_job" massgen/cli.py | head -20

Repository: massgen/MassGen

Length of output: 207


🏁 Script executed:

# Check the full try/except context by looking at larger range
sed -n '9700,9950p' massgen/cli.py | tail -100

Repository: massgen/MassGen

Length of output: 4510


🏁 Script executed:

# Verify if CloudJobError is imported anywhere else at module level
rg -n "^from.*CloudJobError|^import.*CloudJobError" massgen/cli.py

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Check if cloud_job.py has any complex imports that could fail
head -50 massgen/cloud/cloud_job.py

Repository: massgen/MassGen

Length of output: 1172


🏁 Script executed:

# Check if modal_launcher.py has problematic imports at module level
head -20 massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 645


🏁 Script executed:

# Verify what exceptions could be raised from _run_cloud_job
sed -n '471,530p' massgen/cli.py | grep -E "raise|CloudJobError"

Repository: massgen/MassGen

Length of output: 224


🏁 Script executed:

# Check what the try/except block contains at line 9171
sed -n '9100,9180p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 3587


🏁 Script executed:

# Check what imports could fail in modal_launcher and its dependencies
cat massgen/cloud/__init__.py 2>/dev/null || echo "File not found"

Repository: massgen/MassGen

Length of output: 470


🏁 Script executed:

# Verify the full import chain
rg -n "import.*modal|from.*modal" massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 41


🏁 Script executed:

# Check if there are any optional/conditional imports in modal_launcher
sed -n '1,50p' massgen/cloud/modal_launcher.py

Repository: massgen/MassGen

Length of output: 1566


🏁 Script executed:

# Verify the exact exception handler context to understand what exceptions could occur
sed -n '9160,9180p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 860


🏁 Script executed:

# Check if CloudJobError is re-exported from modal_launcher or if it needs to be imported from cloud_job
python3 <<'EOF'
import sys
sys.path.insert(0, '/dev/null')
# Try to understand the module structure
with open('massgen/cloud/modal_launcher.py') as f:
    content = f.read()
    if 'CloudJobError' in content:
        # Find where it's used
        lines = content.split('\n')
        for i, line in enumerate(lines[:50], 1):
            if 'CloudJobError' in line:
                print(f"Line {i}: {line}")
EOF

Repository: massgen/MassGen

Length of output: 155


🏁 Script executed:

# Check if utils.py has any problematic imports
head -30 massgen/cloud/utils.py

Repository: massgen/MassGen

Length of output: 981


🏁 Script executed:

# Look for any existing try/except blocks around cloud imports to understand best practices
rg -n "try:" massgen/cloud/ -A 10 | head -50

Repository: massgen/MassGen

Length of output: 813


🏁 Script executed:

# Verify if there are any other similar patterns where imports happen inside exception handlers
rg -n "except.*as.*:\s*from" massgen/ -A 2

Repository: massgen/MassGen

Length of output: 50374


Guard the import inside the exception handler to avoid masking the original error.

If _run_cloud_job() fails due to a missing Modal dependency or an import error from massgen.cloud.modal_launcher, the handler will raise a secondary exception when attempting to import CloudJobError, masking the original failure. Wrap the import in a try/except to ensure the fallback error handling path is reached.

🔧 Suggested fix
-        from .cloud.modal_launcher import CloudJobError
-
-        if isinstance(e, CloudJobError) and "timeout" in str(e).lower():
+        try:
+            from .cloud.modal_launcher import CloudJobError
+        except Exception:  # Don't mask the original failure.
+            CloudJobError = None
+
+        if CloudJobError and isinstance(e, CloudJobError) and "timeout" in str(e).lower():
             print(f"❌ Timeout error: {e}", flush=True)
             sys.exit(EXIT_TIMEOUT)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        # Keep cloud-specific timeout mapping distinct from generic execution failures.
-        from .cloud.modal_launcher import CloudJobError
-
-        if isinstance(e, CloudJobError) and "timeout" in str(e).lower():
-            print(f"❌ Timeout error: {e}", flush=True)
-            sys.exit(EXIT_TIMEOUT)
+        # Keep cloud-specific timeout mapping distinct from generic execution failures.
+        try:
+            from .cloud.modal_launcher import CloudJobError
+        except Exception:  # Don't mask the original failure.
+            CloudJobError = None
+
+        if CloudJobError and isinstance(e, CloudJobError) and "timeout" in str(e).lower():
+            print(f"❌ Timeout error: {e}", flush=True)
+            sys.exit(EXIT_TIMEOUT)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9939 - 9944, The exception handler currently
does a top-level import of CloudJobError from massgen.cloud.modal_launcher which
can raise ImportError and mask the original failure from _run_cloud_job(); wrap
the import in a try/except (catch ImportError/Exception) inside the except block
so the handler falls back to the generic error path if the import fails, and
only perform isinstance(e, CloudJobError) and the timeout string check when the
import succeeded (otherwise use the existing generic error handling/exit codes).
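
In isolation, the guarded-import shape both fixes describe can be sketched like this (the helper name and default module path are illustrative, not code from the PR):

```python
import importlib


def is_cloud_timeout(exc: BaseException, module: str = "massgen.cloud.cloud_job") -> bool:
    """Check whether exc is a cloud timeout without masking import failures.

    If the cloud module itself cannot be imported (e.g. Modal is not
    installed, or the module has a bug), return False so the caller's
    generic error path reports the original exception instead of a
    secondary ImportError.
    """
    try:
        cloud_mod = importlib.import_module(module)
        error_cls = cloud_mod.CloudJobError
    except Exception:
        return False
    return isinstance(exc, error_cls) and "timeout" in str(exc).lower()
```

The caller would then map `True` to `EXIT_TIMEOUT` and otherwise fall through to its existing generic handler.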

Comment on lines +25 to +28
def extract_artifacts(tar_b64: str, out_dir: Path) -> None:
    data = base64.b64decode(tar_b64.encode("utf-8"))
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
        tf.extractall(out_dir)

⚠️ Potential issue | 🔴 Critical

Reject unsafe tar members before extraction.

tarfile.extractall() trusts member names. A crafted artifact bundle can escape out_dir or materialize non-regular entries on the local machine.

🛡️ Proposed fix
 def extract_artifacts(tar_b64: str, out_dir: Path) -> None:
     data = base64.b64decode(tar_b64.encode("utf-8"))
-    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
-        tf.extractall(out_dir)
+    out_root = out_dir.resolve()
+    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
+        for member in tf.getmembers():
+            if not (member.isdir() or member.isfile()):
+                raise ValueError(
+                    f"Refusing to extract non-regular artifact member: {member.name}"
+                )
+            target = (out_root / member.name).resolve()
+            try:
+                target.relative_to(out_root)
+            except ValueError as err:
+                raise ValueError(
+                    f"Unsafe artifact path in artifact bundle: {member.name}"
+                ) from err
+            tf.extract(member, out_root)
📝 Committable suggestion


Suggested change
-def extract_artifacts(tar_b64: str, out_dir: Path) -> None:
-    data = base64.b64decode(tar_b64.encode("utf-8"))
-    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
-        tf.extractall(out_dir)
+def extract_artifacts(tar_b64: str, out_dir: Path) -> None:
+    data = base64.b64decode(tar_b64.encode("utf-8"))
+    out_root = out_dir.resolve()
+    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
+        for member in tf.getmembers():
+            if not (member.isdir() or member.isfile()):
+                raise ValueError(
+                    f"Refusing to extract non-regular artifact member: {member.name}"
+                )
+            target = (out_root / member.name).resolve()
+            try:
+                target.relative_to(out_root)
+            except ValueError as err:
+                raise ValueError(
+                    f"Unsafe artifact path in artifact bundle: {member.name}"
+                ) from err
+            tf.extract(member, out_root)
🧰 Tools
🪛 Ruff (0.15.4)

[error] 28-28: Uses of tarfile.extractall()

(S202)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/utils.py` around lines 25 - 28, The extract_artifacts function
currently calls tarfile.extractall which is unsafe; update extract_artifacts to
validate members from tf.getmembers() before extracting: for each TarInfo ensure
the resolved destination path (out_dir.resolve() joined with member.name) starts
with out_dir.resolve() (to prevent path traversal) and only allow regular files
and directories (skip symlinks, hard links, device nodes—check
TarInfo.isfile()/TarInfo.isdir() and
TarInfo.issym()/TarInfo.islnk()/TarInfo.isdev()). For allowed dirs create
directories and for files use tf.extractfile() to stream and write them to the
validated path; do not call tarfile.extractall. Use the existing symbols
extract_artifacts, tf (the TarFile), and out_dir to locate the change.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (3)
massgen/cloud/cloud_job.py (1)

49-55: ⚠️ Potential issue | 🟡 Minor

Add a docstring and handle JSONDecodeError.

This method lacks a Google-style docstring per project guidelines. Additionally, malformed JSON in the marker line will raise JSONDecodeError, which should be caught for consistent "not found" behavior. As per coding guidelines, new functions should include Google-style docstrings.


🛡️ Proposed fix
     @classmethod
     def _extract_marker_payload(cls, stdout: str) -> Optional[Dict[str, object]]:
+        """Extract the JSON payload from the result marker line in stdout.
+
+        Args:
+            stdout: The captured stdout from a cloud job.
+
+        Returns:
+            Parsed JSON dict if found and valid, None otherwise.
+        """
         for line in reversed(stdout.splitlines()):
             if line.startswith(cls.RESULT_MARKER):
                 raw = line[len(cls.RESULT_MARKER) :].strip()
-                return json.loads(raw)
+                try:
+                    return json.loads(raw)
+                except json.JSONDecodeError:
+                    return None
         return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/cloud_job.py` around lines 49 - 55, Add a Google-style
docstring to the classmethod _extract_marker_payload describing parameters,
return value, and behavior, and update the implementation to catch
json.JSONDecodeError when parsing the marker payload so malformed JSON returns
None instead of raising; keep the existing loop over lines and the use of
cls.RESULT_MARKER and json.loads(raw) but wrap json.loads(raw) in a try/except
that logs or silently handles JSONDecodeError and returns None for that marker
(preserving current "not found" behavior).
massgen/cli.py (2)

9508-9518: ⚠️ Potential issue | 🟠 Major

--single-agent is still bypassed on cloud runs.

This return still happens before the later filter_agents_for_single_mode() step, so massgen --cloud --single-agent ... --config multi_agent.yaml is evaluated against the unfiltered config and gets rejected even though the local path would narrow it to one agent. Normalize the config to the selected agent before _run_cloud_job(), or reject that combination explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9508 - 9518, The cloud path returns before
filtering for --single-agent, so calls to _run_cloud_job(args, config,
config_path_label) receive the unfiltered multi-agent config and get rejected;
either apply filter_agents_for_single_mode(args, config) (or equivalent
normalization) to produce a config narrowed to the selected agent and pass that
into _run_cloud_job, or explicitly validate/reject the combination when
args.single_agent is set; update the block that checks getattr(args, "cloud",
False) to run the same filter_agents_for_single_mode logic (using args and
resolved_path/context) before calling _run_cloud_job or return an explicit error
if filtering cannot be applied.

9945-9950: ⚠️ Potential issue | 🟠 Major

Don't let the timeout-path import hide the original cloud failure.

If _run_cloud_job() failed because massgen.cloud.modal_launcher itself could not import, this secondary import raises a new exception and masks the real error. Import from a safe module or guard the import before the isinstance() check.

🔧 Suggested fix
-        from .cloud.modal_launcher import CloudJobError
-
-        if isinstance(e, CloudJobError) and "timeout" in str(e).lower():
+        try:
+            from .cloud.cloud_job import CloudJobError
+        except Exception:
+            CloudJobError = None
+
+        if CloudJobError and isinstance(e, CloudJobError) and "timeout" in str(e).lower():
             print(f"❌ Timeout error: {e}", flush=True)
             sys.exit(EXIT_TIMEOUT)

Verify which module safely defines CloudJobError and whether massgen/cloud/modal_launcher.py pulls optional imports at module import time:

#!/bin/bash
set -euo pipefail

echo "== massgen/cloud/modal_launcher.py =="
sed -n '1,80p' massgen/cloud/modal_launcher.py | nl -ba

echo
echo "== massgen/cloud/cloud_job.py =="
sed -n '1,80p' massgen/cloud/cloud_job.py | nl -ba
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9945 - 9950, The timeout-path import of
CloudJobError can mask the original error if massgen.cloud.modal_launcher fails
to import; update the _run_cloud_job() error handling to avoid importing
massgen.cloud.modal_launcher at the failure site: either import CloudJobError
from a safer module that always loads (e.g., the module that actually defines
the exception such as massgen.cloud.cloud_job) at top-level, or wrap the import
in a try/except ImportError and fall back to treating the exception string for
"timeout" when the import fails; ensure you reference and use the CloudJobError
symbol (or the fallback string-check) in the existing isinstance(e,
CloudJobError) / timeout-check logic so the original import error is not masked.
🧹 Nitpick comments (4)
massgen/cloud/modal_launcher.py (1)

54-57: Replace assert with explicit checks for production robustness.

Using assert for runtime checks can be disabled with python -O. While subprocess.Popen with PIPE should always populate these attributes, explicit checks are safer for production code.

♻️ Proposed fix
         def _drain_stderr():
-            assert proc.stderr is not None
+            if proc.stderr is None:
+                return
             for line in proc.stderr:
                 stderr_lines.append(line)
-        assert proc.stdout is not None
+        if proc.stdout is None:
+            raise CloudJobError("Failed to capture stdout from modal process")
         for line in proc.stdout:

Also applies to: 65-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_launcher.py` around lines 54 - 57, The use of assert in
_drain_stderr should be replaced with an explicit runtime check so the code is
robust when Python is run with optimizations; change the beginning of
_drain_stderr to check if proc.stderr is None and raise a clear RuntimeError (or
handle gracefully) before iterating, and apply the same pattern to the
corresponding stdout drain function (the function handling proc.stdout around
line 65) so both functions explicitly verify proc.stderr/proc.stdout and fail
with a descriptive error instead of relying on assert.
massgen/cli.py (1)

463-472: Use Google-style docstrings for the new cloud helpers.

Both new functions only have summary lines; the repo standard for changed Python functions asks for Args/Returns sections.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 463 - 472, The docstrings for
_is_single_agent_config and _run_cloud_job are missing Google-style Args/Returns
sections; update each function to use a full Google-style docstring: include a
short summary, an Args section describing parameters (e.g., config: dict[str,
Any], args: argparse.Namespace, config_path_label: str | None), and a Returns
section indicating the return type (bool for _is_single_agent_config and None
for _run_cloud_job), and any Raises if applicable, so they conform to the
repository's docstring standard.
massgen/cloud/modal_app.py (1)

22-34: Add Google-style docstrings to the new cloud helpers.

These functions are new, but the helpers have no function docstrings and run_massgen_job still uses a short sentence docstring rather than the repo’s required Google-style format.

As per coding guidelines, "For new or changed functions, include Google-style docstrings".

Also applies to: 66-67

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 22 - 34, Add Google-style docstrings
to the new helper functions and the nearby run_massgen_job function: update
parse_automation_value(label: str, text: str) -> str | None with a docstring
describing purpose, Args (label, text), Returns (matched value or None), and any
Exceptions; update make_tar_gz_b64(source_dir: Path) -> str with a docstring
describing purpose, Args (source_dir), Returns (base64 gzipped tar string), and
side-effects (reads files); and replace the short sentence docstring on
run_massgen_job (lines ~66-67) with a full Google-style docstring including
Args, Returns, and Raises as appropriate so all three functions follow the
repo’s docstring style.
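
For reference, one shape that satisfies the convention these comments cite; the implementation below is an illustrative guess at `parse_automation_value`'s behavior, not code from the PR:

```python
from typing import Optional


def parse_automation_value(label: str, text: str) -> Optional[str]:
    """Extract the value of an automation marker line from captured output.

    Args:
        label: Marker name to look for, e.g. "LOG_DIR".
        text: Multiline text (typically captured stderr) to scan.

    Returns:
        The value following "<label>=" on the first matching line, or
        None if no line carries the marker.
    """
    prefix = f"{label}="
    for line in text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None
```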
massgen/tests/test_cloud_modal_launcher.py (1)

16-34: Align the new test helpers with the repo docstring convention.

The new helpers/tests either omit docstrings or use short sentence docstrings instead of the required Google-style format.

As per coding guidelines, "For new or changed functions, include Google-style docstrings".

Also applies to: 37-38, 71-72, 105-106

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/tests/test_cloud_modal_launcher.py` around lines 16 - 34, The helper
functions in this test module lack Google-style docstrings; update
_make_artifact_payload and _make_mock_popen (and all other new/changed test
helpers in this file) to use full Google-style docstrings that include a short
summary line, a blank line, and sections for Args/Returns (and Raises if
applicable), describing parameters like stdout_lines/stderr_lines/returncode and
the returned value; ensure the docstrings replace the current short sentence or
omitted docstrings for the helpers around the same area so they follow the repo
convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/cli.py`:
- Around line 463-468: The helper _is_single_agent_config currently treats any
presence of the "agent" key as single-agent; change it to only return True when
the top-level form is exactly one agent: require that config has an "agent" key
whose value is a dict AND does NOT have an "agents" key; otherwise, fall back to
checking "agents" is a list of length 1 whose single element is a dict. In
short: in _is_single_agent_config, validate both that config["agent"] is a dict
and that "agents" is absent, and when using the "agents" form ensure it's a list
with exactly one dict element so mixed or invalid types are rejected.
- Around line 490-497: The current rewrite only updates
orchestrator.context_paths; also locate and rewrite per-agent
backend.context_paths before launching the cloud job by calling
process_context_paths on each agent's backend.context_paths (the same way
orchestrator_cfg uses process_context_paths) so host-local mounts are converted
for remote runs; specifically update any agent dicts produced/consumed by
create_agents_from_config to replace agent["backend"]["context_paths"] with the
rewritten paths when present, preserving the existing cloud_job_id parameter and
only setting the field if process_context_paths returns non-empty rewritten
paths.

In `@massgen/cloud/modal_app.py`:
- Around line 139-143: The current logic reads LOG_DIR from stderr via
parse_automation_value and blindly copies src_log_dir into artifact_root,
allowing arbitrary filesystem exfiltration; change this so either (preferred)
you validate and restrict the path to a workspace-owned root before copying or
(alternative) stop copying and instead emit the LOG_DIR value into the
structured result channel. Concretely: keep parse_automation_value("LOG_DIR",
stderr_text) but if log_dir is set, resolve it (use Path(log_dir).resolve()) and
verify it is inside a dedicated workspace log root (e.g., workspace_log_root
variable) using is_relative_to or by comparing parents; only then perform
shutil.copytree to artifact_root / "log_dir". If the resolved path is outside
the workspace_log_root, do not copy and instead append the log_dir path to the
job result/metadata (structured result channel) for operator review. Reference
symbols: parse_automation_value, log_dir, src_log_dir, artifact_root.
- Around line 72-85: The code accepts output_filename from payload and joins it
to workspace, allowing absolute paths or ../ traversal to escape
/tmp/massgen_cloud_job; validate and reject unsafe names before creating
output_file: (1) read output_filename from payload, ensure it is not absolute
and does not contain path separators or .. segments (e.g., compare
Path(output_filename).name to the original or reject if
os.path.isabs(output_filename) or ".." in parts), (2) build the candidate path
with workspace / output_filename and verify workspace.resolve() is a prefix of
candidate.resolve() (or otherwise ensure
candidate.resolve().startswith(workspace.resolve())), and (3) raise or replace
with a safe default if validation fails; update references to output_filename,
workspace, and output_file accordingly.
- Around line 41-56: Replace pip_install("massgen") in the Modal image builder
(the image variable) with adding the local source so the job runs the PR code:
use image.add_local_python_source() or ensure the subprocess invokes the local
package via python -m massgen.cli instead of relying on the published package;
validate and update the Modal image construction code accordingly. For the path
traversal issue, sanitize and validate output_filename before joining into the
workspace (reject path separators and any '..' components, allow only a safe
basename and a whitelist of extensions) where the subprocess writes files. For
the artifact exfiltration risk, stop blindly using the parsed LOG_DIR from
subprocess stderr — either remove the automatic artifact copy or strictly
validate that LOG_DIR is a subdirectory of the intended workspace (resolve and
compare realpaths and refuse paths outside workspace or symlinks). Finally, add
Google-style docstrings to the two newly added functions near the top of the
file to satisfy coding guidelines.

In `@massgen/cloud/modal_launcher.py`:
- Around line 71-74: The JSON parsing block is currently raising CloudJobError
without preserving the original JSONDecodeError chain; modify the except clause
to capture the exception (e.g., "except json.JSONDecodeError as err:") and
re-raise the CloudJobError using "raise CloudJobError(f'Failed to parse cloud
job result JSON: {raw}') from err" so the traceback for the original error
(json.JSONDecodeError) is preserved; locate this change around the
marker_payload = json.loads(raw) logic in modal_launcher.py.
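
The `output_filename` checks described in the items above can be sketched as a small helper (the name `safe_output_path` is ours, not the PR's):

```python
from pathlib import Path, PurePosixPath


def safe_output_path(workspace: Path, output_filename: str) -> Path:
    """Join an untrusted filename onto workspace, rejecting escapes.

    Args:
        workspace: Directory the output must stay inside.
        output_filename: Untrusted name taken from the job payload.

    Returns:
        The validated absolute path inside workspace.

    Raises:
        ValueError: If the name is absolute, has multiple path
            components, or resolves outside the workspace.
    """
    name = PurePosixPath(output_filename)
    # Reject absolute paths, multi-component paths, and "", ".", "..".
    if name.is_absolute() or len(name.parts) != 1 or name.name in ("", ".", ".."):
        raise ValueError(f"Unsafe output filename: {output_filename!r}")
    candidate = (workspace / name.name).resolve()
    # Resolving both sides also defuses symlinked workspace directories.
    if candidate.parent != workspace.resolve():
        raise ValueError(f"Output path escapes workspace: {output_filename!r}")
    return candidate
```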

---

Duplicate comments:
In `@massgen/cli.py`:
- Around line 9508-9518: The cloud path returns before filtering for
--single-agent, so calls to _run_cloud_job(args, config, config_path_label)
receive the unfiltered multi-agent config and get rejected; either apply
filter_agents_for_single_mode(args, config) (or equivalent normalization) to
produce a config narrowed to the selected agent and pass that into
_run_cloud_job, or explicitly validate/reject the combination when
args.single_agent is set; update the block that checks getattr(args, "cloud",
False) to run the same filter_agents_for_single_mode logic (using args and
resolved_path/context) before calling _run_cloud_job or return an explicit error
if filtering cannot be applied.
- Around line 9945-9950: The timeout-path import of CloudJobError can mask the
original error if massgen.cloud.modal_launcher fails to import; update the
_run_cloud_job() error handling to avoid importing massgen.cloud.modal_launcher
at the failure site: either import CloudJobError from a safer module that always
loads (e.g., the module that actually defines the exception such as
massgen.cloud.cloud_job) at top-level, or wrap the import in a try/except
ImportError and fall back to treating the exception string for "timeout" when
the import fails; ensure you reference and use the CloudJobError symbol (or the
fallback string-check) in the existing isinstance(e, CloudJobError) /
timeout-check logic so the original import error is not masked.

In `@massgen/cloud/cloud_job.py`:
- Around line 49-55: Add a Google-style docstring to the classmethod
_extract_marker_payload describing parameters, return value, and behavior, and
update the implementation to catch json.JSONDecodeError when parsing the marker
payload so malformed JSON returns None instead of raising; keep the existing
loop over lines and the use of cls.RESULT_MARKER and json.loads(raw) but wrap
json.loads(raw) in a try/except that logs or silently handles JSONDecodeError
and returns None for that marker (preserving current "not found" behavior).

---

Nitpick comments:
In `@massgen/cli.py`:
- Around line 463-472: The docstrings for _is_single_agent_config and
_run_cloud_job are missing Google-style Args/Returns sections; update each
function to use a full Google-style docstring: include a short summary, an Args
section describing parameters (e.g., config: dict[str, Any], args:
argparse.Namespace, config_path_label: str | None), and a Returns section
indicating the return type (bool for _is_single_agent_config and None for
_run_cloud_job), and any Raises if applicable, so they conform to the
repository's docstring standard.

In `@massgen/cloud/modal_app.py`:
- Around line 22-34: Add Google-style docstrings to the new helper functions and
the nearby run_massgen_job function: update parse_automation_value(label: str,
text: str) -> str | None with a docstring describing purpose, Args (label,
text), Returns (matched value or None), and any Exceptions; update
make_tar_gz_b64(source_dir: Path) -> str with a docstring describing purpose,
Args (source_dir), Returns (base64 gzipped tar string), and side-effects (reads
files); and replace the short sentence docstring on run_massgen_job (lines
~66-67) with a full Google-style docstring including Args, Returns, and Raises
as appropriate so all three functions follow the repo’s docstring style.

In `@massgen/cloud/modal_launcher.py`:
- Around line 54-57: The use of assert in _drain_stderr should be replaced with
an explicit runtime check so the code is robust when Python is run with
optimizations; change the beginning of _drain_stderr to check if proc.stderr is
None and raise a clear RuntimeError (or handle gracefully) before iterating, and
apply the same pattern to the corresponding stdout drain function (the function
handling proc.stdout around line 65) so both functions explicitly verify
proc.stderr/proc.stdout and fail with a descriptive error instead of relying on
assert.

In `@massgen/tests/test_cloud_modal_launcher.py`:
- Around line 16-34: The helper functions in this test module lack Google-style
docstrings; update _make_artifact_payload and _make_mock_popen (and all other
new/changed test helpers in this file) to use full Google-style docstrings that
include a short summary line, a blank line, and sections for Args/Returns (and
Raises if applicable), describing parameters like
stdout_lines/stderr_lines/returncode and the returned value; ensure the
docstrings replace the current short sentence or omitted docstrings for the
helpers around the same area so they follow the repo convention.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ec14d1f9-6646-4fc6-894a-9efd6a23b6a5

📥 Commits

Reviewing files that changed from the base of the PR and between 4e3d6a1 and 44d82ca.

📒 Files selected for processing (6)
  • docs/source/user_guide/cloud.rst
  • massgen/cli.py
  • massgen/cloud/cloud_job.py
  • massgen/cloud/modal_app.py
  • massgen/cloud/modal_launcher.py
  • massgen/tests/test_cloud_modal_launcher.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/source/user_guide/cloud.rst

Comment on lines +41 to +56
image = (
    modal.Image.debian_slim(python_version="3.11")
    .pip_install(
        "massgen",
        "fastmcp",
        "uv",
    )
    .apt_install(
        "nodejs",
        "npm",
        "curl",
        "wget",
        "git",
        "build-essential",
    )
)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
fd '^(pyproject\.toml|setup\.py)$' -x sed -n '1,220p' {}
rg -n --type=py 'add_local_python_source|add_local_dir|pip_install\("massgen"'
rg -n --type=py 'massgen\.cli|subprocess\.Popen|cmd = \['

Repository: massgen/MassGen

Length of output: 21405


🏁 Script executed:

cat -n massgen/cloud/modal_app.py | head -150

Repository: massgen/MassGen

Length of output: 5414


🏁 Script executed:

# Check for example docstrings in massgen modules
rg -n 'def ' massgen/cloud/modal_launcher.py -A 2 | head -30

Repository: massgen/MassGen

Length of output: 509


🏁 Script executed:

# Check how output_filename validation is handled elsewhere in the codebase
rg -n 'output_filename|output-file' massgen/cli.py | head -20

Repository: massgen/MassGen

Length of output: 89


🏁 Script executed:

# Get context around output-file handling in cli.py
sed -n '10510,10530p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 909


🏁 Script executed:

# Check how output_file is used after being passed in cli.py
rg -n 'args\.output_file|output_file.*=' massgen/cli.py | head -20

Repository: massgen/MassGen

Length of output: 585


🏁 Script executed:

# Check validation around output_file handling
sed -n '515,525p' massgen/cli.py
sed -n '9585,9600p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 1145


🏁 Script executed:

# Check if there's any path validation helper used elsewhere
rg -n 'resolve\(.*\)|\.relative_to\(' massgen/cloud/modal_app.py

Repository: massgen/MassGen

Length of output: 134


🏁 Script executed:

# Verify the exact subprocess.Popen call to confirm prompt handling
sed -n '87,105p' massgen/cloud/modal_app.py

Repository: massgen/MassGen

Length of output: 438


Fix Modal image build and critical security vulnerabilities in cloud job handling.

This file has three separate critical issues:

  1. Install PR code in Modal image, not published package. The pip_install("massgen") at line 43 installs the published PyPI version, so the subprocess at line 87 will run whatever code is on PyPI, silently skipping all cloud-related changes in this PR. Use add_local_python_source() or call the CLI entrypoint directly via python -m massgen.cli instead.

  2. Path traversal vulnerability at line 85. The output_filename from untrusted payload is directly joined into the workspace path without validation. A malicious client can pass ../../../etc/passwd to write files outside the intended workspace directory. Validate the filename to reject path separators and .. components.

  3. Artifact exfiltration vulnerability at lines 139–143. The code parses LOG_DIR from subprocess stderr and blindly copies it into artifacts. A compromised subprocess can inject a malicious path (e.g., /etc, /root, or symlinks) to exfiltrate arbitrary files. Either validate the parsed directory path strictly or remove this feature.

Missing Google-style docstrings on the new functions at lines 22–26 and 28–34 also violate the coding guidelines.
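A minimal sketch of the validation issues 2 and 3 call for. The helper names safe_output_filename and is_within are illustrative, not existing code in this repo; is_within is the containment check the LOG_DIR finding asks for (resolve real paths, refuse anything outside the workspace):

```python
from pathlib import Path


def safe_output_filename(name: str) -> str:
    """Accept only a plain basename: no separators, no '..' components."""
    if not name or name in {".", ".."} or Path(name).name != name:
        raise ValueError(f"unsafe output filename: {name!r}")
    return name


def is_within(candidate: str, root: str) -> bool:
    """True only if candidate resolves (symlinks included) inside root."""
    candidate_real = Path(candidate).resolve()
    root_real = Path(root).resolve()
    return candidate_real == root_real or root_real in candidate_real.parents
```

Applied here, the worker would call safe_output_filename on the payload's output_filename before joining it into the workspace, and is_within on the parsed LOG_DIR before copying anything into artifacts.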

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 41 - 56, Replace
pip_install("massgen") in the Modal image builder (the image variable) with
adding the local source so the job runs the PR code: use
image.add_local_python_source() or ensure the subprocess invokes the local
package via python -m massgen.cli instead of relying on the published package;
validate and update the Modal image construction code accordingly. For the path
traversal issue, sanitize and validate output_filename before joining into the
workspace (reject path separators and any '..' components, allow only a safe
basename and a whitelist of extensions) where the subprocess writes files. For
the artifact exfiltration risk, stop blindly using the parsed LOG_DIR from
subprocess stderr — either remove the automatic artifact copy or strictly
validate that LOG_DIR is a subdirectory of the intended workspace (resolve and
compare realpaths and refuse paths outside workspace or symlinks). Finally, add
Google-style docstrings to the two newly added functions near the top of the
file to satisfy coding guidelines.

@int-chaos int-chaos changed the title Cloud modal mvp feat: Cloud modal mvp Mar 8, 2026
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (3)
massgen/cli.py (3)

9510-9520: ⚠️ Potential issue | 🟠 Major

Reject unsupported multi-agent configs before entering cloud mode.

--cloud is advertised as a single-agent MVP, but this branch never enforces that contract. Multi-agent configs currently fall through into the cloud path and only fail later with a much less clear error.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9510 - 9520, Before calling _run_cloud_job when
getattr(args, "cloud", False) is true, validate that the provided config is
single-agent; detect multi-agent definitions (e.g., config.agents or
config.get("agents")/config.get("multi_agent") and check length > 1) and abort
early with a clear logger.error and exit/raise (instead of proceeding). Add this
check in the same cloud branch (before enabling args.automation and
_run_cloud_job) so multi-agent configs are rejected immediately with a helpful
message referencing cloud being single-agent only.
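A hedged sketch of that early rejection; the function name and the exact config keys ("agents" as a list) are assumptions about this codebase's YAML shape:

```python
def ensure_single_agent_config(config: dict) -> None:
    """Fail fast when --cloud (single-agent MVP) receives a multi-agent config."""
    agents = config.get("agents") or []
    if len(agents) > 1:
        raise SystemExit(
            "--cloud supports single-agent configs only, but this config "
            f"defines {len(agents)} agents."
        )


# Usage in the cloud branch, before calling _run_cloud_job():
# ensure_single_agent_config(config)
```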

9947-9952: ⚠️ Potential issue | 🟠 Major

Don't let the timeout classifier mask the original cloud failure.

If the original exception came from an import/setup failure in massgen.cloud.modal_launcher, this handler can raise a second exception while importing CloudJobError, and the real cause is lost.

🔧 Suggested fix
-        from .cloud.modal_launcher import CloudJobError
-
-        if isinstance(e, CloudJobError) and "timeout" in str(e).lower():
+        try:
+            from .cloud.modal_launcher import CloudJobError
+        except Exception:
+            CloudJobError = None
+
+        if CloudJobError and isinstance(e, CloudJobError) and "timeout" in str(e).lower():
             print(f"❌ Timeout error: {e}", flush=True)
             sys.exit(EXIT_TIMEOUT)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9947 - 9952, The current handler imports
CloudJobError inside the exception path which can itself raise and hide the
original exception; instead ensure the classification cannot mask the root error
by moving the import of CloudJobError out of the exception handler (import
CloudJobError from massgen.cloud.modal_launcher at module top) or, if that
import may fail, wrap that import in a try/except at module init and set a
sentinel (e.g., CLOUD_JOB_ERROR_AVAILABLE) so the handler only checks
isinstance(e, CloudJobError) when the import succeeded; keep the existing
behavior of printing "❌ Timeout error: {e}" and calling sys.exit(EXIT_TIMEOUT)
only when the sentinel is true and the instance check passes.

480-500: ⚠️ Potential issue | 🟠 Major

Cloud context-path rewriting still has two correctness gaps.

process_context_paths() in massgen/cloud/utils.py restarts its upload index at 0 on every call, so reusing the same cloud_job_id here for orchestrator paths and backend paths can make different entries resolve to the same remote mount. This loop also only rewrites config["agents"], so a valid single-agent config using top-level agent.backend.context_paths still sends host-local paths to Modal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 480 - 500, The orchestrator/agent path rewriting
can produce duplicate remote mounts because process_context_paths resets its
upload index per call and the code only rewrites entries in config["agents"],
ignoring a top-level "agent" dict; fix by (1) introducing a shared upload index
counter used across all process_context_paths calls (e.g., create upload_index =
0 before processing orchestrator and agents, pass it into process_context_paths
via a new optional start_index argument or update process_context_paths to
accept/return the next index so each call continues the index), and (2) ensure
you also handle a top-level "agent" dict in addition to config["agents"] by
checking config_copy.get("agent") and rewriting its backend.context_paths using
the same shared index so all uploads get unique remote mounts.
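One way to thread the index, sketched under the assumption that process_context_paths maps each local path to a numbered remote mount (the mount layout shown is illustrative, not the real one in massgen/cloud/utils.py):

```python
def process_context_paths(context_paths, cloud_job_id, start_index=0,
                          mount_root="/context"):
    """Rewrite local context paths to remote mounts, continuing a shared index.

    Returns the rewritten paths plus the next free index, so successive calls
    (orchestrator, each agent, a top-level "agent" dict) never collide.
    """
    rewritten = [
        f"{mount_root}/{cloud_job_id}/upload_{start_index + i}"
        for i in range(len(context_paths))
    ]
    return rewritten, start_index + len(context_paths)


# Caller threads the index through every section of the config.
upload_index = 0
orch, upload_index = process_context_paths(["/data/docs"], "job42", upload_index)
agent, upload_index = process_context_paths(["/data/code"], "job42", upload_index)
```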
🧹 Nitpick comments (5)
massgen/cloud/modal_app.py (2)

64-76: Expand docstring for run_massgen_job.

The function has a minimal docstring. Per coding guidelines, include Args and Returns sections.

📝 Suggested docstring
 @app.function(...)
 def run_massgen_job(payload_b64: str) -> dict:
-    """Local entrypoint for Modal job."""
+    """Execute a MassGen job in the Modal container.
+
+    Args:
+        payload_b64: Base64-encoded JSON containing prompt, config_yaml,
+            timeout_seconds, and cloud_job_id.
+
+    Returns:
+        A dict with status, final_answer, error info, remote_log_dir,
+        and artifacts_tar_gz_b64.
+    """
     payload: dict[str, object] = json.loads(base64.b64decode(payload_b64).decode("utf-8"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 64 - 76, Update the minimal
docstring for run_massgen_job to a full docstring that documents the function
purpose and includes an Args section describing payload_b64 (a base64-encoded
JSON payload containing keys like "prompt", "config_yaml", and optional
"cloud_job_id") and a Returns section describing the returned dict structure
(what keys/values callers can expect, e.g., status, output file path or error
info). Place the expanded docstring immediately beneath the def
run_massgen_job(...) line and ensure it remains a plain triple-quoted string
describing inputs, outputs, and any side effects (creates
/tmp/massgen_cloud_job, writes final_answer.txt).

162-172: Consider handling the case where make_tar_gz_b64 fails.

If artifact creation fails (e.g., disk full, permission issues), this will raise an unhandled exception. Consider wrapping in a try/except to ensure a proper error result is returned.

🛡️ Suggested defensive handling
+    try:
+        artifacts_b64 = make_tar_gz_b64(artifact_root)
+    except Exception as e:
+        artifacts_b64 = ""
+        # Log but don't fail the job for artifact packaging issues
+        print(f"[stderr] Warning: failed to package artifacts: {e}", flush=True)
+
     result = {
         "status": "ok" if returncode == 0 else "error",
         "timed_out": False,
         "error": None if returncode == 0 else f"massgen exited with code {returncode}",
         "final_answer": final_answer,
         "remote_log_dir": log_dir,
-        "artifacts_tar_gz_b64": make_tar_gz_b64(artifact_root),
+        "artifacts_tar_gz_b64": artifacts_b64,
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 162 - 172, Wrap the call to
make_tar_gz_b64(artifact_root) in a try/except inside the code that builds the
result dict (the block that creates result, result_marker, prints and returns
it) so that any exception during artifact creation is caught; on exception set
result["status"]="error", result["artifacts_tar_gz_b64"]=None, result["error"]
to a descriptive message including the exception (and include exception
details), keep "timed_out" as appropriate, still print the result_marker +
json.dumps(result) and return result; reference symbols: make_tar_gz_b64,
result, result_marker, cloud_job_id.
massgen/cloud/modal_launcher.py (1)

14-19: Add Google-style docstrings to class and __init__ method.

The class has a brief docstring, but the __init__ method lacks documentation. Per coding guidelines, new or changed functions should include Google-style docstrings.

📝 Suggested docstrings
 class ModalCloudJobLauncher(CloudJobLauncher):
     """Launch MassGen jobs in Modal and materialize outputs locally."""

     def __init__(self, workspace_root: Path | None = None):
+        """Initialize the Modal cloud job launcher.
+
+        Args:
+            workspace_root: Optional directory for storing job artifacts.
+                Defaults to .massgen/cloud_jobs in the current directory.
+        """
         super().__init__(workspace_root)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_launcher.py` around lines 14 - 19, Add Google-style
docstrings for the ModalCloudJobLauncher class and its __init__ method: update
the class-level docstring for ModalCloudJobLauncher to a full Google-style
docstring describing the class purpose, attributes (e.g., workspace_root: Path |
None) and behavior, and add a Google-style docstring to the __init__ method that
documents parameters, types, default values, and any side effects (calls
super().__init__). Make sure the docstrings use the Google docstring sections
(Args, Attributes, Returns if applicable) and reference the workspace_root
parameter and inheritance from CloudJobLauncher.
massgen/cloud/cloud_job.py (1)

33-44: Consider making CloudJobLauncher an abstract base class.

The launch method raises NotImplementedError, which is the pattern for an abstract method. Using abc.ABC and @abstractmethod makes the contract explicit and provides better IDE/type-checker support.

♻️ Suggested refactor
+from abc import ABC, abstractmethod
 import json
 from dataclasses import dataclass
 from pathlib import Path


-class CloudJobLauncher:
+class CloudJobLauncher(ABC):
     """Interface for launching cloud jobs."""

     RESULT_MARKER = "__MASSGEN_CLOUD_JOB_RESULT__"

     def __init__(self, workspace_root: Path | None = None):
         base = workspace_root or (Path.cwd() / ".massgen" / "cloud_jobs")
         self.workspace_root = base
         self.workspace_root.mkdir(parents=True, exist_ok=True)

+    @abstractmethod
     def launch(self, request: CloudJobRequest) -> CloudJobResult:
-        raise NotImplementedError
+        """Launch a cloud job and return the result.
+
+        Args:
+            request: The cloud job request containing prompt and config.
+
+        Returns:
+            The result of the cloud job execution.
+        """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/cloud_job.py` around lines 33 - 44, Convert CloudJobLauncher
into an explicit abstract base class by importing abc and making it inherit from
abc.ABC (class CloudJobLauncher(abc.ABC):), then decorate the launch method with
`@abc.abstractmethod` and keep its signature def launch(self, request:
CloudJobRequest) -> CloudJobResult: so implementers must override it; leave
RESULT_MARKER and __init__ unchanged but ensure you add the necessary import for
abc at the top.
massgen/cli.py (1)

463-464: Add a full Google-style docstring for _run_cloud_job().

This new entry point has parameters, side effects, and failure modes, so the summary-only docstring is below the repo standard.

As per coding guidelines "For new or changed functions, include Google-style docstrings"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 463 - 464, Add a full Google-style docstring to
_run_cloud_job(args, config, config_path_label) describing what the function
does, its parameters (args: argparse.Namespace, config: dict[str, Any],
config_path_label: str | None), the side effects (launches a Modal cloud job and
materializes results locally, logs, and may write files), return value (None),
and possible exceptions/failure modes (e.g., Modal API errors, file I/O errors)
and any important notes about blocking behavior or retries; place this docstring
immediately under the def _run_cloud_job(...) line and follow Google docstring
sections: Args, Returns, Raises, and Examples/Notes as appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/cli.py`:
- Around line 9510-9520: The early return after calling _run_cloud_job bypasses
the later finally block that calls SessionRegistry.complete_session and leaves
memory_session_id active; remove the immediate return (or otherwise ensure
control flows through to the existing finally) so that after _run_cloud_job
returns the shared session finalization logic
(SessionRegistry.complete_session(...)) is executed for cloud runs; locate the
cloud-path block using _run_cloud_job and memory_session_id to make this change.

In `@massgen/cloud/modal_app.py`:
- Around line 35-36: CONTEXT_VOLUME_NAME and CONTEXT_MOUNT_PATH are duplicated
here and in utils.py; replace the local definitions by importing the single
source of truth from massgen.cloud.utils (i.e., remove the local assignments of
CONTEXT_VOLUME_NAME and CONTEXT_MOUNT_PATH in modal_app.py and add an import:
from .utils import CONTEXT_VOLUME_NAME, CONTEXT_MOUNT_PATH) so both modules use
the same constants and avoid drift between upload and mount paths.

In `@massgen/cloud/modal_launcher.py`:
- Around line 94-101: The extract_artifacts function is currently calling
tarfile.extractall() without validating member paths, which allows path
traversal; update extract_artifacts to safely validate each TarInfo name (reject
absolute paths and any names that, when joined with the target artifacts_dir,
escape that directory using os.path.commonpath) before extraction, or if running
on Python 3.11+, use tarfile.open(...).extractall(members=...,
filter=safe_filter) where safe_filter rejects unsafe members; ensure
extract_artifacts is referenced from modal_launcher.py (where tar_b64 is passed)
and raise/skip on any unsafe member to prevent writing outside artifacts_dir.

---

Duplicate comments:
In `@massgen/cli.py`:
- Around line 9510-9520: Before calling _run_cloud_job when getattr(args,
"cloud", False) is true, validate that the provided config is single-agent;
detect multi-agent definitions (e.g., config.agents or
config.get("agents")/config.get("multi_agent") and check length > 1) and abort
early with a clear logger.error and exit/raise (instead of proceeding). Add this
check in the same cloud branch (before enabling args.automation and
_run_cloud_job) so multi-agent configs are rejected immediately with a helpful
message referencing cloud being single-agent only.
- Around line 9947-9952: The current handler imports CloudJobError inside the
exception path which can itself raise and hide the original exception; instead
ensure the classification cannot mask the root error by moving the import of
CloudJobError out of the exception handler (import CloudJobError from
massgen.cloud.modal_launcher at module top) or, if that import may fail, wrap
that import in a try/except at module init and set a sentinel (e.g.,
CLOUD_JOB_ERROR_AVAILABLE) so the handler only checks isinstance(e,
CloudJobError) when the import succeeded; keep the existing behavior of printing
"❌ Timeout error: {e}" and calling sys.exit(EXIT_TIMEOUT) only when the sentinel
is true and the instance check passes.
- Around line 480-500: The orchestrator/agent path rewriting can produce
duplicate remote mounts because process_context_paths resets its upload index
per call and the code only rewrites entries in config["agents"], ignoring a
top-level "agent" dict; fix by (1) introducing a shared upload index counter
used across all process_context_paths calls (e.g., create upload_index = 0
before processing orchestrator and agents, pass it into process_context_paths
via a new optional start_index argument or update process_context_paths to
accept/return the next index so each call continues the index), and (2) ensure
you also handle a top-level "agent" dict in addition to config["agents"] by
checking config_copy.get("agent") and rewriting its backend.context_paths using
the same shared index so all uploads get unique remote mounts.

---

Nitpick comments:
In `@massgen/cli.py`:
- Around line 463-464: Add a full Google-style docstring to _run_cloud_job(args,
config, config_path_label) describing what the function does, its parameters
(args: argparse.Namespace, config: dict[str, Any], config_path_label: str |
None), the side effects (launches a Modal cloud job and materializes results
locally, logs, and may write files), return value (None), and possible
exceptions/failure modes (e.g., Modal API errors, file I/O errors) and any
important notes about blocking behavior or retries; place this docstring
immediately under the def _run_cloud_job(...) line and follow Google docstring
sections: Args, Returns, Raises, and Examples/Notes as appropriate.

In `@massgen/cloud/cloud_job.py`:
- Around line 33-44: Convert CloudJobLauncher into an explicit abstract base
class by importing abc and making it inherit from abc.ABC (class
CloudJobLauncher(abc.ABC):), then decorate the launch method with
`@abc.abstractmethod` and keep its signature def launch(self, request:
CloudJobRequest) -> CloudJobResult: so implementers must override it; leave
RESULT_MARKER and __init__ unchanged but ensure you add the necessary import for
abc at the top.

In `@massgen/cloud/modal_app.py`:
- Around line 64-76: Update the minimal docstring for run_massgen_job to a full
docstring that documents the function purpose and includes an Args section
describing payload_b64 (a base64-encoded JSON payload containing keys like
"prompt", "config_yaml", and optional "cloud_job_id") and a Returns section
describing the returned dict structure (what keys/values callers can expect,
e.g., status, output file path or error info). Place the expanded docstring
immediately beneath the def run_massgen_job(...) line and ensure it remains a
plain triple-quoted string describing inputs, outputs, and any side effects
(creates /tmp/massgen_cloud_job, writes final_answer.txt).
- Around line 162-172: Wrap the call to make_tar_gz_b64(artifact_root) in a
try/except inside the code that builds the result dict (the block that creates
result, result_marker, prints and returns it) so that any exception during
artifact creation is caught; on exception set result["status"]="error",
result["artifacts_tar_gz_b64"]=None, result["error"] to a descriptive message
including the exception (and include exception details), keep "timed_out" as
appropriate, still print the result_marker + json.dumps(result) and return
result; reference symbols: make_tar_gz_b64, result, result_marker, cloud_job_id.

In `@massgen/cloud/modal_launcher.py`:
- Around line 14-19: Add Google-style docstrings for the ModalCloudJobLauncher
class and its __init__ method: update the class-level docstring for
ModalCloudJobLauncher to a full Google-style docstring describing the class
purpose, attributes (e.g., workspace_root: Path | None) and behavior, and add a
Google-style docstring to the __init__ method that documents parameters, types,
default values, and any side effects (calls super().__init__). Make sure the
docstrings use the Google docstring sections (Args, Attributes, Returns if
applicable) and reference the workspace_root parameter and inheritance from
CloudJobLauncher.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f965933c-5837-4d98-9004-b0294d377554

📥 Commits

Reviewing files that changed from the base of the PR and between 44d82ca and ff5a6d6.

📒 Files selected for processing (4)
  • massgen/cli.py
  • massgen/cloud/cloud_job.py
  • massgen/cloud/modal_app.py
  • massgen/cloud/modal_launcher.py

Comment on lines +9510 to +9520
# Cloud execution path (Modal MVP)
if getattr(args, "cloud", False):
if not args.automation:
logger.info("Cloud mode requires automation output; enabling --automation")
args.automation = True
_run_cloud_job(
args=args,
config=config,
config_path_label=str(resolved_path) if resolved_path else None,
)
return

⚠️ Potential issue | 🟡 Minor

This early return skips shared session finalization.

Cloud runs register memory_session_id above, but returning here bypasses the later finally block that calls SessionRegistry.complete_session(...). Successful cloud runs can therefore remain stuck in an active state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 9510 - 9520, The early return after calling
_run_cloud_job bypasses the later finally block that calls
SessionRegistry.complete_session and leaves memory_session_id active; remove the
immediate return (or otherwise ensure control flows through to the existing
finally) so that after _run_cloud_job returns the shared session finalization
logic (SessionRegistry.complete_session(...)) is executed for cloud runs; locate
the cloud-path block using _run_cloud_job and memory_session_id to make this
change.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
massgen/cloud/modal_app.py (2)

58-64: ⚠️ Potential issue | 🟠 Major

timeout_seconds still isn't enforced.

Line 60 hardcodes a 1-hour Modal timeout, and this function never reads the serialized timeout_seconds from the request. A caller asking for a shorter timeout will keep running until the child exits or Modal kills the container at one hour, so the public API currently advertises behavior it doesn't provide.

Also applies to: 64-71, 97-135
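A sketch of honoring the serialized value. The real worker streams output via Popen, so the closer fix there is a deadline loop around proc.poll(), but the contract is the same; run_with_timeout is an illustrative name:

```python
import subprocess
import sys


def run_with_timeout(cmd: list, timeout_seconds: int) -> dict:
    """Run the child and report a timeout instead of waiting for Modal's cap."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_seconds)
        return {"timed_out": False, "returncode": proc.returncode}
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "returncode": None}


fast = run_with_timeout([sys.executable, "-c", "print('done')"], 30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(10)"], 1)
```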


39-45: ⚠️ Potential issue | 🔴 Critical

The Modal worker is still running the published massgen, not this PR's code.

Lines 86-95 execute the massgen console script inside the image, but Lines 39-45 only install massgen from pip. That means the cloud job can silently run whatever version is on PyPI instead of the code under review here. Add the local package source to the image and invoke that copy.

#!/bin/bash
set -euo pipefail

echo "== modal_app image build and child command =="
sed -n '39,95p' massgen/cloud/modal_app.py

echo
echo "== search for local-source injection vs installed-package execution =="
rg -n --type=py 'add_local_python_source|add_local_dir|pip_install\("massgen"|python -m massgen\.cli|^\s*"massgen",$' massgen/cloud/modal_app.py

Expected result: .pip_install("massgen") is present, add_local_python_source/add_local_dir are absent, and the child command starts with "massgen".

Also applies to: 86-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 39 - 45, The image currently only
installs the published massgen via
modal.Image.debian_slim(...).pip_install("massgen", ...) so the cloud job may
run the PyPI package; add the local source to the image (use
image.add_local_python_source(".") or image.add_local_dir("massgen") after
creating the image) and keep pip_install for external deps (e.g., "fastmcp",
"uv"), and change the child command that currently invokes the console script
(the lines that run "massgen" / python -m massgen.cli) to run the local copy
explicitly (e.g., python -m massgen.cli) so the PR code is executed instead of
the published package.
massgen/cloud/modal_launcher.py (1)

24-39: ⚠️ Potential issue | 🟠 Major

Stop sending the cloud request body on argv.

payload_b64 still contains the full prompt/config and is passed as a literal modal run ... --payload-b64 ... argument. That keeps user data visible to local process inspection and makes large prompts/configs fail once argv limits are hit. Pass only a handle here and let the Modal side load the payload from a safer transport.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_launcher.py` around lines 24 - 39, The code is embedding
the full request payload into argv via payload_b64 (used when building cmd and
modal_entrypoint::run_massgen_job), exposing sensitive data and hitting argv
limits; instead generate a unique handle (e.g., uuid4), persist the payload
keyed by that handle to a temporary/safe storage (e.g., a temp file or shared
object store) before building cmd, and pass only the handle (not payload_b64) to
run_massgen_job; update the Modal-side loader (run_massgen_job) to accept the
handle and load the payload from the same storage by that handle.
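The handle-based shape, sketched with a local directory standing in for whatever transport both sides can actually reach; on Modal that would be, e.g., a shared Volume, which this sketch does not implement:

```python
import json
import uuid
from pathlib import Path


def store_payload(payload: dict, store: Path) -> str:
    """Persist the payload under a random handle; only the handle rides argv."""
    handle = uuid.uuid4().hex
    (store / f"{handle}.json").write_text(json.dumps(payload), encoding="utf-8")
    return handle


def load_payload(handle: str, store: Path) -> dict:
    """Modal-side loader: resolve a handle back to the original payload."""
    return json.loads((store / f"{handle}.json").read_text(encoding="utf-8"))
```

The launcher would then pass only `--payload-handle <hex>` on argv, keeping prompts and configs out of process listings and away from argv length limits.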
🧹 Nitpick comments (2)
massgen/tests/test_cloud_modal_launcher.py (1)

52-55: Assert the Popen command, not just the returned mock.

These lambdas ignore the argv entirely, so the tests still pass if launch() stops calling modal run, drops --payload-b64, or points at the wrong entrypoint. Recording the call and asserting the command shape would make these tests protect the launcher contract.

Example pattern
+    popen_calls = []
+
+    def fake_popen(*args, **kwargs):
+        popen_calls.append((args, kwargs))
+        return mock_proc
+
     monkeypatch.setattr(
         "massgen.cloud.modal_launcher.subprocess.Popen",
-        lambda *a, **k: mock_proc,
+        fake_popen,
     )
@@
+    cmd = popen_calls[0][0][0]
+    assert cmd[0:2] == ["modal", "run"]
+    assert "--payload-b64" in cmd

Also applies to: 86-89

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/tests/test_cloud_modal_launcher.py` around lines 52 - 55, The current
monkeypatch stubs modal_launcher.subprocess.Popen with a lambda that ignores its
argv; replace it with a callable that records its positional args/kwargs (or use
unittest.mock.Mock/patch) so tests can assert the actual command invoked by
launch(), e.g., verify the argv contains "modal", "run", the correct entrypoint,
and "--payload-b64", and that subprocess.Popen was called with the expected list;
update both occurrences (the stub at the current block and the similar one around
lines 86-89) to assert the command shape rather than only returning mock_proc, so
the launcher contract is enforced.
massgen/cloud/modal_app.py (1)

170-171: Use the shared result marker constant here.

The launcher parses CloudJobLauncher.RESULT_MARKER, but this side hardcodes the literal again. Keeping both copies in sync is brittle; reuse the shared constant so a future rename doesn't silently break parsing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_app.py` around lines 170 - 171, The code hardcodes the
result marker string when printing job results; replace the inline construction
with the shared constant CloudJobLauncher.RESULT_MARKER so the printed marker
uses that constant plus cloud_job_id and the JSON payload. Update where
result_marker is built (currently using
f"__MASSGEN_CLOUD_JOB_RESULT__{cloud_job_id}_") to use
CloudJobLauncher.RESULT_MARKER combined with cloud_job_id and then print that
marker with json.dumps(result).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/cloud/modal_launcher.py`:
- Around line 94-101: The job directory is reused due to job_dir.mkdir(...,
exist_ok=True), which can leave stale files (e.g., events.jsonl, log_dir) from
previous runs; before calling extract_artifacts, detect if job_dir (and
artifacts_dir) already exist and fully recreate them (e.g., remove with
shutil.rmtree or clear their contents) and then mkdir fresh so extract_artifacts
writes into a clean directory; update the logic around job_dir, artifacts_dir,
and the extract_artifacts call to ensure any existing job_<id> is removed and
re-created when starting a new run (or alternatively generate a truly unique
directory when CloudJobRequest.cloud_job_id is empty).
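A sketch of the clean-recreate step; fresh_job_dir is an assumed helper name, and the artifacts subdirectory mirrors the layout this review describes:

```python
import shutil
import uuid
from pathlib import Path


def fresh_job_dir(workspace_root: Path, cloud_job_id: str = "") -> Path:
    """Create a clean job_<id> directory, dropping any stale prior run."""
    job_id = cloud_job_id or uuid.uuid4().hex
    job_dir = workspace_root / f"job_{job_id}"
    if job_dir.exists():
        shutil.rmtree(job_dir)  # remove stale events.jsonl, log_dir, etc.
    (job_dir / "artifacts").mkdir(parents=True)
    return job_dir
```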

---

Duplicate comments:
In `@massgen/cloud/modal_app.py`:
- Around line 39-45: The image currently only installs the published massgen via
modal.Image.debian_slim(...).pip_install("massgen", ...) so the cloud job may
run the PyPI package; add the local source to the image (use
image.add_local_python_source(".") or image.add_local_dir("massgen") after
creating the image) and keep pip_install for external deps (e.g., "fastmcp",
"uv"), and change the child command that currently invokes the console script
(the lines that run "massgen" / python -m massgen.cli) to run the local copy
explicitly (e.g., python -m massgen.cli) so the PR code is executed instead of
the published package.

In `@massgen/cloud/modal_launcher.py`:
- Around line 24-39: The code is embedding the full request payload into argv
via payload_b64 (used when building cmd and modal_entrypoint::run_massgen_job),
exposing sensitive data and hitting argv limits; instead generate a unique
handle (e.g., uuid4), persist the payload keyed by that handle to a
temporary/safe storage (e.g., a temp file or shared object store) before
building cmd, and pass only the handle (not payload_b64) to run_massgen_job;
update the Modal-side loader (run_massgen_job) to accept the handle and load the
payload from the same storage by that handle.
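A minimal sketch of the handle-based approach described above, assuming file-based storage and a `--payload-handle` flag; neither is the actual MassGen implementation:

```python
import json
import uuid
from pathlib import Path


def persist_payload(payload: dict, storage_dir: Path) -> str:
    """Write the payload to storage and return a short opaque handle."""
    handle = uuid.uuid4().hex
    storage_dir.mkdir(parents=True, exist_ok=True)
    (storage_dir / f"{handle}.json").write_text(json.dumps(payload))
    return handle


def build_cmd(handle: str) -> list[str]:
    # argv carries only the opaque handle, never the (possibly large
    # or secret-bearing) payload itself.
    return [
        "modal", "run", "modal_app.py::run_massgen_job",
        "--payload-handle", handle,
    ]


def load_payload(handle: str, storage_dir: Path) -> dict:
    # Modal-side loader: resolve the handle back to the payload.
    return json.loads((storage_dir / f"{handle}.json").read_text())
```

This keeps secrets out of `ps` output and sidesteps platform argv-length limits.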

---

Nitpick comments:
In `@massgen/cloud/modal_app.py`:
- Around line 170-171: The code hardcodes the result marker string when printing
job results; replace the inline construction with the shared constant
CloudJobLauncher.RESULT_MARKER so the printed marker uses that constant plus
cloud_job_id and the JSON payload. Update where result_marker is built
(currently using f"__MASSGEN_CLOUD_JOB_RESULT__{cloud_job_id}_") to use
CloudJobLauncher.RESULT_MARKER combined with cloud_job_id and then print that
marker with json.dumps(result).

In `@massgen/tests/test_cloud_modal_launcher.py`:
- Around line 52-55: Replace the current monkeypatch that stubs
modal_launcher.subprocess.Popen with a lambda that ignores its argv; instead
patch Popen with a callable that records its positional args/kwargs (or use
unittest.mock.Mock/patch) so tests can assert the actual command invoked by
launch(), e.g., verify the argv contains "modal", "run", the correct entrypoint
and "--payload-b64" and that subprocess.Popen was called with the expected list;
update both occurrences (the stub at the current block and the similar one
around lines 86-89) to assert the command shape rather than only returning
mock_proc so the launcher contract is enforced.
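The suggested test tightening can be sketched with `unittest.mock`, which records the argv passed to `subprocess.Popen`; the `fake_launch` stand-in and its argument names mirror the review comment and are assumptions:

```python
from unittest import mock


def fake_launch(payload_b64: str) -> None:
    # Stand-in for modal_launcher.launch(); the real code shells out to Modal.
    import subprocess
    subprocess.Popen(
        ["modal", "run", "modal_app.py::run_massgen_job",
         "--payload-b64", payload_b64],
    )


with mock.patch("subprocess.Popen") as popen:
    fake_launch("eyJxIjoiaGkifQ==")

# Assert the command shape, not just that a process object came back.
argv = popen.call_args.args[0]
assert argv[:2] == ["modal", "run"]
assert "--payload-b64" in argv
```

Unlike a lambda that ignores its argv, this fails loudly if the launcher's command line drifts.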

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fc559b5f-a463-4c1c-9125-e0e3156155b5

📥 Commits

Reviewing files that changed from the base of the PR and between ff5a6d6 and 95549f0.

📒 Files selected for processing (4)
  • massgen/cloud/cloud_job.py
  • massgen/cloud/modal_app.py
  • massgen/cloud/modal_launcher.py
  • massgen/tests/test_cloud_modal_launcher.py

Comment on lines +94 to +101
job_dir = self.workspace_root / f"job_{request.cloud_job_id}"
job_dir.mkdir(parents=True, exist_ok=True)
artifacts_dir = job_dir / "artifacts"
artifacts_dir.mkdir(parents=True, exist_ok=True)

tar_b64 = marker_payload.get("artifacts_tar_gz_b64")
if tar_b64:
extract_artifacts(tar_b64, artifacts_dir)
⚠️ Potential issue | 🟠 Major

Recreate job_<id> before extracting artifacts.

This directory is reused with exist_ok=True, so a retry with the same cloud_job_id can keep events.jsonl or log_dir from an older run when the new run doesn't produce them. That becomes especially easy to hit if a caller reuses IDs or leaves CloudJobRequest.cloud_job_id empty.

Possible fix
 import base64
 import json
+import shutil
 import subprocess
 import threading
 from pathlib import Path
@@
-        job_dir = self.workspace_root / f"job_{request.cloud_job_id}"
-        job_dir.mkdir(parents=True, exist_ok=True)
+        job_dir = self.workspace_root / f"job_{request.cloud_job_id}"
+        if job_dir.exists():
+            shutil.rmtree(job_dir)
+        job_dir.mkdir(parents=True, exist_ok=False)
         artifacts_dir = job_dir / "artifacts"
-        artifacts_dir.mkdir(parents=True, exist_ok=True)
+        artifacts_dir.mkdir(parents=True, exist_ok=False)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/modal_launcher.py` around lines 94 - 101, The job directory is
reused due to job_dir.mkdir(..., exist_ok=True), which can leave stale files
(e.g., events.jsonl, log_dir) from previous runs; before calling
extract_artifacts, detect if job_dir (and artifacts_dir) already exist and fully
recreate them (e.g., remove with shutil.rmtree or clear their contents) and then
mkdir fresh so extract_artifacts writes into a clean directory; update the logic
around job_dir, artifacts_dir, and the extract_artifacts call to ensure any
existing job_<id> is removed and re-created when starting a new run (or
alternatively generate a truly unique directory when
CloudJobRequest.cloud_job_id is empty).


coderabbitai bot commented Mar 9, 2026

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #989

coderabbitai bot added a commit that referenced this pull request Mar 9, 2026
Docstrings generation was requested by @int-chaos.

* #982 (comment)

The following files were modified:

* `massgen/cli.py`
* `massgen/cloud/cloud_job.py`
* `massgen/cloud/modal_app.py`
* `massgen/cloud/modal_launcher.py`
* `massgen/cloud/utils.py`
* `massgen/tests/test_cloud_modal_launcher.py`
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
massgen/cloud/utils.py (1)

26-29: ⚠️ Potential issue | 🔴 Critical

Reject unsafe tar members instead of calling extractall().

Line 29 still extracts untrusted archive members verbatim. A crafted artifact bundle can escape out_dir or materialize links/devices on the local machine.

🛡️ Safer extraction sketch
 def extract_artifacts(tar_b64: str, out_dir: Path) -> None:
     data = base64.b64decode(tar_b64.encode("utf-8"))
-    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
-        tf.extractall(out_dir)
+    out_root = out_dir.resolve()
+    out_root.mkdir(parents=True, exist_ok=True)
+    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
+        for member in tf.getmembers():
+            if not (member.isdir() or member.isfile()):
+                raise ValueError(
+                    f"Refusing to extract artifact member: {member.name}"
+                )
+            target = (out_root / member.name).resolve()
+            target.relative_to(out_root)
+            if member.isdir():
+                target.mkdir(parents=True, exist_ok=True)
+                continue
+            target.parent.mkdir(parents=True, exist_ok=True)
+            extracted = tf.extractfile(member)
+            if extracted is None:
+                raise ValueError(f"Could not read artifact member: {member.name}")
+            with extracted, target.open("wb") as fh:
+                fh.write(extracted.read())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/utils.py` around lines 26 - 29, The extract_artifacts function
currently calls tf.extractall(out_dir) which is unsafe for untrusted archives;
update extract_artifacts(tar_b64, out_dir) to validate each TarInfo before
extracting: open the tar as now, iterate tf.getmembers(), skip any member with
absolute paths (name starts with '/'), any path component containing '..', and
any special entries (member.issym(), member.islnk(), member.isdev()), compute
the target path as (out_dir / member.name).resolve() and ensure it has
out_dir.resolve() as a prefix before writing; for regular files use
tf.extractfile(member) and write bytes to the target path creating parent dirs
and preserving permissions, and for directories create them; log or raise on
rejected members instead of extracting them.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/cloud/utils.py`:
- Line 11: Add Google-style docstrings to parse_automation_value and the three
other new helper functions defined below it (the other helpers at the subsequent
block starting around lines 17, 26 and the group at 35-39). For each function
include a one-line summary, Args with parameter names and types, Returns with
type and meaning (use "-> str | None" style semantics), and raise conditions if
any; keep examples or notes brief if helpful. Ensure the docstrings follow
Google style and accurately describe function behavior, inputs (label,
stderr_text, etc.), return values, and any edge cases so the cloud-path and
artifact contracts are discoverable.
- Around line 35-39: The function process_context_paths currently allows an
empty cloud_job_id which causes multiple jobs to share the same upload prefixes
and can produce leading-slash keys; fix by validating cloud_job_id at the start
of process_context_paths (and the code paths that build the shared volume path)
to reject empty or whitespace-only values (raise ValueError or similar) and
normalize the id (strip leading/trailing slashes) before using it to construct
upload prefixes so keys become "<job_id>/0/..." rather than "/0/..."; ensure any
callers in tests/cli continue to pass non-empty ids or are updated to satisfy
the new validation.
- Around line 35-36: The function process_context_paths currently types
context_paths as list[dict[str, Any]] but its implementation accepts and
converts non-dict entries to strings; update the parameter annotation to
list[str | dict[str, Any]] (or list[Union[str, dict[str, Any]]]) so the
signature matches the behavior. Modify the definition of process_context_paths
to use the union type for context_paths and run type checks/linters to ensure no
other dependent call sites break. Ensure imports for typing (Union) are added if
your codebase uses Python <3.10.

---

Duplicate comments:
In `@massgen/cloud/utils.py`:
- Around line 26-29: The extract_artifacts function currently calls
tf.extractall(out_dir) which is unsafe for untrusted archives; update
extract_artifacts(tar_b64, out_dir) to validate each TarInfo before extracting:
open the tar as now, iterate tf.getmembers(), skip any member with absolute
paths (name starts with '/'), any path component containing '..', and any
special entries (member.issym(), member.islnk(), member.isdev()), compute the
target path as (out_dir / member.name).resolve() and ensure it has
out_dir.resolve() as a prefix before writing; for regular files use
tf.extractfile(member) and write bytes to the target path creating parent dirs
and preserving permissions, and for directories create them; log or raise on
rejected members instead of extracting them.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 799c204a-dc8f-48dd-b828-d00740f18b4a

📥 Commits

Reviewing files that changed from the base of the PR and between 95549f0 and d84a1b6.

📒 Files selected for processing (2)
  • massgen/cloud/__init__.py
  • massgen/cloud/utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • massgen/cloud/__init__.py

from .modal_app import CONTEXT_MOUNT_PATH, CONTEXT_VOLUME_NAME


def parse_automation_value(label: str, stderr_text: str) -> str | None:
⚠️ Potential issue | 🟡 Minor

Add Google-style docstrings to these new helpers.

All four new functions are currently undocumented, which makes the cloud-path and artifact contracts harder to discover and violates the repo's Python guidelines.

As per coding guidelines, "**/*.py: For new or changed functions, include Google-style docstrings".

Also applies to: 17-17, 26-26, 35-39

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/utils.py` at line 11, Add Google-style docstrings to
parse_automation_value and the three other new helper functions defined below it
(the other helpers at the subsequent block starting around lines 17, 26 and the
group at 35-39). For each function include a one-line summary, Args with
parameter names and types, Returns with type and meaning (use "-> str | None"
style semantics), and raise conditions if any; keep examples or notes brief if
helpful. Ensure the docstrings follow Google style and accurately describe
function behavior, inputs (label, stderr_text, etc.), return values, and any
edge cases so the cloud-path and artifact contracts are discoverable.

Comment on lines +35 to +39
def process_context_paths(
context_paths: list[dict[str, Any]],
mount_point: str = CONTEXT_MOUNT_PATH,
cloud_job_id: str = "",
) -> list[dict[str, Any]]:
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== process_context_paths call sites =="
rg -n -C3 '\bprocess_context_paths\s*\(' massgen --type py

echo
echo "== cloud_job_id plumbing =="
rg -n -C3 '\bcloud_job_id\b' massgen --type py

Repository: massgen/MassGen

Length of output: 12524


Require a non-empty per-job upload prefix.

cloud_job_id defaults to "", but lines 67 and 70 use it to build the shared volume path. Callers that omit the ID will all upload into the same 0/, 1/, ... prefixes, so concurrent jobs can overwrite each other's context. The empty default also produces absolute-looking upload keys like "/0/...".

🔧 Minimal hardening
 def process_context_paths(
     context_paths: list[dict[str, Any]],
     mount_point: str = CONTEXT_MOUNT_PATH,
-    cloud_job_id: str = "",
+    cloud_job_id: str = "",
 ) -> list[dict[str, Any]]:
     import modal
+
+    if not cloud_job_id:
+        raise ValueError("cloud_job_id must be non-empty")
 
     vol = modal.Volume.from_name(CONTEXT_VOLUME_NAME, create_if_missing=True)
     rewritten: list[dict[str, Any]] = []
@@
-            if src.is_file():
-                batch.put_file(str(src), f"{cloud_job_id}/{idx}/{src.name}")
+            upload_prefix = f"{cloud_job_id}/{idx}"
+            if src.is_file():
+                batch.put_file(str(src), f"{upload_prefix}/{src.name}")
                 remote_path = str(remote_root / src.name)
             else:
-                batch.put_directory(str(src), f"{cloud_job_id}/{idx}")
+                batch.put_directory(str(src), upload_prefix)
                 remote_path = str(remote_root)

Current call sites (lines 484, 496 in cli.py and 138 in tests) all pass non-empty IDs, so this change is safe.

Also applies to: 64-70

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/utils.py` around lines 35 - 39, The function
process_context_paths currently allows an empty cloud_job_id which causes
multiple jobs to share the same upload prefixes and can produce leading-slash
keys; fix by validating cloud_job_id at the start of process_context_paths (and
the code paths that build the shared volume path) to reject empty or
whitespace-only values (raise ValueError or similar) and normalize the id (strip
leading/trailing slashes) before using it to construct upload prefixes so keys
become "<job_id>/0/..." rather than "/0/..."; ensure any callers in tests/cli
continue to pass non-empty ids or are updated to satisfy the new validation.

Comment on lines +35 to +36
def process_context_paths(
context_paths: list[dict[str, Any]],
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C3 '\bprocess_context_paths\s*\(' massgen --type py

Repository: massgen/MassGen

Length of output: 1978


🏁 Script executed:

cat -n massgen/cloud/utils.py | head -70

Repository: massgen/MassGen

Length of output: 2794


🏁 Script executed:

# Check test case
sed -n '130,150p' massgen/tests/test_cloud_modal_launcher.py

# Check cli.py usage context
sed -n '475,505p' massgen/cli.py

Repository: massgen/MassGen

Length of output: 2200


🏁 Script executed:

# Search for any examples or tests using string context_paths
rg -n 'context_paths.*=' massgen --type py -B2 -A2 | grep -E '(context_paths|^\d+.*\[)' | head -30

# Check if there are any other tests for process_context_paths
rg -n 'process_context_paths' massgen/tests --type py -B3 -A3

Repository: massgen/MassGen

Length of output: 5609


🏁 Script executed:

# Check test_programmatic_api.py to see if those string context_paths are actually passed to process_context_paths
sed -n '85,105p' massgen/tests/test_programmatic_api.py

# Also check if there's actual string usage in config files or examples
rg -n 'context_paths\s*:\s*\[' massgen --type py -A3 | head -40

Repository: massgen/MassGen

Length of output: 959


🏁 Script executed:

# Verify the test_programmatic_api test actually calls process_context_paths indirectly through build_config
rg -n 'def build_config' massgen --type py -A15 | head -40

# Check if build_config uses process_context_paths
rg -n 'process_context_paths' massgen --type py

Repository: massgen/MassGen

Length of output: 1800


🏁 Script executed:

# Check build_config to see how context_paths is handled
sed -n '93,200p' massgen/__init__.py | head -100

Repository: massgen/MassGen

Length of output: 4230


🏁 Script executed:

# Continue reading build_config to see context_paths handling
sed -n '170,250p' massgen/__init__.py

Repository: massgen/MassGen

Length of output: 3213


Align the context_paths parameter type with actual implementation.

The signature declares list[dict[str, Any]], but lines 47–54 include an else clause that handles non-dict entries by converting them to strings. Update the type annotation to list[str | dict[str, Any]] to accurately reflect what the function accepts and avoid misleading type checkers and callers.

♻️ Suggested signature update
 def process_context_paths(
-    context_paths: list[dict[str, Any]],
+    context_paths: list[str | dict[str, Any]],
     mount_point: str = CONTEXT_MOUNT_PATH,
     cloud_job_id: str = "",
 ) -> list[dict[str, Any]]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cloud/utils.py` around lines 35 - 36, The function
process_context_paths currently types context_paths as list[dict[str, Any]] but
its implementation accepts and converts non-dict entries to strings; update the
parameter annotation to list[str | dict[str, Any]] (or list[Union[str, dict[str,
Any]]]) so the signature matches the behavior. Modify the definition of
process_context_paths to use the union type for context_paths and run type
checks/linters to ensure no other dependent call sites break. Ensure imports for
typing (Union) are added if your codebase uses Python <3.10.

@ncrispino ncrispino mentioned this pull request Mar 16, 2026
18 tasks