Sandbox exec() of LLM-generated code in ai.py (fixes #2771) #2777
AAtomical wants to merge 2 commits
Conversation
Both Genie.show_layer and run_ee_code passed empty/incomplete globals
dicts to exec(), which causes CPython to inject the full __builtins__
namespace. An attacker who influences LLM output (via a direct prompt or
indirect injection in dataset metadata) could execute arbitrary code,
e.g. __import__('os').system(...).
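The injection behavior is easy to reproduce outside geemap. A minimal standalone snippet (not from ai.py) showing that an "empty" globals dict passed to exec() silently receives the full builtins namespace:

```python
payload = "result = __import__('os').getcwd()"

env = {}              # looks empty, but...
exec(payload, env)    # CPython silently injects the full __builtins__

assert "__builtins__" in env   # the whole builtins namespace appeared
assert env["result"]           # the payload executed with full power
```

This is why merely omitting dangerous names from the globals dict is not enough; __builtins__ must be set explicitly to prevent the auto-injection.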
Fix: introduce a narrow _SAFE_BUILTINS whitelist (constants, type
constructors, range/enumerate/zip, len/print/sorted, common exceptions)
and pass it as __builtins__ to both exec() sinks. Excludes
__import__/eval/exec/compile/open/input/getattr/setattr/delattr and the
introspection family.
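The whitelist approach can be sketched as follows. This is an abbreviated illustration, not the actual _SAFE_BUILTINS from geemap/ai.py (which contains roughly 64 names); the helper name exec_restricted is hypothetical:

```python
import builtins

# Abbreviated whitelist: constants, type constructors, iteration
# helpers, numerics, print, and common exceptions. Dangerous names
# (__import__, eval, exec, open, getattr, ...) are simply absent.
_SAFE_BUILTINS = {
    name: getattr(builtins, name)
    for name in (
        "True", "False", "None",
        "int", "float", "str", "bool", "list", "dict", "tuple", "set",
        "range", "enumerate", "zip", "sorted",
        "abs", "min", "max", "sum", "len", "print",
        "Exception", "ValueError", "TypeError", "KeyError",
    )
}


def exec_restricted(code: str) -> dict:
    # Providing __builtins__ explicitly stops CPython from injecting
    # the full builtins module; any name outside the whitelist raises
    # NameError (and `import x` raises ImportError, since the import
    # statement looks up __import__ in builtins).
    env = {"__builtins__": _SAFE_BUILTINS}
    exec(code, env)
    return env
```

With this in place, `exec_restricted("total = sum(range(5))")` succeeds, while `exec_restricted("__import__('os')")` fails with NameError.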
show_layer: also moved 'import ee' out of the exec'd string so that
removing __import__ from builtins does not break image construction;
'ee' is passed through globals.
Known limitation: dunder traversal (obj.__class__.__base__.__subclasses__())
is still reachable. This patch raises the bar significantly but is not a
complete sandbox; full isolation needs RestrictedPython or a subprocess.
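The limitation is demonstrable with a short probe: attribute lookups never consult __builtins__, so even a completely empty builtins dict leaves the object graph reachable.

```python
probe = "classes = ().__class__.__base__.__subclasses__()"

env = {"__builtins__": {}}   # nothing whitelisted at all
exec(probe, env)

# The subclass list of `object` is fully reachable via dunder
# traversal; it typically contains types that can be abused to regain
# file or process access, which is why this patch is defense-in-depth
# rather than a complete sandbox.
assert len(env["classes"]) > 0
```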
Code Review
This pull request enhances security when executing LLM-generated Earth Engine code by introducing a whitelist of safe Python builtins. By restricting the builtins available to the exec function in show_layer and run_ee_code, the changes provide a defense-in-depth measure against remote code execution. Feedback from the review suggests using a copy of the _SAFE_BUILTINS dictionary in each exec call to prevent malicious code from poisoning the shared namespace and affecting subsequent executions.
geemap/ai.py (316)
The _SAFE_BUILTINS dictionary is shared across all exec calls. Since it is a mutable dictionary, code executed via exec could potentially modify it (e.g., by assigning to __builtins__['sum']), which would affect all subsequent calls and other users in the same process. Passing a copy of the dictionary ensures that each execution has its own isolated builtins namespace.
{"__builtins__": _SAFE_BUILTINS.copy(), "ee": ee},
geemap/ai.py (840)
Similar to the usage in show_layer, _SAFE_BUILTINS should be copied here to prevent the executed code from poisoning the shared builtins dictionary for subsequent calls.
"__builtins__": _SAFE_BUILTINS.copy(),
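The reviewer's concern can be shown directly. A standalone sketch (not geemap code) of how a shared builtins dict gets poisoned across calls, and how passing a copy isolates each execution:

```python
attack = "__builtins__['sum'] = lambda *a, **k: 'pwned'"

# Without .copy(): the exec'd code can reach the shared dict through
# the __builtins__ name in its globals and mutate it, poisoning every
# later call in the same process.
shared = {"sum": sum, "print": print}
exec(attack, {"__builtins__": shared})
assert shared["sum"]([1, 2]) == "pwned"

# With .copy(): the mutation lands on a throwaway copy; the original
# whitelist keeps the real sum.
isolated = {"sum": sum, "print": print}
exec(attack, {"__builtins__": isolated.copy()})
assert isolated["sum"]([1, 2]) == 3
```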
Summary
Fixes #2771. Both Genie.show_layer and run_ee_code in geemap/ai.py invoke exec() on LLM-generated code with an empty/incomplete globals dict, which causes CPython to auto-inject the full __builtins__ namespace, enabling arbitrary code execution via prompts such as __import__('os').system(...) or import os; os.system(...).
Changes
- _SAFE_BUILTINS whitelist (~64 names): constants, type constructors, iteration helpers (range/enumerate/zip/sorted), numerics (abs/min/max/sum), debug (print), and common exceptions. Excludes __import__, eval, exec, compile, open, input, getattr/setattr/delattr, vars/globals/locals/dir, and breakpoint.
- Genie.show_layer: pass {"__builtins__": _SAFE_BUILTINS, "ee": ee} and move import ee out of the exec'd string (so removing __import__ doesn't break image construction); rename the shadowed locals to locals_env.
- run_ee_code: add "__builtins__": _SAFE_BUILTINS to the existing globals dict.
Verification
Tested against (a) typical LLM-generated EE code patterns and (b) RCE payloads:
- Still works: ee.Image(...).select(...).filterDate(...).median(), for i in range(N): ..., print(sum(vals)), enumerate(names), int('5'), len(...), dict/list literals.
- Now blocked with NameError/ImportError: import os, __import__('os').system(...), open(...), eval(...), compile(...), getattr(...). No file was written by any payload.
Known limitation (documented inline)
Dunder traversal (obj.__class__.__base__.__subclasses__()) is still reachable because attribute access bypasses __builtins__. This PR significantly raises the bar (script-kiddie payloads are blocked) but is not a complete sandbox. Stronger isolation requires RestrictedPython or a subprocess; left for follow-up.
Test plan
- Run the existing test suite (pytest tests/) to confirm no regressions.
- Exercise Genie with the AI extra installed (pip install 'geemap[ai]') on a few real prompts to confirm common LLM-generated code still executes.
- Confirm a malicious prompt (e.g. import os) now fails cleanly instead of executing.
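The subprocess follow-up mentioned under the known limitation could look roughly like the sketch below. The helper name run_in_subprocess is hypothetical and not part of this PR; a real deployment would still need resource limits and an OS-level sandbox (seccomp, container), since a process boundary alone does not restrict filesystem or network access.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 5.0) -> str:
    """Run generated code in a fresh interpreter so a compromise cannot
    touch the host process. Returns captured stdout."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

For example, `run_in_subprocess("print(1 + 1)")` returns the child's stdout, while a crashing payload surfaces as a RuntimeError in the host instead of corrupting its state.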