Skip to content

Conversation

@0xCUB3
Copy link

@0xCUB3 0xCUB3 commented Dec 1, 2025

Adds PythonImportRepair requirement for detecting and suggesting fixes for missing/incorrect imports in LLM-generated Python code.

Detection modes:

  • Static analysis (default): undefined name detection + module availability checks
  • Execution-based (when allow_unsafe_execution=True): runs code and parses runtime errors. Matches PythonExecutionReq interface (i.e. when code execution is enabled, import validation uses execution too)

Features:

  • Handles ModuleNotFoundError, ImportError, NameError, AttributeError. Any I'm missing?
  • 155 common aliases (np, pd, plt, Path, Optional, etc.)
  • Module relocations for sklearn, tensorflow, torch, scipy
  • Fuzzy matching for misspelled modules via rapidfuzz
  • Works with RepairTemplateStrategy via ValidationResult.reason

Usage:

from mellea_contribs.reqlib import PythonImportRepair
from mellea.stdlib.sampling import RepairTemplateStrategy

result = session.instruct(
    "Write code using pandas",
    requirements=[PythonImportRepair()],
    strategy=RepairTemplateStrategy(loop_budget=3)
)

Files:

  • mellea_contribs/reqlib/common_aliases.py - Alias database
  • mellea_contribs/reqlib/import_resolution.py - Error parsing and resolution
  • mellea_contribs/reqlib/import_repair.py - PythonImportRepair class
  • test/test_import_repair.py - new tests

Dependencies:
rapidfuzz - Fast fuzzy matching for misspelled modules. Chosen over fuzzywuzzy (10x faster, no GPL licensing issues via python-Levenshtein, though way better name :) ) and difflib (faster, better ranking algorithm).

My tests

  • 51 new unit tests passing. I tried to put in as many edge cases as possible.
  • Ruff lint passing
  • Mypy type check passing
  • Tested with mellea + Ollama end-to-end on

Add common_aliases.py with ~100+ common Python import aliases
organized by category (data science, ML, stdlib, web, testing, etc.)
and module relocation patterns for sklearn, tensorflow, torch, scipy.
Add import_resolution.py with:
- Error parsing for ModuleNotFoundError, ImportError, NameError, AttributeError
- AST-based undefined name detection for static analysis
- Resolution functions with fuzzy matching support
- Integration with common_aliases database
Add import_repair.py with:
- PythonImportRepair Requirement class
- Static analysis mode (default) using AST
- Execution-based mode with subprocess or sandbox
- Formatted repair feedback for RepairTemplateStrategy
Test coverage for:
- Error parsing (ModuleNotFoundError, ImportError, NameError, AttributeError)
- AST-based undefined name detection
- Import resolution and suggestions
- Common aliases database
- Module availability checking
Change from absolute to relative import for statute_data module.
Add local implementation of extract_python_code to avoid dependency
on mellea version that may not export it.
…code

Add tests for:
- extract_python_code with various markdown formats
- PythonImportRepair with mocked mellea Context
- Validation of valid code, missing imports, syntax errors, unavailable modules
- Add lambda parameter detection to AST undefined name finder
- Support capital P in Python code block tags
- Add tests for lambda params and capital Python tags
- Handle Python 3.10+ match statement case pattern bindings
- Support ```py and ```python3 code block tags
- Clean up unnecessary comments in AST walker
- Add tests for new functionality
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove if not necessary. I know mellea proper has this but it doesn't exist here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant