fix: ripgrep fallback for list_files#250
fix: ripgrep fallback for list_files#250RyanLHicks wants to merge 7 commits intompfaffenberger:mainfrom
Conversation
📝 WalkthroughWalkthroughThis PR addresses a bug where file listing fails when ripgrep is not installed by implementing a graceful fallback mechanism. When ripgrep is unavailable in recursive mode, the function now logs a warning and continues with non-recursive listing instead of failing. Token estimation logic was also updated to align with existing heuristics. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/agents/test_token_estimation_consistency.py (1)
16-27: Test doesn't actually verify_read_fileuses the same heuristic.The docstring claims to ensure consistency between
file_operations._read_fileandBaseAgent.estimate_token_count, but this test only validates the agent's method against the expected formula. It doesn't import or test_read_filedirectly, so if_read_filediverges, this test won't catch it.Consider adding a test that reads a file via
_read_fileand comparesReadFileOutput.num_tokensagainstagent.estimate_token_count(content).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/agents/test_token_estimation_consistency.py` around lines 16 - 27, Update the test to actually exercise file_operations._read_file and compare its token count to BaseAgent.estimate_token_count: create a temporary file containing the same 1000-character content (use tmp_path or tempfile), call file_operations._read_file(...) to obtain the ReadFileOutput and compare ReadFileOutput.num_tokens against agent.estimate_token_count(content) (with agent = CodePuppyAgent()), ensuring you import file_operations._read_file and ReadFileOutput (or the module that exposes them) so the test verifies both implementations use the same 2.5 chars/token heuristic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 517-518: The inline comment above the token estimation (the block
that defines num_tokens) is stale: it says "~4 characters per token" but the
code uses math.floor(len(content) / 2.5). Update the comment to match the actual
heuristic (e.g., "~2.5 characters per token" or describe the formula explicitly)
next to the num_tokens calculation in file_operations.py so the comment and the
expression (num_tokens = max(1, math.floor(len(content) / 2.5))) are consistent.
- Line 518: The code uses math.floor in the calculation of num_tokens inside the
_read_file logic (num_tokens = max(1, math.floor(len(content) / 2.5))) but the
math module is not imported; add "import math" alongside the other top-of-file
imports in code_puppy/tools/file_operations.py so math.floor resolves and
prevents a NameError when _read_file is called.
---
Nitpick comments:
In `@tests/agents/test_token_estimation_consistency.py`:
- Around line 16-27: Update the test to actually exercise
file_operations._read_file and compare its token count to
BaseAgent.estimate_token_count: create a temporary file containing the same
1000-character content (use tmp_path or tempfile), call
file_operations._read_file(...) to obtain the ReadFileOutput and compare
ReadFileOutput.num_tokens against agent.estimate_token_count(content) (with
agent = CodePuppyAgent()), ensuring you import file_operations._read_file and
ReadFileOutput (or the module that exposes them) so the test verifies both
implementations use the same 2.5 chars/token heuristic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d0ec8603-13ed-447f-88ac-4083774a36d3
📒 Files selected for processing (3)
code_puppy/tools/file_operations.pytests/agents/test_token_estimation_consistency.pytests/tools/test_list_files_ripgrep_fallback.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 197-202: The code currently appends a warning to output_lines and
sets recursive = False when ripgrep is missing but then proceeds to return a
successful ListFileOutput; to satisfy the existing test
test_list_files_ripgrep_not_found_recursive, change the behavior in the
ripgrep-missing branch (the block that manipulates output_lines and sets
recursive) to return immediately with an error: construct and return a
ListFileOutput where error is a descriptive exception/message (and keep the
warning in content if desired) instead of flipping recursive and continuing;
update the code that previously used recursive to assume an early return when
ripgrep is unavailable so the function returns error for recursive requests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3b45165c-6fbc-4d2c-a7cb-29f89b37efcd
📒 Files selected for processing (1)
code_puppy/tools/file_operations.py
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/tools/test_file_operations_coverage.py (1)
259-274:⚠️ Potential issue | 🟠 MajorStrengthen fallback assertions to avoid false-positive passes
This check is too permissive: on Line 273,
result.error is None or ...passes whenevererrorisNone, even if the fallback warning is missing. That weakens this regression test versus the implemented behavior incode_puppy/tools/file_operations.py(Line 196-202, Line 457-461).Suggested test tightening
def test_list_files_ripgrep_not_found_recursive(self, tmp_path): """Test list_files error when ripgrep not found for recursive listing.""" + (tmp_path / "visible.txt").write_text("content") # Mock both shutil.which and os.path.exists to ensure rg is not found with ( patch("shutil.which", return_value=None), patch( "os.path.exists", side_effect=lambda p: not (p.endswith("rg") or p.endswith("rg.exe")), ), ): result = _list_files(None, str(tmp_path), recursive=True) # Fallback behavior: warning in content, no hard error, files still listed - assert result.content is not None - assert result.error is None or "falling back" in (result.content or "").lower() + assert result.error is None + assert result.content is not None + assert "falling back to non-recursive listing" in result.content.lower() + assert "visible.txt" in result.content🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/tools/test_file_operations_coverage.py` around lines 259 - 274, The test test_list_files_ripgrep_not_found_recursive is too permissive because it allows result.error to be None without asserting the expected fallback warning; update the assertions for _list_files so that result.content is not None and then explicitly assert that either result.error is not None (indicating an error path) or, when result.error is None, result.content contains the fallback warning text (e.g., "falling back" or the exact message emitted by file_operations._list_files); ensure you reference result.content and result.error in the conditional assertion so the test fails if neither an error nor the fallback message is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@tests/tools/test_file_operations_coverage.py`:
- Around line 259-274: The test test_list_files_ripgrep_not_found_recursive is
too permissive because it allows result.error to be None without asserting the
expected fallback warning; update the assertions for _list_files so that
result.content is not None and then explicitly assert that either result.error
is not None (indicating an error path) or, when result.error is None,
result.content contains the fallback warning text (e.g., "falling back" or the
exact message emitted by file_operations._list_files); ensure you reference
result.content and result.error in the conditional assertion so the test fails
if neither an error nor the fallback message is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 034dee71-ad63-4a22-a013-7ab47bcb1c99
📒 Files selected for processing (1)
tests/tools/test_file_operations_coverage.py
|
Noting that we should have already merged the token estimation changes - thanks for that contribution! Can you explain what are the conditions where |
|
No problem, glad to help! Yes that's my bad I forgot update the title from this PR, but everything is updated now for the ripgrep. This addresses Issue #36. The specific scenario: a user running pyenv with a named virtualenv where rg is installed inside the env but pyenv shims can't resolve it via shutil. which, the venv fallback could also miss it depending on how pyenv sets sys.executable. The user had to install ripgrep separately to unblock themselves. This change means file listing degrades to non-recursive instead of returning a hard error in that situation. If it's one off issue it's probably fine, but something I just noticed from someone else. If it's not a serious issue in your eyes, I'll close this PR. |
Closes #36
Ripgrep fallback for list_files
When ripgrep is not installed,
_list_filesreturned a hard error blocking all file operations. Changed to a graceful fallback to non-recursiveos.listdirwith a warning instead of failing.Summary by CodeRabbit
Release Notes
Bug Fixes
Tests