fix: ripgrep fallback for list_files #250

Open
RyanLHicks wants to merge 7 commits into mpfaffenberger:main from RyanLHicks:fix/list-files-ripgrep-fallback

Conversation

Contributor

@RyanLHicks RyanLHicks commented Mar 22, 2026

Closes #36

Ripgrep fallback for list_files
When ripgrep is not installed, _list_files returned a hard error, which blocked all file operations. It now falls back gracefully to a non-recursive os.listdir listing and emits a warning instead of failing.
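
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `list_with_fallback`, the `find_rg` parameter, and the warning wording are all assumptions made for the example, and the recursive branch simply shells out to `rg --files`.

```python
import os
import shutil
import subprocess
from typing import Callable, List, Optional, Tuple


def list_with_fallback(
    directory: str,
    recursive: bool = True,
    find_rg: Callable[[str], Optional[str]] = shutil.which,  # hypothetical hook, eases testing
) -> Tuple[List[str], Optional[str]]:
    """Return (entries, warning). Hypothetical helper, not the project's _list_files."""
    warning = None
    rg_path = find_rg("rg") if recursive else None

    if recursive and rg_path is None:
        # Degrade instead of failing: warn and continue non-recursively.
        warning = "ripgrep (rg) not found; falling back to non-recursive listing"
        recursive = False

    if not recursive:
        return sorted(os.listdir(directory)), warning

    # Recursive path: `rg --files` prints one file path per line.
    proc = subprocess.run(
        [rg_path, "--files", directory],
        capture_output=True,
        text=True,
        check=False,
    )
    return sorted(proc.stdout.splitlines()), warning
```

Passing `find_rg=lambda name: None` simulates a machine without ripgrep, so the fallback branch can be exercised without uninstalling anything.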

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • File listing now gracefully falls back to non-recursive mode when ripgrep is unavailable, instead of failing.
    • Token estimation for file operations now uses improved heuristics for more accurate counting.
  • Tests

    • Added regression tests for file listing fallback behavior.
    • Updated existing tests to reflect improved error handling.

coderabbitai bot commented Mar 22, 2026

Walkthrough

This PR addresses a bug where file listing fails when ripgrep is not installed by implementing a graceful fallback mechanism. When ripgrep is unavailable in recursive mode, the function now logs a warning and continues with non-recursive listing instead of failing. Token estimation logic was also updated to align with existing heuristics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Ripgrep Fallback Logic: `code_puppy/tools/file_operations.py` | Modified _list_files to append a warning and fall back to non-recursive mode when ripgrep is unavailable instead of returning an error. Updated token estimation in _read_file from len(content)//4 to max(1, floor(len(content)/2.5)). |
| Fallback Behavior Tests: `tests/tools/test_list_files_ripgrep_fallback.py` | New regression test module verifying file listing works when ripgrep is unavailable by patching shutil.which, testing both recursive and non-recursive modes, and asserting expected files appear in output with fallback warnings. |
| Updated Test Assertions: `tests/tools/test_file_operations_coverage.py` | Updated test_list_files_ripgrep_not_found_recursive to reflect new fallback behavior, asserting that content is present and errors either don't exist or contain "falling back" warnings. |
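
To make the token estimation change concrete, the two formulas quoted in the summary can be compared side by side. The function names below are hypothetical and exist only for this comparison; the expressions themselves are the ones named in the review.

```python
import math


def estimate_tokens_old(content: str) -> int:
    """Previous heuristic: roughly 4 characters per token."""
    return len(content) // 4


def estimate_tokens_new(content: str) -> int:
    """Updated heuristic: roughly 2.5 characters per token, never below 1."""
    return max(1, math.floor(len(content) / 2.5))


sample = "a" * 1000
print(estimate_tokens_old(sample))  # 250
print(estimate_tokens_new(sample))  # 400
print(estimate_tokens_old(""))      # 0
print(estimate_tokens_new(""))      # 1
```

Note that the new formula needs `import math`, and that the `max(1, ...)` floor means even an empty file is reported as one token.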

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A ripgrep went missing, oh what dismay!
But our puppy persisted, found a better way,
Fallback to fallback, a graceful retreat,
No files left behind—the workaround's sweet! 📋✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Out of Scope Changes check | ❓ Inconclusive | Token estimation changes in _read_file are somewhat tangential to the ripgrep fallback objective, though they align with BaseAgent heuristics. | Clarify whether token estimation alignment belongs in this PR or should be addressed separately, as it extends beyond the core ripgrep fallback fix. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: implementing a ripgrep fallback mechanism for the list_files function. |
| Linked Issues check | ✅ Passed | The PR addresses issue #36 by implementing graceful fallback when ripgrep is unavailable, allowing file listing to work without requiring ripgrep installation. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/agents/test_token_estimation_consistency.py (1)

16-27: Test doesn't actually verify _read_file uses the same heuristic.

The docstring claims to ensure consistency between file_operations._read_file and BaseAgent.estimate_token_count, but this test only validates the agent's method against the expected formula. It doesn't import or test _read_file directly, so if _read_file diverges, this test won't catch it.

Consider adding a test that reads a file via _read_file and compares ReadFileOutput.num_tokens against agent.estimate_token_count(content).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agents/test_token_estimation_consistency.py` around lines 16 - 27,
Update the test to actually exercise file_operations._read_file and compare its
token count to BaseAgent.estimate_token_count: create a temporary file
containing the same 1000-character content (use tmp_path or tempfile), call
file_operations._read_file(...) to obtain the ReadFileOutput and compare
ReadFileOutput.num_tokens against agent.estimate_token_count(content) (with
agent = CodePuppyAgent()), ensuring you import file_operations._read_file and
ReadFileOutput (or the module that exposes them) so the test verifies both
implementations use the same 2.5 chars/token heuristic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 517-518: The inline comment above the token estimation (the block
that defines num_tokens) is stale: it says "~4 characters per token" but the
code uses math.floor(len(content) / 2.5). Update the comment to match the actual
heuristic (e.g., "~2.5 characters per token" or describe the formula explicitly)
next to the num_tokens calculation in file_operations.py so the comment and the
expression (num_tokens = max(1, math.floor(len(content) / 2.5))) are consistent.
- Line 518: The code uses math.floor in the calculation of num_tokens inside the
_read_file logic (num_tokens = max(1, math.floor(len(content) / 2.5))) but the
math module is not imported; add "import math" alongside the other top-of-file
imports in code_puppy/tools/file_operations.py so math.floor resolves and
prevents a NameError when _read_file is called.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0ec8603-13ed-447f-88ac-4083774a36d3

📥 Commits

Reviewing files that changed from the base of the PR and between 19e3122 and 78ce419.

📒 Files selected for processing (3)
  • code_puppy/tools/file_operations.py
  • tests/agents/test_token_estimation_consistency.py
  • tests/tools/test_list_files_ripgrep_fallback.py

Comment thread code_puppy/tools/file_operations.py Outdated
Comment thread code_puppy/tools/file_operations.py
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 197-202: The code currently appends a warning to output_lines and
sets recursive = False when ripgrep is missing but then proceeds to return a
successful ListFileOutput; to satisfy the existing test
test_list_files_ripgrep_not_found_recursive, change the behavior in the
ripgrep-missing branch (the block that manipulates output_lines and sets
recursive) to return immediately with an error: construct and return a
ListFileOutput where error is a descriptive exception/message (and keep the
warning in content if desired) instead of flipping recursive and continuing;
update the code that previously used recursive to assume an early return when
ripgrep is unavailable so the function returns error for recursive requests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b45165c-6fbc-4d2c-a7cb-29f89b37efcd

📥 Commits

Reviewing files that changed from the base of the PR and between 78ce419 and b4a9938.

📒 Files selected for processing (1)
  • code_puppy/tools/file_operations.py

Comment thread code_puppy/tools/file_operations.py
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/tools/test_file_operations_coverage.py (1)

259-274: ⚠️ Potential issue | 🟠 Major

Strengthen fallback assertions to avoid false-positive passes

This check is too permissive: on Line 273, result.error is None or ... passes whenever error is None, even if the fallback warning is missing. That weakens this regression test versus the implemented behavior in code_puppy/tools/file_operations.py (Line 196-202, Line 457-461).

Suggested test tightening
     def test_list_files_ripgrep_not_found_recursive(self, tmp_path):
         """Test list_files error when ripgrep not found for recursive listing."""
+        (tmp_path / "visible.txt").write_text("content")
         # Mock both shutil.which and os.path.exists to ensure rg is not found
         with (
             patch("shutil.which", return_value=None),
             patch(
                 "os.path.exists",
                 side_effect=lambda p: not (p.endswith("rg") or p.endswith("rg.exe")),
             ),
         ):
             result = _list_files(None, str(tmp_path), recursive=True)
 
         # Fallback behavior: warning in content, no hard error, files still listed
-        assert result.content is not None
-        assert result.error is None or "falling back" in (result.content or "").lower()
+        assert result.error is None
+        assert result.content is not None
+        assert "falling back to non-recursive listing" in result.content.lower()
+        assert "visible.txt" in result.content
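
The false-positive concern can be reproduced in isolation. The sketch below uses a hypothetical `FakeResult` stand-in with only the `content` and `error` fields referenced in the test; it is not the project's ListFileOutput, and the two check functions merely restate the old and suggested assertions as booleans.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing only the fields the test inspects."""
    content: Optional[str] = None
    error: Optional[str] = None


def permissive_check(result: FakeResult) -> bool:
    # Mirrors the original assertions: passes whenever error is None.
    return result.content is not None and (
        result.error is None or "falling back" in (result.content or "").lower()
    )


def strict_check(result: FakeResult, expected_file: str) -> bool:
    # Mirrors the suggested assertions: warning and file must both be present.
    return (
        result.error is None
        and result.content is not None
        and "falling back to non-recursive listing" in result.content.lower()
        and expected_file in result.content
    )


# A result that lists the file but silently omits the fallback warning.
silent = FakeResult(content="visible.txt")
print(permissive_check(silent))             # True
print(strict_check(silent, "visible.txt"))  # False
```

A regression that dropped the warning would still pass the permissive form, which is the gap the tightened assertions are meant to close.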
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tools/test_file_operations_coverage.py` around lines 259 - 274, The
test test_list_files_ripgrep_not_found_recursive is too permissive because it
allows result.error to be None without asserting the expected fallback warning;
update the assertions for _list_files so that result.content is not None and
then explicitly assert that either result.error is not None (indicating an error
path) or, when result.error is None, result.content contains the fallback
warning text (e.g., "falling back" or the exact message emitted by
file_operations._list_files); ensure you reference result.content and
result.error in the conditional assertion so the test fails if neither an error
nor the fallback message is present.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 034dee71-ad63-4a22-a013-7ab47bcb1c99

📥 Commits

Reviewing files that changed from the base of the PR and between b4a9938 and c2a6663.

📒 Files selected for processing (1)
  • tests/tools/test_file_operations_coverage.py

@RyanLHicks RyanLHicks changed the title from "fix: fall back to non-recursive listing when ripgrep is not installed" to "fix: ripgrep fallback for list_files and standardize token estimation heuristic" on Mar 22, 2026
@mpfaffenberger
Owner

Noting that we should have already merged the token estimation changes - thanks for that contribution!

Can you explain the conditions under which rg doesn't get installed? rg is a hard dependency; it's specified in our pyproject.toml.

@RyanLHicks RyanLHicks changed the title from "fix: ripgrep fallback for list_files and standardize token estimation heuristic" to "fix: ripgrep fallback for list_files" on Apr 16, 2026
@RyanLHicks
Contributor Author

No problem, glad to help! Yes, that's my bad, I forgot to update the title of this PR, but everything is updated now for the ripgrep change.

This addresses Issue #36. The specific scenario: a user running pyenv with a named virtualenv, where rg is installed inside the env but the pyenv shims can't resolve it via shutil.which. The venv fallback could also miss it, depending on how pyenv sets sys.executable. The user had to install ripgrep separately to unblock themselves. With this change, file listing degrades to non-recursive instead of returning a hard error in that situation.
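
For readers unfamiliar with the lookup being discussed, a two-step resolution might look like the sketch below. It is an assumption-laden illustration: the function name `find_rg` and the exact candidate paths are hypothetical and may not match how code_puppy actually locates the binary.

```python
import os
import shutil
import sys
from typing import Optional


def find_rg(executable: str = sys.executable) -> Optional[str]:
    """Hypothetical lookup: PATH first, then the directory holding the interpreter."""
    found = shutil.which("rg")
    if found:
        return found

    # Venv-style fallback: console scripts usually sit next to the interpreter.
    # If sys.executable points at a shim rather than the real venv binary,
    # this directory is the wrong one and the lookup misses.
    bin_dir = os.path.dirname(os.path.abspath(executable))
    for name in ("rg", "rg.exe"):
        candidate = os.path.join(bin_dir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Both steps depend on the environment: the first on PATH, the second on where `sys.executable` points. When neither finds the binary, the caller only learns that rg is missing, which is the case the fallback in this PR is meant to soften.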

If it's a one-off issue, it's probably fine, but it's something I noticed from someone else's report. If it's not a serious issue in your eyes, I'll close this PR.

Development

Successfully merging this pull request may close these issues.

[BUG] File listing doesn't work when ripgrep is not installed

2 participants