fix: ripgrep fallback for list_files by RyanLHicks · Pull Request #250 · mpfaffenberger/code_puppy

RyanLHicks · 2026-03-22T21:27:23Z

Closes #36

Ripgrep fallback for list_files
When ripgrep is not installed, _list_files returned a hard error blocking all file operations. Changed to a graceful fallback to non-recursive os.listdir with a warning instead of failing.

Summary by CodeRabbit

Release Notes

Bug Fixes
- File listing now gracefully falls back to non-recursive mode when ripgrep is unavailable, instead of failing.
- Token estimation for file operations now uses improved heuristics for more accurate counting.
Tests
- Added regression tests for file listing fallback behavior.
- Updated existing tests to reflect improved error handling.

coderabbitai · 2026-03-22T21:27:35Z

📝 Walkthrough

Walkthrough

This PR addresses a bug where file listing fails when ripgrep is not installed by implementing a graceful fallback mechanism. When ripgrep is unavailable in recursive mode, the function now logs a warning and continues with non-recursive listing instead of failing. Token estimation logic was also updated to align with existing heuristics.

Changes

Cohort / File(s)	Summary
Ripgrep Fallback Logic `code_puppy/tools/file_operations.py`	Modified `_list_files` to append a warning and fall back to non-recursive mode when ripgrep is unavailable instead of returning an error. Updated token estimation in `_read_file` from `len(content)//4` to `max(1, floor(len(content)/2.5))`.
Fallback Behavior Tests `tests/tools/test_list_files_ripgrep_fallback.py`	New regression test module verifying file listing works when ripgrep is unavailable by patching `shutil.which`, testing both recursive and non-recursive modes, and asserting expected files appear in output with fallback warnings.
Updated Test Assertions `tests/tools/test_file_operations_coverage.py`	Updated `test_list_files_ripgrep_not_found_recursive` to reflect new fallback behavior, asserting that content is present and errors either don't exist or contain "falling back" warnings.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A ripgrep went missing, oh what dismay!
But our puppy persisted, found a better way,
Fallback to fallback, a graceful retreat,
No files left behind—the workaround's sweet! 📋✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Out of Scope Changes check	❓ Inconclusive	Token estimation changes in _read_file are somewhat tangential to the ripgrep fallback objective, though they align with BaseAgent heuristics.	Clarify whether token estimation alignment belongs in this PR or should be addressed separately, as it extends beyond the core ripgrep fallback fix.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: implementing a ripgrep fallback mechanism for the list_files function.
Linked Issues check	✅ Passed	The PR addresses issue `#36` by implementing graceful fallback when ripgrep is unavailable, allowing file listing to work without requiring ripgrep installation.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/agents/test_token_estimation_consistency.py (1)
16-27: Test doesn't actually verify _read_file uses the same heuristic.

The docstring claims to ensure consistency between file_operations._read_file and BaseAgent.estimate_token_count, but this test only validates the agent's method against the expected formula. It doesn't import or test _read_file directly, so if _read_file diverges, this test won't catch it.

Consider adding a test that reads a file via _read_file and compares ReadFileOutput.num_tokens against agent.estimate_token_count(content).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agents/test_token_estimation_consistency.py` around lines 16 - 27,
Update the test to actually exercise file_operations._read_file and compare its
token count to BaseAgent.estimate_token_count: create a temporary file
containing the same 1000-character content (use tmp_path or tempfile), call
file_operations._read_file(...) to obtain the ReadFileOutput and compare
ReadFileOutput.num_tokens against agent.estimate_token_count(content) (with
agent = CodePuppyAgent()), ensuring you import file_operations._read_file and
ReadFileOutput (or the module that exposes them) so the test verifies both
implementations use the same 2.5 chars/token heuristic.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 517-518: The inline comment above the token estimation (the block
that defines num_tokens) is stale: it says "~4 characters per token" but the
code uses math.floor(len(content) / 2.5). Update the comment to match the actual
heuristic (e.g., "~2.5 characters per token" or describe the formula explicitly)
next to the num_tokens calculation in file_operations.py so the comment and the
expression (num_tokens = max(1, math.floor(len(content) / 2.5))) are consistent.
- Line 518: The code uses math.floor in the calculation of num_tokens inside the
_read_file logic (num_tokens = max(1, math.floor(len(content) / 2.5))) but the
math module is not imported; add "import math" alongside the other top-of-file
imports in code_puppy/tools/file_operations.py so math.floor resolves and
prevents a NameError when _read_file is called.

---

Nitpick comments:
In `@tests/agents/test_token_estimation_consistency.py`:
- Around line 16-27: Update the test to actually exercise
file_operations._read_file and compare its token count to
BaseAgent.estimate_token_count: create a temporary file containing the same
1000-character content (use tmp_path or tempfile), call
file_operations._read_file(...) to obtain the ReadFileOutput and compare
ReadFileOutput.num_tokens against agent.estimate_token_count(content) (with
agent = CodePuppyAgent()), ensuring you import file_operations._read_file and
ReadFileOutput (or the module that exposes them) so the test verifies both
implementations use the same 2.5 chars/token heuristic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0ec8603-13ed-447f-88ac-4083774a36d3

📥 Commits

Reviewing files that changed from the base of the PR and between 19e3122 and 78ce419.

📒 Files selected for processing (3)

code_puppy/tools/file_operations.py
tests/agents/test_token_estimation_consistency.py
tests/tools/test_list_files_ripgrep_fallback.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 197-202: The code currently appends a warning to output_lines and
sets recursive = False when ripgrep is missing but then proceeds to return a
successful ListFileOutput; to satisfy the existing test
test_list_files_ripgrep_not_found_recursive, change the behavior in the
ripgrep-missing branch (the block that manipulates output_lines and sets
recursive) to return immediately with an error: construct and return a
ListFileOutput where error is a descriptive exception/message (and keep the
warning in content if desired) instead of flipping recursive and continuing;
update the code that previously used recursive to assume an early return when
ripgrep is unavailable so the function returns error for recursive requests.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b45165c-6fbc-4d2c-a7cb-29f89b37efcd

📥 Commits

Reviewing files that changed from the base of the PR and between 78ce419 and b4a9938.

📒 Files selected for processing (1)

code_puppy/tools/file_operations.py

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/tools/test_file_operations_coverage.py (1)

259-274: ⚠️ Potential issue | 🟠 Major

Strengthen fallback assertions to avoid false-positive passes

This check is too permissive: on Line 273, result.error is None or ... passes whenever error is None, even if the fallback warning is missing. That weakens this regression test versus the implemented behavior in code_puppy/tools/file_operations.py (Line 196-202, Line 457-461).

Suggested test tightening

     def test_list_files_ripgrep_not_found_recursive(self, tmp_path):
         """Test list_files error when ripgrep not found for recursive listing."""
+        (tmp_path / "visible.txt").write_text("content")
         # Mock both shutil.which and os.path.exists to ensure rg is not found
         with (
             patch("shutil.which", return_value=None),
             patch(
                 "os.path.exists",
                 side_effect=lambda p: not (p.endswith("rg") or p.endswith("rg.exe")),
             ),
         ):
             result = _list_files(None, str(tmp_path), recursive=True)
 
         # Fallback behavior: warning in content, no hard error, files still listed
-        assert result.content is not None
-        assert result.error is None or "falling back" in (result.content or "").lower()
+        assert result.error is None
+        assert result.content is not None
+        assert "falling back to non-recursive listing" in result.content.lower()
+        assert "visible.txt" in result.content

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/tools/test_file_operations_coverage.py` around lines 259 - 274, The
test test_list_files_ripgrep_not_found_recursive is too permissive because it
allows result.error to be None without asserting the expected fallback warning;
update the assertions for _list_files so that result.content is not None and
then explicitly assert that either result.error is not None (indicating an error
path) or, when result.error is None, result.content contains the fallback
warning text (e.g., "falling back" or the exact message emitted by
file_operations._list_files); ensure you reference result.content and
result.error in the conditional assertion so the test fails if neither an error
nor the fallback message is present.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@tests/tools/test_file_operations_coverage.py`:
- Around line 259-274: The test test_list_files_ripgrep_not_found_recursive is
too permissive because it allows result.error to be None without asserting the
expected fallback warning; update the assertions for _list_files so that
result.content is not None and then explicitly assert that either result.error
is not None (indicating an error path) or, when result.error is None,
result.content contains the fallback warning text (e.g., "falling back" or the
exact message emitted by file_operations._list_files); ensure you reference
result.content and result.error in the conditional assertion so the test fails
if neither an error nor the fallback message is present.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 034dee71-ad63-4a22-a013-7ab47bcb1c99

📥 Commits

Reviewing files that changed from the base of the PR and between b4a9938 and c2a6663.

📒 Files selected for processing (1)

tests/tools/test_file_operations_coverage.py

mpfaffenberger · 2026-04-15T14:02:09Z

Noting that we should have already merged the token estimation changes - thanks for that contribution!

Can you explain what are the conditions where rg doesn't get installed? rg is a hard dependency, it's specified in our pyproject.toml.

RyanLHicks · 2026-04-16T00:13:42Z

No problem, glad to help! Yes that's my bad I forgot update the title from this PR, but everything is updated now for the ripgrep.

This addresses Issue #36. The specific scenario: a user running pyenv with a named virtualenv where rg is installed inside the env but pyenv shims can't resolve it via shutil. which, the venv fallback could also miss it depending on how pyenv sets sys.executable. The user had to install ripgrep separately to unblock themselves. This change means file listing degrades to non-recursive instead of returning a hard error in that situation.

If it's one off issue it's probably fine, but something I just noticed from someone else. If it's not a serious issue in your eyes, I'll close this PR.

RyanLHicks added 2 commits March 22, 2026 16:01

fix: standardize token estimation heuristic and add consistency tests

cfeba28

fix: fall back to non-recursive listing when ripgrep is not installed

78ce419

RyanLHicks mentioned this pull request Mar 22, 2026

fix: standardize token estimation heuristic between file_operations and base_agent #249

Closed

coderabbitai bot reviewed Mar 22, 2026

View reviewed changes

Comment thread code_puppy/tools/file_operations.py Outdated

Comment thread code_puppy/tools/file_operations.py

fix: add missing math import and update stale token estimation comment

b4a9938

coderabbitai bot reviewed Mar 22, 2026

View reviewed changes

Comment thread code_puppy/tools/file_operations.py

fix: update test and stale comment to reflect ripgrep fallback behavior

c2a6663

coderabbitai bot reviewed Mar 22, 2026

View reviewed changes

RyanLHicks changed the title ~~fix: fall back to non-recursive listing when ripgrep is not installed~~ fix: ripgrep fallback for list_files and standardize token estimation heuristic Mar 22, 2026

Delete tests/agents/test_token_estimation_consistency.py

d32b274

RyanLHicks changed the title ~~fix: ripgrep fallback for list_files and standardize token estimation heuristic~~ fix: ripgrep fallback for list_files Apr 16, 2026

RyanLHicks added 2 commits April 15, 2026 19:16

style: apply ruff formatting

b76f6dc

fix: update tests for 2.5 chars/token heuristic

1b46d6d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: ripgrep fallback for list_files#250

fix: ripgrep fallback for list_files#250
RyanLHicks wants to merge 7 commits intompfaffenberger:mainfrom
RyanLHicks:fix/list-files-ripgrep-fallback

RyanLHicks commented Mar 22, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Mar 22, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

mpfaffenberger commented Apr 15, 2026

Uh oh!

RyanLHicks commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

RyanLHicks commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

mpfaffenberger commented Apr 15, 2026

Uh oh!

RyanLHicks commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

RyanLHicks commented Mar 22, 2026 •

edited

Loading

coderabbitai bot commented Mar 22, 2026 •

edited

Loading