Perf/reduce redundant syscalls #201
SugarFreeDoNoSo wants to merge 8 commits into mpfaffenberger:main from
Conversation
📝 Walkthrough
Refactors directory listing and file inspection to use os.scandir and stat-based checks, centralizes and caches ripgrep discovery, simplifies config directory creation with os.makedirs(..., exist_ok=True), and tightens per-operation error handling for listing, reading, and log rotation.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@code_puppy/command_line/utils.py`:
- Around line 16-17: The list comprehensions that build dirs and files from
entries call DirEntry.is_dir(follow_symlinks=True) which can raise
OSError/PermissionError and crash the listing; replace those comprehensions in
utils.py with an explicit loop over entries that calls e.is_dir(...) inside a
try/except OSError block, skipping any entry that raises and appending e.name to
either dirs or files accordingly (keep the variables dirs and files and
follow_symlinks=True usage so behavior is preserved).
In `@code_puppy/tools/file_operations.py`:
- Around line 153-166: The cached _find_rg currently stores None permanently;
change it so missing results are not cached: inside _find_rg, after failing to
locate ripgrep (before returning None) call _find_rg.cache_clear() (or implement
a small module-level cache variable instead) so subsequent calls (from
list_files/grep) can rediscover a newly installed "rg"; ensure the function name
_find_rg and callers (e.g., list_files, grep) continue to use the same API.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- code_puppy/command_line/utils.py
- code_puppy/config.py
- code_puppy/error_logging.py
- code_puppy/tools/file_operations.py
🧹 Nitpick comments (1)
code_puppy/tools/file_operations.py (1)
312-339: Optional: reduce extra stat calls in non-recursive listing.
Right now each entry may hit multiple stat-like calls (`is_dir`, `is_file`, `stat`). You can mirror the recursive path by doing a single `DirEntry.stat()` and branching on `stat.S_ISDIR`/`S_ISREG`, with a small fallback on error to preserve behavior.

♻️ Suggested refactor (single stat per entry)

```diff
-        with os.scandir(directory) as it:
-            for entry in sorted(it, key=lambda e: e.name):
-                if entry.is_dir(follow_symlinks=True):
-                    if entry.name.startswith("."):
-                        continue
-                    results.append(
-                        ListedFile(
-                            path=entry.name,
-                            type="directory",
-                            size=0,
-                            full_path=entry.path,
-                            depth=0,
-                        )
-                    )
-                elif entry.is_file(follow_symlinks=True):
-                    try:
-                        size = entry.stat().st_size
-                    except OSError:
-                        size = 0
-                    results.append(
-                        ListedFile(
-                            path=entry.name,
-                            type="file",
-                            size=size,
-                            full_path=entry.path,
-                            depth=0,
-                        )
-                    )
+        with os.scandir(directory) as it:
+            for entry in sorted(it, key=lambda e: e.name):
+                try:
+                    st = entry.stat()
+                    mode = st.st_mode
+                except OSError:
+                    # Fallback: keep previous behavior on stat failures
+                    if entry.is_dir(follow_symlinks=True):
+                        if entry.name.startswith("."):
+                            continue
+                        results.append(
+                            ListedFile(
+                                path=entry.name,
+                                type="directory",
+                                size=0,
+                                full_path=entry.path,
+                                depth=0,
+                            )
+                        )
+                    elif entry.is_file(follow_symlinks=True):
+                        results.append(
+                            ListedFile(
+                                path=entry.name,
+                                type="file",
+                                size=0,
+                                full_path=entry.path,
+                                depth=0,
+                            )
+                        )
+                    continue
+
+                if stat.S_ISDIR(mode):
+                    if entry.name.startswith("."):
+                        continue
+                    results.append(
+                        ListedFile(
+                            path=entry.name,
+                            type="directory",
+                            size=0,
+                            full_path=entry.path,
+                            depth=0,
+                        )
+                    )
+                elif stat.S_ISREG(mode):
+                    results.append(
+                        ListedFile(
+                            path=entry.name,
+                            type="file",
+                            size=st.st_size,
+                            full_path=entry.path,
+                            depth=0,
+                        )
+                    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@code_puppy/tools/file_operations.py` around lines 312 - 339, Reduce duplicate stat-like calls by calling entry.stat(follow_symlinks=True) once and branching on stat.S_ISDIR/stat.S_ISREG: inside the os.scandir(directory) loop (where ListedFile objects are created), replace the is_dir/is_file checks with a single try: st = entry.stat(follow_symlinks=True) except OSError: set size=0 and skip or treat as file per current behavior; then use stat.S_ISDIR(st.st_mode) to detect directories (preserve the entry.name.startswith(".") skip) and stat.S_ISREG for files (use st.st_size for size). Ensure you still populate ListedFile(path=entry.name, type=..., size=..., full_path=entry.path, depth=0) and preserve follow_symlinks semantics and the OSError fallback.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 312-339: Reduce duplicate stat-like calls by calling
entry.stat(follow_symlinks=True) once and branching on
stat.S_ISDIR/stat.S_ISREG: inside the os.scandir(directory) loop (where
ListedFile objects are created), replace the is_dir/is_file checks with a single
try: st = entry.stat(follow_symlinks=True) except OSError: set size=0 and skip
or treat as file per current behavior; then use stat.S_ISDIR(st.st_mode) to
detect directories (preserve the entry.name.startswith(".") skip) and
stat.S_ISREG for files (use st.st_size for size). Ensure you still populate
ListedFile(path=entry.name, type=..., size=..., full_path=entry.path, depth=0)
and preserve follow_symlinks semantics and the OSError fallback.
Are you working on code bases with like 50 million lines or something? Please explain slowness... You sure it wasn't just the LLM getting rate limited and doing exponential backoff?
Not critical, just a micro-optimization. With multiple agents running in parallel on the same filesystem, contention was causing the same ops to slow down dramatically, so I tidied it up.
It was taking longer than it had any right to, so I took a look under the hood. This trims redundant filesystem syscalls (scandir/stat batching), caches ripgrep lookup, and removes a few avoidable checks.
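The scandir/stat batching described above comes down to replacing a per-name `os.stat` (or `is_dir`/`is_file`/`stat` triple) with a single `DirEntry.stat()` per entry; a minimal sketch, with an assumed helper name `list_file_sizes`:

```python
import os
import stat


def list_file_sizes(directory):
    """One scandir pass over a directory, returning {name: size} for
    regular files. DirEntry.stat() can reuse metadata the OS already
    returned from the directory scan, avoiding a second syscall per
    name on platforms that support it."""
    sizes = {}
    with os.scandir(directory) as it:
        for entry in it:
            try:
                st = entry.stat(follow_symlinks=True)
            except OSError:
                continue  # skip entries we cannot stat
            if stat.S_ISREG(st.st_mode):
                sizes[entry.name] = st.st_size
    return sizes
```

The win is largest on directories with many entries, where the naive approach issues one extra stat syscall per name.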