Perf/reduce redundant syscalls #201

Open
SugarFreeDoNoSo wants to merge 8 commits into mpfaffenberger:main from SugarFreeDoNoSo:perf/reduce-redundant-syscalls

Conversation

SugarFreeDoNoSo commented Feb 24, 2026

It was taking longer than it had any right to, so I took a look under the hood. This trims redundant filesystem syscalls (scandir/stat batching), caches the ripgrep lookup, and removes a few avoidable existence checks.

Summary by CodeRabbit

  • Refactor
    • Faster, more efficient directory and file scanning and deduplication for snappier browsing and searches.
    • Centralized and cached external search-tool lookup to speed repeated search operations.
  • Bug Fixes / Reliability
    • More robust per-entry error handling to reduce failures during listing and reading.
    • Ensures config directories are created with secure permissions when needed.
    • Log rotation now behaves consistently even when log files are missing.

coderabbitai bot commented Feb 24, 2026

Walkthrough

Refactors directory listing and file inspection to use os.scandir and stat-based checks, centralizes and caches ripgrep discovery, simplifies config directory creation with os.makedirs(..., exist_ok=True), and tightens per-operation error handling for listing, reading, and log rotation.
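
As a rough illustration of the "stat-based checks" mentioned here, the sketch below classifies a path with one `os.stat` call instead of separate `os.path.exists`, `os.path.isfile`, and `os.path.getsize` calls. The function name `classify_path` and its return values are hypothetical and are not taken from the PR.

```python
import os
import stat
from typing import Optional, Tuple


def classify_path(path: str) -> Tuple[str, Optional[int]]:
    """Return (kind, size) for path using a single os.stat call.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    try:
        st = os.stat(path)  # follows symlinks, like os.path.isfile/isdir
    except FileNotFoundError:
        return ("missing", None)
    except PermissionError:
        return ("denied", None)
    except OSError:
        return ("error", None)

    # Branch on the mode bits of the one stat result.
    if stat.S_ISDIR(st.st_mode):
        return ("directory", 0)
    if stat.S_ISREG(st.st_mode):
        return ("file", st.st_size)
    return ("other", None)
```

The specific exceptions are caught before the generic `OSError` because `FileNotFoundError` and `PermissionError` are both subclasses of it.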

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Command-line scanning**<br>`code_puppy/command_line/utils.py` | Replaced `os.listdir`-based listing with `os.scandir`; populate dirs/files during iteration; added per-entry `OSError` handling; removed intermediate entries list and broader exception wrapping. |
| **File operations & ripgrep**<br>`code_puppy/tools/file_operations.py` | Added cached `_find_rg` (`lru_cache`) for ripgrep discovery; refactored `_list_files` and `_read_file` to use a single `os.stat` plus `stat.S_ISREG`/`stat.S_ISDIR`; introduced `seen_dirs` to avoid duplicate dir entries; switched non-recursive listing to `os.scandir`; centralized `_grep` to use `_find_rg`; improved explicit exception handling (`FileNotFoundError`, `PermissionError`, `OSError`). |
| **Config directories**<br>`code_puppy/config.py` | Changed `ensure_config_exists` to unconditionally call `os.makedirs(..., mode=0o700, exist_ok=True)` for `CONFIG_DIR`, `DATA_DIR`, `CACHE_DIR`, and `STATE_DIR` (removes prior existence checks). |
| **Log rotation**<br>`code_puppy/error_logging.py` | Removed explicit file-exists check before size inspection; now attempts `os.path.getsize` and lets `OSError` be caught if file is missing. |
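
The config-directory and log-rotation rows both swap a check-then-act pair for a single call that handles the failure case. A minimal sketch of that pattern follows; `ensure_private_dirs` and `needs_rotation` are hypothetical names, and the real functions in `config.py` and `error_logging.py` may be shaped differently.

```python
import os


def ensure_private_dirs(*paths: str) -> None:
    """Create each directory if missing, with no os.path.exists pre-check."""
    for path in paths:
        # mode is subject to the process umask, and exist_ok=True leaves
        # the permissions of an already existing directory untouched.
        os.makedirs(path, mode=0o700, exist_ok=True)


def needs_rotation(log_path: str, max_bytes: int) -> bool:
    """Report whether the log exceeds max_bytes; a missing file never does."""
    try:
        return os.path.getsize(log_path) > max_bytes
    except OSError:
        # Covers a missing or unreadable log file.
        return False
```

Besides saving a syscall, this avoids the window between an existence check and the operation in which another process could create or delete the path.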

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hop through scandir fields at dawn's first light,

stat tells the story, ripgrep cached just right,
mkdirs spring up with exist_ok cheer,
errors handled softly, nothing to fear,
a tidy rabbit nibbles code, quick and bright 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Perf/reduce redundant syscalls' accurately captures the main objective of the PR, which is to reduce redundant filesystem syscalls across multiple files through optimizations like scandir-based approaches, stat batching, and ripgrep path caching. |

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/command_line/utils.py`:
- Around line 16-17: The list comprehensions that build dirs and files from
entries call DirEntry.is_dir(follow_symlinks=True) which can raise
OSError/PermissionError and crash the listing; replace those comprehensions in
utils.py with an explicit loop over entries that calls e.is_dir(...) inside a
try/except OSError block, skipping any entry that raises and appending e.name to
either dirs or files accordingly (keep the variables dirs and files and
follow_symlinks=True usage so behavior is preserved).
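
The explicit loop described above might look like the following sketch. The function name `split_entries` is hypothetical, and sorting the results is an assumption; only `dirs`, `files`, and `follow_symlinks=True` come from the review comment.

```python
import os
from typing import List, Tuple


def split_entries(directory: str) -> Tuple[List[str], List[str]]:
    """Split a directory's entries into (dirs, files), skipping bad entries."""
    dirs: List[str] = []
    files: List[str] = []
    with os.scandir(directory) as it:
        for entry in it:
            try:
                is_dir = entry.is_dir(follow_symlinks=True)
            except OSError:
                # PermissionError is a subclass of OSError; skip the one
                # unreadable entry instead of failing the whole listing.
                continue
            if is_dir:
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    # Assumption: callers want a stable order; scandir order is arbitrary.
    dirs.sort()
    files.sort()
    return dirs, files
```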

In `@code_puppy/tools/file_operations.py`:
- Around line 153-166: The cached _find_rg currently stores None permanently;
change it so missing results are not cached: inside _find_rg, after failing to
locate ripgrep (before returning None) call _find_rg.cache_clear() (or implement
a small module-level cache variable instead) so subsequent calls (from
list_files/grep) can rediscover a newly installed "rg"; ensure the function name
_find_rg and callers (e.g., list_files, grep) continue to use the same API.
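
A minimal sketch of the module-level cache alternative, which stores only successful lookups. The names `find_tool` and `_tool_paths` and the injectable `which` parameter are hypothetical, and the real `_find_rg` may search more locations than `shutil.which`. The `cache_clear()` variant is worth verifying before adopting it: `functools.lru_cache` appears to store the return value only after the wrapped function returns, so clearing the cache from inside the function would likely still leave the `None` result cached.

```python
import shutil
from typing import Callable, Dict, Optional

# Only successful lookups are stored, so a tool installed later is found
# on the next call rather than being masked by a cached None.
_tool_paths: Dict[str, str] = {}


def find_tool(
    name: str = "rg",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the cached path to name, looking it up again after a miss."""
    path = _tool_paths.get(name)
    if path is None:
        path = which(name)
        if path is not None:
            _tool_paths[name] = path
    return path
```

The trade-off is that a machine without ripgrep pays for a `PATH` search on every call, which is the cost the original `lru_cache` was avoiding.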

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a5d05f and f3d4723.

📒 Files selected for processing (4)
  • code_puppy/command_line/utils.py
  • code_puppy/config.py
  • code_puppy/error_logging.py
  • code_puppy/tools/file_operations.py

Comment thread code_puppy/command_line/utils.py Outdated
Comment thread code_puppy/tools/file_operations.py

coderabbitai bot left a comment

🧹 Nitpick comments (1)
code_puppy/tools/file_operations.py (1)

312-339: Optional: reduce extra stat calls in non-recursive listing.
Right now each entry may hit multiple stat-like calls (is_dir, is_file, stat). You can mirror the recursive path by doing a single DirEntry.stat() and branching on stat.S_ISDIR/S_ISREG, with a small fallback on error to preserve behavior.

♻️ Suggested refactor (single stat per entry)
-                with os.scandir(directory) as it:
-                    for entry in sorted(it, key=lambda e: e.name):
-                        if entry.is_dir(follow_symlinks=True):
-                            if entry.name.startswith("."):
-                                continue
-                            results.append(
-                                ListedFile(
-                                    path=entry.name,
-                                    type="directory",
-                                    size=0,
-                                    full_path=entry.path,
-                                    depth=0,
-                                )
-                            )
-                        elif entry.is_file(follow_symlinks=True):
-                            try:
-                                size = entry.stat().st_size
-                            except OSError:
-                                size = 0
-                            results.append(
-                                ListedFile(
-                                    path=entry.name,
-                                    type="file",
-                                    size=size,
-                                    full_path=entry.path,
-                                    depth=0,
-                                )
-                            )
+                with os.scandir(directory) as it:
+                    for entry in sorted(it, key=lambda e: e.name):
+                        try:
+                            st = entry.stat()
+                            mode = st.st_mode
+                        except OSError:
+                            # Fallback: keep previous behavior on stat failures
+                            if entry.is_dir(follow_symlinks=True):
+                                if entry.name.startswith("."):
+                                    continue
+                                results.append(
+                                    ListedFile(
+                                        path=entry.name,
+                                        type="directory",
+                                        size=0,
+                                        full_path=entry.path,
+                                        depth=0,
+                                    )
+                                )
+                            elif entry.is_file(follow_symlinks=True):
+                                results.append(
+                                    ListedFile(
+                                        path=entry.name,
+                                        type="file",
+                                        size=0,
+                                        full_path=entry.path,
+                                        depth=0,
+                                    )
+                                )
+                            continue
+
+                        if stat.S_ISDIR(mode):
+                            if entry.name.startswith("."):
+                                continue
+                            results.append(
+                                ListedFile(
+                                    path=entry.name,
+                                    type="directory",
+                                    size=0,
+                                    full_path=entry.path,
+                                    depth=0,
+                                )
+                            )
+                        elif stat.S_ISREG(mode):
+                            results.append(
+                                ListedFile(
+                                    path=entry.name,
+                                    type="file",
+                                    size=st.st_size,
+                                    full_path=entry.path,
+                                    depth=0,
+                                )
+                            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_puppy/tools/file_operations.py` around lines 312 - 339, Reduce duplicate
stat-like calls by calling entry.stat(follow_symlinks=True) once and branching
on stat.S_ISDIR/stat.S_ISREG: inside the os.scandir(directory) loop (where
ListedFile objects are created), replace the is_dir/is_file checks with a single
try: st = entry.stat(follow_symlinks=True) except OSError: set size=0 and skip
or treat as file per current behavior; then use stat.S_ISDIR(st.st_mode) to
detect directories (preserve the entry.name.startswith(".") skip) and
stat.S_ISREG for files (use st.st_size for size). Ensure you still populate
ListedFile(path=entry.name, type=..., size=..., full_path=entry.path, depth=0)
and preserve follow_symlinks semantics and the OSError fallback.
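
A condensed sketch of the single-stat approach, using plain tuples in place of `ListedFile` so that it runs on its own. The function name `list_top_level` is hypothetical, and skipping entries whose `stat` fails is a simplification of the fallback in the suggested diff. Whether this lowers the syscall count depends on the platform: the Python documentation notes that `DirEntry.stat()` always requires a system call on Unix, while `is_dir()` and `is_file()` usually do not for non-symlink entries, so it is worth measuring first.

```python
import os
import stat
from typing import List, Tuple


def list_top_level(directory: str) -> List[Tuple[str, str, int]]:
    """Return (name, type, size) for top-level entries, one stat per entry."""
    results: List[Tuple[str, str, int]] = []
    with os.scandir(directory) as it:
        for entry in sorted(it, key=lambda e: e.name):
            try:
                st = entry.stat(follow_symlinks=True)
            except OSError:
                # Simplification: skip entries that cannot be stat'ed.
                continue
            if stat.S_ISDIR(st.st_mode):
                if entry.name.startswith("."):
                    continue  # hidden directories are not listed
                results.append((entry.name, "directory", 0))
            elif stat.S_ISREG(st.st_mode):
                results.append((entry.name, "file", st.st_size))
    return results
```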

📥 Commits

Reviewing files that changed from the base of the PR and between 970134b and 3a6e9bb.

📒 Files selected for processing (1)
  • code_puppy/tools/file_operations.py

mpfaffenberger (Owner) commented

Are you working on code bases with like 50 million lines or something?

Please explain the slowness... Are you sure it wasn't just the LLM getting rate limited and hitting exponential backoff?

SugarFreeDoNoSo (Author) commented

Not critical, just a micro-optimization. With multiple agents running in parallel on the same filesystem, contention was causing the same ops to slow down dramatically, so I tidied it up.
