
Conversation

@dhirenmathur
Contributor

@dhirenmathur dhirenmathur commented Sep 30, 2025

Summary by CodeRabbit

  • New Features

    • Global inference cache shared across projects for faster repeat processing.
    • Docstring requests now accept an optional metadata field.
  • Performance Improvements

    • Cache-aware batching, improved tokenization, and smarter reference resolution.
    • Chunking and consolidation for large nodes to stay within token limits.
  • Background Tasks

    • Automated cache cleanup (expired and least-accessed) with stats reporting.
  • Database

    • Added inference cache table and indexes; removed project linkage to enable global caching.
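As a rough illustration of the flow these notes describe (the real logic lives in InferenceCacheService and InferenceService; the helper below is hypothetical and heavily simplified):

from typing import Any, Callable, Dict

_CACHE: Dict[str, Dict[str, Any]] = {}  # stand-in for the new inference_cache table


def get_or_infer(content_hash: str, run_llm: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Reuse an inference cached by any project; otherwise call the LLM and store the result."""
    hit = _CACHE.get(content_hash)
    if hit is not None:
        return hit                      # global cache hit: no LLM call
    inference = run_llm()               # cache miss: generate docstring/tags/embedding
    _CACHE[content_hash] = inference    # keyed only by content hash, not by project
    return inference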

@coderabbitai
Contributor

coderabbitai bot commented Sep 30, 2025

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title highlights parsing performance and knowledge-graph improvements, which matches some of the changes (e.g., chunk handling for large files), but it does not mention the new inference caching system and related migrations that make up the bulk of the PR. The phrasing is clear, but the title is only partially related to the full changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cache_parse

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c950f83 and 99084e2.

📒 Files selected for processing (7)
  • app/alembic/versions/20250928_simple_global_cache.py (1 hunks)
  • app/modules/parsing/knowledge_graph/inference_service.py (8 hunks)
  • app/modules/parsing/services/cache_cleanup_service.py (1 hunks)
  • app/modules/parsing/services/inference_cache_service.py (1 hunks)
  • app/modules/parsing/tasks/cache_cleanup_tasks.py (1 hunks)
  • app/modules/parsing/utils/cache_diagnostics.py (1 hunks)
  • potpie-ui (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
app/modules/parsing/services/inference_cache_service.py (1)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/modules/parsing/knowledge_graph/inference_service.py (6)
app/core/database.py (1)
  • get_db (29-34)
app/modules/parsing/knowledge_graph/inference_schema.py (3)
  • DocstringRequest (6-9)
  • DocstringResponse (18-19)
  • DocstringNode (12-15)
app/modules/parsing/services/inference_cache_service.py (3)
  • InferenceCacheService (10-148)
  • get_cached_inference (14-50)
  • store_inference (52-128)
app/modules/parsing/utils/content_hash.py (2)
  • generate_content_hash (6-26)
  • is_content_cacheable (29-49)
app/modules/parsing/utils/cache_diagnostics.py (2)
  • analyze_cache_misses (10-106)
  • log_diagnostics_summary (109-146)
app/modules/projects/projects_service.py (1)
  • get_project_from_db_by_id_sync (220-233)
app/modules/parsing/services/cache_cleanup_service.py (1)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/alembic/versions/20250928_simple_global_cache.py (1)
app/alembic/versions/20250923_add_inference_cache_table.py (2)
  • upgrade (20-63)
  • downgrade (66-73)
app/modules/parsing/utils/cache_diagnostics.py (2)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/modules/parsing/utils/content_hash.py (1)
  • generate_content_hash (6-26)
app/modules/parsing/tasks/cache_cleanup_tasks.py (2)
app/core/database.py (1)
  • get_db (29-34)
app/modules/parsing/services/cache_cleanup_service.py (4)
  • CacheCleanupService (11-90)
  • cleanup_expired_entries (31-43)
  • cleanup_least_accessed (45-71)
  • get_cleanup_stats (73-90)
🪛 Ruff (0.13.3)
app/modules/parsing/services/inference_cache_service.py

15-15: Unused method argument: project_id

(ARG002)


126-126: Use raise without specifying exception name

Remove exception name

(TRY201)

app/modules/parsing/knowledge_graph/inference_service.py

379-379: Unused method argument: project_id

(ARG002)


398-398: Do not catch blind exception: Exception

(BLE001)


615-615: Do not catch blind exception: Exception

(BLE001)


849-849: Do not catch blind exception: Exception

(BLE001)


926-926: Do not catch blind exception: Exception

(BLE001)


980-980: Do not catch blind exception: Exception

(BLE001)


990-990: Consider moving this statement to an else block

(TRY300)


992-992: Do not catch blind exception: Exception

(BLE001)


993-993: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

app/modules/parsing/utils/cache_diagnostics.py

91-91: Do not catch blind exception: Exception

(BLE001)


92-92: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

app/modules/parsing/tasks/cache_cleanup_tasks.py

21-21: Consider moving this statement to an else block

(TRY300)


46-46: Consider moving this statement to an else block

(TRY300)


67-67: Consider moving this statement to an else block

(TRY300)

🔇 Additional comments (1)
potpie-ui (1)

1-1: Confirm the potpie-ui submodule update.

Only the pointer changed, so I can’t review the underlying UI code here. Please confirm the linked commit was vetted (tests, lint, sanity checks) or share a brief summary of what it contains so we know the update is intentional.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (14)
app/modules/parsing/utils/content_hash.py (1)

44-47: Consider cross-platform line ending compatibility.

The current implementation splits on "\n" only, which may not correctly handle Windows-style "\r\n" line endings. This could affect the uniqueness calculation on different platforms.

Apply this diff to improve cross-platform compatibility:

-    lines = node_text.strip().split("\n")
+    lines = node_text.strip().splitlines()

The splitlines() method handles all line ending styles (\n, \r\n, \r) uniformly.
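A quick REPL check illustrates the difference:

text = "def foo():\r\n    return 1\r\n"
print(text.strip().split("\n"))   # ['def foo():\r', '    return 1']  (stray \r kept)
print(text.strip().splitlines())  # ['def foo():', '    return 1']    (all endings handled)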

app/modules/parsing/services/cache_cleanup_service.py (2)

11-15: Add validation for TTL configuration.

The environment variable INFERENCE_CACHE_TTL_DAYS is parsed without validation. If the value is non-numeric, negative, or zero, it could cause unexpected behavior or failures.

Apply this diff to add validation:

     def __init__(self, db: Session):
         self.db = db
         # Default TTL: 30 days, configurable via environment
-        self.cache_ttl_days = int(os.getenv("INFERENCE_CACHE_TTL_DAYS", "30"))
+        try:
+            self.cache_ttl_days = int(os.getenv("INFERENCE_CACHE_TTL_DAYS", "30"))
+            if self.cache_ttl_days <= 0:
+                logger.warning(f"Invalid INFERENCE_CACHE_TTL_DAYS: {self.cache_ttl_days}, using default 30")
+                self.cache_ttl_days = 30
+        except ValueError:
+            logger.warning("Invalid INFERENCE_CACHE_TTL_DAYS format, using default 30")
+            self.cache_ttl_days = 30

31-57: Optimize bulk deletion for better performance.

Lines 51-52 delete entries one at a time in a loop, which is inefficient for large deletions. If entries_to_remove is substantial (e.g., 10,000+), this will generate thousands of individual DELETE statements and significantly slow down the cleanup task.

Apply this diff to use bulk deletion:

     def cleanup_least_accessed(self, max_entries: int = 100000) -> int:
         """Remove least accessed entries if cache grows too large"""
         total_entries = self.db.query(InferenceCache).count()
 
         if total_entries <= max_entries:
             return 0
 
         entries_to_remove = total_entries - max_entries
 
-        # Get least accessed entries
-        least_accessed = (
+        # Get IDs of least accessed entries
+        least_accessed_ids = (
             self.db.query(InferenceCache)
             .order_by(
                 InferenceCache.access_count.asc(), InferenceCache.last_accessed.asc()
             )
             .limit(entries_to_remove)
-            .all()
+            .with_entities(InferenceCache.id)
+            .subquery()
         )
 
-        # Delete them
-        for entry in least_accessed:
-            self.db.delete(entry)
+        # Bulk delete using subquery
+        result = self.db.execute(
+            delete(InferenceCache).where(InferenceCache.id.in_(least_accessed_ids))
+        )
 
         self.db.commit()
 
-        logger.info(f"Cleaned up {entries_to_remove} least accessed cache entries")
-        return entries_to_remove
+        deleted_count = result.rowcount
+        logger.info(f"Cleaned up {deleted_count} least accessed cache entries")
+        return deleted_count
app/alembic/versions/20250923_add_inference_cache_table.py (2)

52-54: Potential redundant index on unique column.

The UNIQUE constraint on content_hash (line 47) already creates an index in PostgreSQL, so the explicit index creation on lines 52-53 may be redundant. However, using op.f() for standardized naming is a best practice.

This is acceptable but note that PostgreSQL will use the unique constraint's index for lookups regardless.


20-64: Consider indexing access_count for cleanup queries.

The access_count column (line 45) is used by cache cleanup services to identify least-accessed entries for eviction but lacks an index. Cache cleanup queries filtering or ordering by access_count will require full table scans as the cache grows.

Add an index on access_count for cleanup performance:

     op.create_index(
         op.f("ix_inference_cache_last_accessed"), "inference_cache", ["last_accessed"]
     )
+    op.create_index(
+        op.f("ix_inference_cache_access_count"), "inference_cache", ["access_count"]
+    )

And in the downgrade:

 def downgrade():
+    op.drop_index(
+        op.f("ix_inference_cache_access_count"), table_name="inference_cache"
+    )
     op.drop_index(
         op.f("ix_inference_cache_last_accessed"), table_name="inference_cache"
     )
app/modules/parsing/models/inference_cache_model.py (2)

19-19: Redundant index=True on primary key.

Primary keys are automatically indexed by the database. The explicit index=True is redundant but harmless.

-    id = Column(Integer, primary_key=True, index=True)
+    id = Column(Integer, primary_key=True)

35-35: Consider adding index for access_count to support cleanup queries.

The access_count column is used by cache cleanup services to identify least-accessed entries but lacks an index. This will cause performance issues as the cache grows and cleanup queries need to filter/sort by access count.

This aligns with the missing index in the migration file.

-    access_count = Column(Integer, default=1, nullable=False)
+    access_count = Column(Integer, default=1, nullable=False, index=True)

Note: This requires a corresponding migration update in 20250923_add_inference_cache_table.py.

thoughts/shared/plans/simplified-global-cache-implementation.md (1)

23-26: Add language identifier to fenced code block.

The code block lacks a language identifier for syntax highlighting.

-```
+```text
 content → content_hash → global cache → inference
 (project_id stored as metadata only)

Based on coding guidelines (markdownlint MD040).

app/modules/parsing/knowledge_graph/inference_service.py (6)

208-254: Remove unnecessary assignment at line 240.

The method correctly splits large nodes into processable chunks. However, line 240 assigns `current_tokens` after adding a line to `current_chunk_lines`, but this value is never used since `test_tokens` is recalculated on the next iteration.



Apply this diff to remove the dead assignment:

             else:
                 current_chunk_lines.append(line)
-                current_tokens = test_tokens

Optionally, consider using unpacking at line 222 for better style (as per coding guidelines):

         for line in lines:
-            test_chunk = "\n".join(current_chunk_lines + [line])
+            test_chunk = "\n".join([*current_chunk_lines, line])

256-289: Move inline import to module level.

The consolidation logic is sound, but the inline import of DocstringNode at line 283 is unnecessary since it's already imported from inference_schema at line 17.

Remove the inline import and rely on the module-level import:

-        # Create single consolidated docstring for parent node
-        from app.modules.parsing.knowledge_graph.inference_schema import DocstringNode
-
         consolidated_docstring = DocstringNode(

291-336: Remove redundant inline import.

The chunk processing logic correctly consolidates multi-chunk responses. However, the inline import of DocstringResponse at lines 324-326 is redundant since it's already imported at line 19.

Remove the inline import:

             # Create a mock response list for consolidation
             mock_responses = []
             for chunk_doc in chunk_docstrings:
-                from app.modules.parsing.knowledge_graph.inference_schema import (
-                    DocstringResponse,
-                )
-
                 mock_responses.append(DocstringResponse(docstrings=[chunk_doc]))

547-559: Replace hardcoded condition with proper log level check.

The if True: at line 548 unconditionally runs diagnostics. Consider checking if debug logging is enabled instead.

-            # Run diagnostics on nodes if DEBUG logging is enabled
-            if True:
+            if logger.isEnabledFor(logging.DEBUG):
                 try:

852-855: Use logger.exception for better error context.

When logging errors within an exception handler, use logger.exception instead of logger.error to automatically include the stack trace.

                 except Exception as e:
-                    logger.error(f"Failed to process batch {batch_index}: {e}")
+                    logger.exception(f"Failed to process batch {batch_index}: {e}")
                     # Continue with next batch instead of failing entire operation
                     return DocstringResponse(docstrings=[])

356-365: Extract repeated cache initialization to a helper method.

The cache service initialization pattern is duplicated in batch_nodes and generate_docstrings. Consider extracting to a helper method to reduce duplication and ensure consistent error handling.

Add a new method:

def _get_cache_service(self) -> Optional[InferenceCacheService]:
    """Initialize cache service with proper resource management"""
    try:
        db_gen = get_db()
        db = next(db_gen)
        cache_service = InferenceCacheService(db)
        # Store generator for cleanup
        cache_service._db_gen = db_gen
        return cache_service
    except Exception as e:
        logger.warning(f"Failed to initialize cache service: {e}. Continuing without cache.")
        return None

def _close_cache_service(self, cache_service: Optional[InferenceCacheService]) -> None:
    """Clean up cache service database session"""
    if cache_service and hasattr(cache_service, '_db_gen'):
        try:
            next(cache_service._db_gen)
        except StopIteration:
            pass

Then replace both initialization blocks with cache_service = self._get_cache_service() and add cleanup calls before returning.

Also applies to: 783-791
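If the helpers are adopted, the call sites could be reduced to something like the following sketch (the surrounding method bodies in inference_service.py are elided):

        cache_service = self._get_cache_service()
        try:
            ...  # existing cache-aware batching / result storage
        finally:
            self._close_cache_service(cache_service)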

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26cb74b and c950f83.

📒 Files selected for processing (17)
  • .gitignore (1 hunks)
  • app/alembic/versions/20250923_add_inference_cache_table.py (1 hunks)
  • app/alembic/versions/20250928_simple_global_cache.py (1 hunks)
  • app/core/config_provider.py (2 hunks)
  • app/modules/media/media_controller.py (1 hunks)
  • app/modules/media/media_router.py (2 hunks)
  • app/modules/media/media_service.py (1 hunks)
  • app/modules/parsing/knowledge_graph/inference_schema.py (1 hunks)
  • app/modules/parsing/knowledge_graph/inference_service.py (6 hunks)
  • app/modules/parsing/models/inference_cache_model.py (1 hunks)
  • app/modules/parsing/services/cache_cleanup_service.py (1 hunks)
  • app/modules/parsing/services/inference_cache_service.py (1 hunks)
  • app/modules/parsing/tasks/cache_cleanup_tasks.py (1 hunks)
  • app/modules/parsing/utils/cache_diagnostics.py (1 hunks)
  • app/modules/parsing/utils/content_hash.py (1 hunks)
  • deployment/stage/convo-server/convo-api-supervisord.conf (1 hunks)
  • thoughts/shared/plans/simplified-global-cache-implementation.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
app/alembic/versions/20250928_simple_global_cache.py (1)
app/alembic/versions/20250923_add_inference_cache_table.py (2)
  • upgrade (20-63)
  • downgrade (66-73)
app/modules/parsing/services/cache_cleanup_service.py (1)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/modules/parsing/models/inference_cache_model.py (1)
app/core/base_model.py (1)
  • Base (9-16)
app/modules/parsing/knowledge_graph/inference_service.py (6)
app/core/database.py (1)
  • get_db (29-34)
app/modules/parsing/knowledge_graph/inference_schema.py (3)
  • DocstringRequest (6-9)
  • DocstringResponse (18-19)
  • DocstringNode (12-15)
app/modules/parsing/services/inference_cache_service.py (2)
  • get_cached_inference (14-44)
  • store_inference (46-122)
app/modules/parsing/utils/content_hash.py (2)
  • generate_content_hash (6-26)
  • is_content_cacheable (29-49)
app/modules/parsing/utils/cache_diagnostics.py (2)
  • analyze_cache_misses (10-104)
  • log_diagnostics_summary (107-144)
app/modules/projects/projects_service.py (1)
  • get_project_from_db_by_id_sync (220-233)
app/modules/parsing/utils/cache_diagnostics.py (2)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/modules/parsing/utils/content_hash.py (1)
  • generate_content_hash (6-26)
app/modules/parsing/services/inference_cache_service.py (1)
app/modules/parsing/models/inference_cache_model.py (1)
  • InferenceCache (16-35)
app/alembic/versions/20250923_add_inference_cache_table.py (1)
app/alembic/versions/20250928_simple_global_cache.py (2)
  • upgrade (17-24)
  • downgrade (27-36)
app/modules/parsing/tasks/cache_cleanup_tasks.py (2)
app/core/database.py (1)
  • get_db (29-34)
app/modules/parsing/services/cache_cleanup_service.py (4)
  • CacheCleanupService (11-76)
  • cleanup_expired_entries (17-29)
  • cleanup_least_accessed (31-57)
  • get_cleanup_stats (59-76)
🪛 markdownlint-cli2 (0.18.1)
thoughts/shared/plans/simplified-global-cache-implementation.md

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.13.1)
app/modules/parsing/knowledge_graph/inference_service.py

222-222: Consider [*current_chunk_lines, line] instead of concatenation

Replace with [*current_chunk_lines, line]

(RUF005)


240-240: Local variable current_tokens is assigned to but never used

Remove assignment to unused variable current_tokens

(F841)


343-343: Unused method argument: project_id

(ARG002)


360-360: Do not catch blind exception: Exception

(BLE001)


558-558: Do not catch blind exception: Exception

(BLE001)


786-786: Do not catch blind exception: Exception

(BLE001)


836-836: Do not catch blind exception: Exception

(BLE001)


850-850: Consider moving this statement to an else block

(TRY300)


852-852: Do not catch blind exception: Exception

(BLE001)


853-853: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

app/modules/parsing/utils/cache_diagnostics.py

89-89: Do not catch blind exception: Exception

(BLE001)


90-90: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

app/modules/parsing/services/inference_cache_service.py

15-15: Unused method argument: project_id

(ARG002)


120-120: Use raise without specifying exception name

Remove exception name

(TRY201)

app/modules/parsing/tasks/cache_cleanup_tasks.py

20-20: Consider moving this statement to an else block

(TRY300)


22-22: Do not catch blind exception: Exception

(BLE001)


23-23: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


24-24: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


41-41: Consider moving this statement to an else block

(TRY300)


43-43: Do not catch blind exception: Exception

(BLE001)


44-44: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


45-45: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


58-58: Consider moving this statement to an else block

(TRY300)


60-60: Do not catch blind exception: Exception

(BLE001)


61-61: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


62-62: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (24)
deployment/stage/convo-server/convo-api-supervisord.conf (1)

1-13: LGTM! Cosmetic change only.

The only change in this file is the addition of a trailing newline, which has no functional impact on the supervisord configuration.

app/core/config_provider.py (2)

20-22: LGTM: Formatting improvement.

The multiline formatting improves readability without changing behavior.


123-130: LGTM: Good formatting with trailing comma.

The multiline formatting improves readability, and the trailing comma follows Python best practices for easier future modifications.

app/modules/media/media_service.py (1)

60-62: LGTM: Formatting improvement for readability.

The multi-line formatting of the logger message is a minor readability improvement with no functional impact.

app/modules/media/media_controller.py (1)

39-40: LGTM: Consistent formatting with trailing comma.

Adding the trailing comma follows Python best practices for multi-line dictionaries and reduces diff noise when adding future fields.

app/modules/media/media_router.py (1)

41-42: LGTM: Consistent formatting across multimodal error responses.

The trailing commas in both HTTPException detail dictionaries follow Python best practices and maintain consistency with the formatting changes in media_controller.py.

Also applies to: 188-189

.gitignore (1)

74-74: Confirm the ignore-rule rename from .thoughts/ to thoughts/ is intentional.
No hidden .thoughts/ directories were detected. Please verify that renaming the ignore rule from .thoughts/ to thoughts/ is intentional and won't expose previously ignored content.

app/modules/parsing/utils/content_hash.py (1)

6-26: LGTM: Hash generation is appropriate for cache keys.

The whitespace normalization (collapsing all whitespace to single spaces) is a reasonable trade-off for content-based caching, as it focuses on semantic equivalence rather than formatting. The optional node_type prefix provides useful differentiation for identical code in different contexts.
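For illustration, the normalization described here behaves roughly like the sketch below (the real function is generate_content_hash in content_hash.py; the hash algorithm shown is an assumption):

import hashlib
from typing import Optional


def illustrative_content_hash(node_text: str, node_type: Optional[str] = None) -> str:
    normalized = " ".join(node_text.split())  # collapse all whitespace to single spaces
    key = f"{node_type}:{normalized}" if node_type else normalized
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Formatting-only differences produce the same cache key:
assert illustrative_content_hash("def foo():\n    return 1", "FUNCTION") == \
    illustrative_content_hash("def foo():  return 1", "FUNCTION")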

app/modules/parsing/knowledge_graph/inference_schema.py (1)

1-1: LGTM: Backward-compatible metadata field addition.

The new optional metadata field is properly typed and defaults to None, ensuring backward compatibility. This enables metadata propagation for the cache layer introduced in this PR.

Also applies to: 9-9
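For context, the updated request schema looks roughly like this (field names other than metadata are assumed for illustration):

from typing import Any, Dict, Optional

from pydantic import BaseModel


class DocstringRequest(BaseModel):
    node_id: str
    text: str
    metadata: Optional[Dict[str, Any]] = None  # new optional field, defaults to None


# Existing call sites keep working because the field is optional:
request = DocstringRequest(node_id="abc123", text="def foo(): ...")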

app/alembic/versions/20250928_simple_global_cache.py (1)

17-21: Please run these SQL queries against your PostgreSQL database and share the outputs to confirm whether the inference_cache_project_id_fkey constraint exists:

SELECT conname
FROM pg_constraint
WHERE conname = 'inference_cache_project_id_fkey'
  AND conrelid = 'inference_cache'::regclass;

SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 'inference_cache'::regclass
  AND contype = 'f';
app/alembic/versions/20250923_add_inference_cache_table.py (1)

48-48: Note: FK constraint is removed in subsequent migration.

The foreign key constraint with CASCADE delete on project_id is removed in migration 20250928_simple_global_cache.py to enable global cache independence. This temporary FK creates a brief inconsistency where the cache would be deleted with projects, which contradicts the global cache design.

While the migration chain is correct, consider adding a comment here referencing the subsequent removal for maintainability.

app/modules/parsing/services/inference_cache_service.py (3)

14-44: LGTM! Global cache lookup correctly ignores project_id.

The implementation correctly performs lookups using only content_hash, with project_id kept for API compatibility as documented. Access tracking updates are appropriate.

Note: The Ruff ARG002 warning about unused project_id is a false positive—the parameter is intentionally kept for API compatibility during the migration to global cache.


37-37: Use server-side timestamp for consistency.

Using func.now() for last_accessed generates a SQL NOW() call, which is correct. However, ensure this matches the Column definition's server_default=func.now() behavior for consistency.

This is acceptable as-is but be aware that func.now() is evaluated server-side during commit.


124-142: Inconsistent project_id filtering in stats contradicts global cache design.

The get_cache_stats method filters by project_id (lines 128-129), which contradicts the global cache architecture where project_id is metadata only. This filtering suggests project-scoped analytics, but the cache itself is global.

While this may be intentional for analytics purposes, it creates confusion about whether the cache is truly global or project-scoped.

Clarify the intended use of project_id filtering in stats:

  • If stats should be global-only, remove the filtering
  • If project-scoped analytics are desired, document this as a metadata-only use case

Based on the plan document (thoughts/shared/plans/simplified-global-cache-implementation.md), project_id is "metadata only" for tracing, suggesting the filter should be removed or clearly documented as analytics-only.
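A global-only variant of the stats query could look roughly like this sketch (assumes the columns shown in the model review above):

from sqlalchemy import func
from sqlalchemy.orm import Session

from app.modules.parsing.models.inference_cache_model import InferenceCache


def get_global_cache_stats(db: Session) -> dict:
    entries, accesses = db.query(
        func.count(InferenceCache.id),
        func.coalesce(func.sum(InferenceCache.access_count), 0),
    ).one()
    return {"entries": entries, "total_accesses": int(accesses)}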

app/modules/parsing/models/inference_cache_model.py (1)

16-36: LGTM! Model schema aligns with migration.

The ORM model correctly defines all columns, constraints, and indexes matching the Alembic migration. The explicit __tablename__ appropriately overrides the base class's auto-generated name.

thoughts/shared/plans/simplified-global-cache-implementation.md (1)

1-376: Excellent implementation plan with clear phases and verification criteria.

The plan document provides a well-structured migration path from the broken project-scoped cache to a simplified global cache. The three-phase approach (fix logic → remove FK → simplify integration) minimizes risk while delivering immediate value.

Key strengths:

  • Clear problem statement with root cause analysis
  • Incremental phases with automated and manual verification criteria
  • Comprehensive test strategy covering cross-project sharing and persistence
  • Performance considerations and rollback strategy

The embedded code examples match the actual implementation, demonstrating that the plan was followed.

app/modules/parsing/utils/cache_diagnostics.py (2)

10-104: Well-designed diagnostic utility with defensive programming.

The analyze_cache_misses function provides comprehensive analysis of cache effectiveness with appropriate defensive checks for malformed data. The categorization of issues (missing types, unresolved references, cache misses) with examples is helpful for debugging.


107-144: LGTM! Clear and actionable diagnostic output.

The summary logging provides well-formatted diagnostic information with appropriate categorization and example nodes for each issue type. The formatting with separators and percentage calculations makes it easy to identify cache problems.

app/modules/parsing/knowledge_graph/inference_service.py (6)

5-5: LGTM! Imports support the new caching infrastructure.

The additional imports for Optional, get_db, InferenceCacheService, and content hashing utilities are appropriate for the cache-aware inference processing being introduced.

Also applies to: 13-13, 21-25


83-91: LGTM! Robust input validation.

The added None and type checking prevents potential crashes when node text is missing or incorrectly typed, with appropriate warnings to surface data quality issues.


338-343: Unused parameter is acceptable for API consistency.

The project_id parameter is currently unused within batch_nodes itself (the cache service doesn't use it for lookups per the implementation). However, this is acceptable as it maintains API consistency and supports future enhancements where project-specific batching logic may be needed.


648-653: LGTM! Safety check prevents empty batches.

The added conditional ensures that only non-empty batches are appended, preventing potential issues downstream.


769-779: LGTM! Effective cache-first processing strategy.

The implementation correctly separates cached nodes from those requiring LLM processing, processing cached nodes first via update_neo4j_with_cached_inference before batching the remainder. This optimizes throughput by avoiding unnecessary LLM calls.
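The split itself is simple; a self-contained sketch of the idea, using the cached_inference key visible in the code quoted later in this review:

from typing import Any, Dict, List, Tuple


def split_by_cache(nodes: List[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate nodes that already carry a cached inference from those still needing the LLM."""
    cached = [n for n in nodes if n.get("cached_inference")]
    pending = [n for n in nodes if not n.get("cached_inference")]
    return cached, pending


cached, pending = split_by_cache(
    [{"node_id": "a", "cached_inference": {"docstring": "..."}}, {"node_id": "b"}]
)
assert [n["node_id"] for n in cached] == ["a"]
assert [n["node_id"] for n in pending] == ["b"]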


808-840: LGTM! Robust cache storage with graceful failure handling.

The cache storage logic correctly extracts inference data and stores it with appropriate metadata (content_hash, node_type, tags, embedding). The broad exception catch at line 836 is appropriate here - cache write failures should not break the inference pipeline.

Comment on lines 783 to 791
        try:
            db = next(get_db())
            cache_service = InferenceCacheService(db)
        except Exception as e:
            logger.warning(
                f"Failed to initialize cache service for result storage: {e}"
            )
            cache_service = None


⚠️ Potential issue | 🟠 Major

Fix resource leak: improperly managed database session.

Same issue as in batch_nodes - the database session from get_db() is not properly closed, causing a connection leak.

Apply the same fix pattern as suggested for batch_nodes:

         # Process LLM batches and cache results
         try:
-            db = next(get_db())
+            db_gen = get_db()
+            db = next(db_gen)
             cache_service = InferenceCacheService(db)
         except Exception as e:
             logger.warning(
                 f"Failed to initialize cache service for result storage: {e}"
             )
             cache_service = None
+            db_gen = None

Add cleanup before the method returns (after line 870):

+        # Clean up database session
+        if 'db_gen' in locals() and db_gen:
+            try:
+                next(db_gen)
+            except StopIteration:
+                pass
+
         return updated_docstrings
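An alternative to advancing the generator by hand is to let contextlib drive it, which guarantees cleanup even on errors; a sketch, assuming get_db() is a standard yield-once session generator:

from contextlib import contextmanager

from app.core.database import get_db
from app.modules.parsing.services.inference_cache_service import InferenceCacheService

with contextmanager(get_db)() as db:      # wraps the generator in a context manager
    cache_service = InferenceCacheService(db)
    ...                                    # store results while the session is open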
🧰 Tools
🪛 Ruff (0.13.1)

786-786: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In app/modules/parsing/knowledge_graph/inference_service.py around lines 783 to
791, the code calls next(get_db()) to obtain a DB session but never closes the
generator, leaking connections; change to capture the generator (db_gen =
get_db()) then db = next(db_gen) and, before the method returns (add the cleanup
just prior to the return around line 870), add the cleanup block that checks if
db_gen is in locals() and calls next(db_gen) inside a try/except StopIteration
to close the generator; keep existing exception handling for initialization but
ensure the cleanup runs whether cache_service was created or set to None.

Comment on lines 980 to 1013
    async def update_neo4j_with_cached_inference(self, node: Dict[str, Any]) -> None:
        """Update Neo4j with cached inference data for a single node"""
        cached_inference = node.get("cached_inference", {})
        if not cached_inference:
            return

        # Extract inference data
        docstring = cached_inference.get("docstring", "")
        tags = cached_inference.get("tags", [])
        embedding = self.generate_embedding(docstring)

        with self.driver.session() as session:
            project = self.project_manager.get_project_from_db_by_id_sync(
                node.get("project_id", "")
            )
            repo_path = project.get("repo_path") if project else None
            is_local_repo = True if repo_path else False

            session.run(
                """
                MATCH (n:NODE {repoId: $repo_id, node_id: $node_id})
                SET n.docstring = $docstring,
                    n.embedding = $embedding,
                    n.tags = $tags
                """
                + ("" if is_local_repo else ", n.text = null, n.signature = null"),
                repo_id=node.get("project_id", ""),
                node_id=node["node_id"],
                docstring=docstring,
                embedding=embedding,
                tags=tags,
            )

        logger.debug(f"Updated Neo4j with cached inference for node {node['node_id']}")

⚠️ Potential issue | 🟠 Major

Verify project_id type and simplify boolean assignment.

Several issues to address:

  1. Potential type mismatch: Line 992 retrieves project_id as a string (with default ""), but get_project_from_db_by_id_sync expects an int parameter according to the method signature. This could cause a runtime error.

  2. Unnecessary async: This method has no await calls - the Neo4j session operations and project lookup are all synchronous. Consider making it a regular def method unless async is required for future compatibility.

  3. Simplify boolean assignment: Line 996 can be simplified.

Apply this diff to fix the type issue and simplify:

-        with self.driver.session() as session:
-            project = self.project_manager.get_project_from_db_by_id_sync(
-                node.get("project_id", "")
-            )
-            repo_path = project.get("repo_path") if project else None
-            is_local_repo = True if repo_path else False
+        with self.driver.session() as session:
+            # Convert project_id to int for DB lookup
+            project_id = node.get("project_id")
+            if project_id:
+                try:
+                    project_id = int(project_id) if isinstance(project_id, str) else project_id
+                except (ValueError, TypeError):
+                    logger.warning(f"Invalid project_id for node {node['node_id']}: {project_id}")
+                    project_id = None
+            
+            project = self.project_manager.get_project_from_db_by_id_sync(project_id) if project_id else None
+            repo_path = project.get("repo_path") if project else None
+            is_local_repo = bool(repo_path)

Additionally, consider changing the method signature to def instead of async def since there are no async operations.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    async def update_neo4j_with_cached_inference(self, node: Dict[str, Any]) -> None:
        """Update Neo4j with cached inference data for a single node"""
        cached_inference = node.get("cached_inference", {})
        if not cached_inference:
            return

        # Extract inference data
        docstring = cached_inference.get("docstring", "")
        tags = cached_inference.get("tags", [])
        embedding = self.generate_embedding(docstring)

        with self.driver.session() as session:
            # Convert project_id to int for DB lookup
            project_id = node.get("project_id")
            if project_id:
                try:
                    project_id = int(project_id) if isinstance(project_id, str) else project_id
                except (ValueError, TypeError):
                    logger.warning(f"Invalid project_id for node {node['node_id']}: {project_id}")
                    project_id = None

            project = (
                self.project_manager.get_project_from_db_by_id_sync(project_id)
                if project_id
                else None
            )
            repo_path = project.get("repo_path") if project else None
            is_local_repo = bool(repo_path)

            session.run(
                """
                MATCH (n:NODE {repoId: $repo_id, node_id: $node_id})
                SET n.docstring = $docstring,
                    n.embedding = $embedding,
                    n.tags = $tags
                """
                + ("" if is_local_repo else ", n.text = null, n.signature = null"),
                repo_id=node.get("project_id", ""),
                node_id=node["node_id"],
                docstring=docstring,
                embedding=embedding,
                tags=tags,
            )

        logger.debug(f"Updated Neo4j with cached inference for node {node['node_id']}")

Comment on lines +72 to +87
        # Check if entry already exists
        existing_entry = (
            self.db.query(InferenceCache)
            .filter(InferenceCache.content_hash == content_hash)
            .first()
        )

        if existing_entry:
            # Entry already exists - update access tracking and return
            existing_entry.access_count += 1
            existing_entry.last_accessed = func.now()
            self.db.commit()
            logger.debug(
                f"Cache entry already exists for content_hash: {content_hash[:12]}... (updated access tracking)"
            )
            return existing_entry

⚠️ Potential issue | 🟡 Minor

Race condition: check-then-insert is not atomic.

The check for existing entry (lines 73-77) followed by insert (lines 90-106) creates a race window where concurrent processes can both pass the existence check and attempt to insert the same content_hash, leading to unique constraint violations.

While the exception handler (lines 107-122) recovers from this race condition, it's inefficient to rely on exception handling for expected concurrent behavior.

Consider using an atomic upsert with PostgreSQL's ON CONFLICT:

from sqlalchemy.dialects.postgresql import insert

def store_inference(self, content_hash: str, inference_data: Dict[str, Any], ...) -> InferenceCache:
    stmt = insert(InferenceCache).values(
        content_hash=content_hash,
        project_id=project_id,
        node_type=node_type,
        content_length=content_length,
        inference_data=inference_data,
        embedding_vector=embedding_vector,
        tags=tags,
    ).on_conflict_do_update(
        index_elements=['content_hash'],
        set_={
            'access_count': InferenceCache.access_count + 1,
            'last_accessed': func.now()
        }
    ).returning(InferenceCache)
    
    result = self.db.execute(stmt)
    cache_entry = result.scalar_one()
    self.db.commit()
    return cache_entry

This eliminates the race condition and is more efficient.

Comment on lines +89 to +90
        except Exception as e:
            logger.error(f"Error analyzing node {node_id}: {e}")

🛠️ Refactor suggestion | 🟠 Major

Use logging.exception to include traceback.

When logging errors within an exception handler, logging.exception automatically includes the stack trace, which is more helpful for debugging than logging.error.

         except Exception as e:
-            logger.error(f"Error analyzing node {node_id}: {e}")
+            logger.exception(f"Error analyzing node {node_id}: {e}")

Based on coding guidelines (Ruff TRY400).

Note: The broad Exception catch is acceptable here since this is a diagnostic utility that should be resilient to individual node failures and continue processing remaining nodes.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
         except Exception as e:
-            logger.error(f"Error analyzing node {node_id}: {e}")
+            logger.exception(f"Error analyzing node {node_id}: {e}")
🧰 Tools
🪛 Ruff (0.13.1)

89-89: Do not catch blind exception: Exception

(BLE001)


90-90: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In app/modules/parsing/utils/cache_diagnostics.py around lines 89 to 90, replace
the logger.error call inside the except block with logger.exception so the stack
trace is recorded; keep the same contextual message (e.g., "Error analyzing node
{node_id}:") and pass the exception implicitly by calling logger.exception to
include the traceback while preserving the broad Exception catch behavior.

@dhirenmathur
Contributor Author

@coderabbitai generate docstring

@potpie-ai potpie-ai deleted a comment from cursor bot Sep 30, 2025
@coderabbitai
Contributor

coderabbitai bot commented Sep 30, 2025

Tip

For best results, initiate chat on the files or code changes.

Task: Generate docstrings for PR #452

I'll generate comprehensive docstrings for all public classes, methods, and functions across the new inference caching system to improve coverage from 58.14% to meet the 80% threshold.

Files requiring docstrings:

  1. app/modules/parsing/services/inference_cache_service.py - InferenceCacheService class and methods
  2. app/modules/parsing/services/cache_cleanup_service.py - CacheCleanupService class and methods
  3. app/modules/parsing/knowledge_graph/inference_service.py - Updated methods and new helper functions
  4. app/modules/parsing/utils/cache_diagnostics.py - Diagnostic utility functions
  5. app/modules/parsing/utils/content_hash.py - Hash generation utilities
  6. app/modules/parsing/models/inference_cache_model.py - ORM model class
  7. app/modules/parsing/tasks/cache_cleanup_tasks.py - Celery task functions
  8. app/alembic/versions/ - Migration scripts

@dhirenmathur I'll generate comprehensive docstrings across all 8 files in the inference caching system. The docstrings will follow Google-style format and cover:

  • Service classes: InferenceCacheService and CacheCleanupService with all methods
  • Updated inference logic: New cache-aware methods in InferenceService
  • Utility functions: Cache diagnostics and content hashing utilities
  • ORM model: InferenceCache table documentation
  • Celery tasks: Cache cleanup periodic tasks
  • Migration scripts: Database schema change documentation

This will bring your docstring coverage from 58.14% to well above the 80% threshold. The docstrings will include detailed descriptions, parameter specifications, return values, and any raised exceptions.
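For reference, a Google-style docstring on one of the cache methods would look roughly like this (method shown out of its class; the signature is assumed from the review, not copied from the code):

from typing import Any, Dict, Optional


def get_cached_inference(self, content_hash: str, project_id: Optional[str] = None) -> Optional[Dict[str, Any]]:
    """Fetch a previously stored inference for identical content.

    Args:
        content_hash: Hash of the normalized node text, used as the global cache key.
        project_id: Kept only for API compatibility; lookups ignore it.

    Returns:
        The cached inference payload, or None on a cache miss.
    """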


⏳ Preparing changes...

@coderabbitai
Contributor

coderabbitai bot commented Sep 30, 2025

Caution

The CodeRabbit agent's plans did not produce any file changes.

dhirenmathur and others added 8 commits October 2, 2025 11:31
…nection_handling, and revert_script for database migration among other things.
…ached embedding reuse

- Preserved robust database connection cleanup in inference_service.py
- Maintained cached embedding vector reuse functionality
- Kept robust TTL configuration with proper error handling
- Preserved proper database connection cleanup in cache tasks
- Maintained correct node counting logic in cache diagnostics
* chore(submodule): refresh potpie-ui to latest origin/main

* chore: refresh potpie-ui submodule and update .gitignore

* chore: Auto-fix pre-commit issues

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@sonarqubecloud

sonarqubecloud bot commented Oct 9, 2025
