Add comprehensive MCP interface proposal for issue #387 #726
This proposal designs an elegant, performant MCP (Model Context Protocol) interface for OpenContracts that provides read-only access to public resources.
Key features:
- One-corpus-at-a-time model for focused exploration
- Anonymous user permission model (public resources only)
- 4 MCP resources: corpus, document, annotation, thread
- 7 MCP tools for discovery and retrieval
- Performance optimized using AnnotationQueryOptimizer
- Respects consolidated permissioning guide
- Comprehensive implementation examples and testing strategy
Implementation includes:
- Resource URI scheme (corpus://, document://, annotation://, thread://)
- Tool definitions (list_public_corpuses, list_documents, get_document_text, list_annotations, search_corpus, list_threads, get_thread_messages)
- Security considerations (rate limiting, input validation)
- Performance optimizations (caching, pagination, query optimization)
- Complete code examples for server.py, resources.py, tools.py
- Testing strategy with unit and integration tests
- Deployment instructions
Addresses: #387
PR Review: MCP Interface Proposal for OpenContracts
Overall Assessment
Status: APPROVE with Minor Suggestions
This is an excellent, well-researched proposal that demonstrates deep understanding of OpenContracts' architecture and permissioning model. The document is comprehensive, production-ready, and follows established patterns throughout the codebase.
Strengths
1. Architecture Alignment ✅
2. Security Design ✅
3. Performance Considerations ✅
4. Documentation Quality ✅
Issues & Suggestions
Critical Issues: None
Major Concerns
1. Missing Model Method Verification
The proposal references several methods that may not exist in the current codebase:
Line 88:
    "document_count": corpus.document_count(),  # Does this method exist?
Suggested fix: Use
Lines 557, 567:
    embedder_path, query_vector = corpus.embed_text(query)  # Verify method exists
    doc_results = (Document.objects...search_by_embedding(...))  # Verify method exists
Suggested action: Add a note that
2. Async/Sync Boundary Issues
The proposal marks resource functions as
Lines 80-92, 117-146, 177-202: Resource functions marked
    async def get_corpus_resource(corpus_slug: str) -> str:  # async keyword
        anonymous = AnonymousUser()
        corpus = Corpus.objects.visible_to_user(anonymous).get(slug=corpus_slug)  # blocking call!
Issue: Mixing
Suggested fix: Either:
Example fix:
    from asgiref.sync import sync_to_async

    async def get_corpus_resource(corpus_slug: str) -> str:
        anonymous = AnonymousUser()
        corpus = await sync_to_async(
            lambda: Corpus.objects.visible_to_user(anonymous).get(slug=corpus_slug)
        )()
        # ... rest of implementation
Minor Issues
3. Incomplete Error Handling
Lines 129-130, 259-260: Uses
    if not document:
        raise NotFoundError()  # Not defined in imports
Suggested fix: Import or define the exception, or use Django's
4. Missing Helper Function Implementations
Multiple formatter functions are referenced but not implemented:
Suggestion: Add a section showing example implementations of these formatters, or note they need to be implemented.
5. Thread Message Recursion Potential
Lines 277-284:
    def build_threaded_messages(thread: Conversation, user) -> list:
        messages = (ChatMessage.objects
            .visible_to_user(user)
            .filter(conversation=thread, parent_message__isnull=True)
            .order_by('created_at'))
        return [format_message_with_replies(msg, user) for msg in messages]  # Recursive?
Concern: Could cause N+1 queries if not using
Suggested fix: Show prefetching pattern:
    messages = (ChatMessage.objects
        .visible_to_user(user)
        .filter(conversation=thread, parent_message__isnull=True)
        .prefetch_related('replies__replies')  # Prefetch nested replies
        .order_by('created_at'))
6. Corpus Methods Referenced
Line 557:
7. Testing File Truncation
The test example at line 1020-1044 appears to be cut off mid-implementation. Complete the example or note it's abbreviated.
8. Configuration Settings Integration
The Django settings example (lines 1002-1014) should note where to import
    # config/settings/base.py
    import environ  # Add this
    env = environ.Env()
9. URI Parsing Fragility
Lines 767-786: URI parsing using string
    parts = uri.replace("document://", "").split("/")
    return await get_document_resource(parts[0], parts[1])  # What if slug contains encoded slash?
Suggested improvement: Use regex or proper URI parsing:
    import re

    def parse_document_uri(uri: str) -> tuple[str, str]:
        match = re.match(r'^document://([^/]+)/([^/]+)$', uri)
        if not match:
            raise ValueError(f"Invalid document URI: {uri}")
        return match.group(1), match.group(2)
Documentation/Clarity Improvements
10. CHANGELOG Update Needed
Per CLAUDE.md guidelines, this significant feature proposal should have a CHANGELOG entry:
    ## [Unreleased]
    ### Added
    - MCP (Model Context Protocol) interface proposal for read-only public resource access
    - 4 resource types: corpus, document, annotation, thread
    - 7 tools for discovery and retrieval
    - Anonymous user permission model
    - Location: `docs/mcp/mcp_interface_proposal.md`
11. ConversationTypeChoices Reference
Line 632: Uses
    from opencontractserver.types.enums import ConversationTypeChoices
Code Quality Assessment
Best Practices ✅
Testing Coverage ✅
Performance ✅
Security Analysis
Strengths
Recommendations
Recommendations
Before Merging (Documentation PR)
For Implementation Phase
Performance Considerations
Potential Bottlenecks
Alignment with Repository Guidelines
CLAUDE.md Compliance ✅
Branch Strategy ✅
Final Verdict
This is production-quality documentation that demonstrates excellent understanding of the codebase. The async/sync issues and missing imports are the only blockers for code examples. Once those are addressed, this proposal is ready for implementation.
Suggested Next Steps
Score: 9/10 - Excellent work! The minor issues are easily fixable and don't detract from the overall quality of the proposal.
Review generated by Claude Code using repository conventions from CLAUDE.md
- Convert async tool functions to sync with sync_to_async wrapper pattern
- Add URIParser class with regex-based URI validation for safety
- Add complete helper function implementations (format_corpus_summary, format_document_summary, format_annotation, format_thread_summary, format_message, format_message_with_replies, text_search_fallback)
- Fix NotFoundError references to use proper Django exceptions
- Add prefetch_related for threaded messages to avoid N+1 queries
- Add proper imports to all resource and tool functions
- Update CHANGELOG with MCP interface proposal entry
Addresses review feedback on PR #726 for issue #387
Signed-off-by: JSIV <[email protected]>
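The regex-based URI validation mentioned in this commit can be illustrated with a minimal sketch. It is not the PR's URIParser class: the function name `parse_resource_uri`, the `corpus://<slug>` shape, and the assumption that slugs contain only letters, digits, hyphens, and underscores are all hypothetical. Only the `document://<corpus>/<document>` shape appears in the review above. Restricting the slug character class is one way to reject a percent-encoded slash instead of decoding it.

```python
import re

# Assumption: slugs use only Django-style slug characters, so a
# percent-encoded slash such as "%2F" can never match.
_SLUG = r"[A-Za-z0-9_-]+"

# Hypothetical URI shapes; only document://<corpus>/<document> is taken
# from the review comment.
_PATTERNS = {
    "corpus": re.compile(rf"corpus://({_SLUG})"),
    "document": re.compile(rf"document://({_SLUG})/({_SLUG})"),
}


def parse_resource_uri(uri: str) -> tuple[str, tuple[str, ...]]:
    """Return (resource_type, slugs) or raise ValueError for a malformed URI."""
    for resource_type, pattern in _PATTERNS.items():
        # fullmatch rejects trailing characters, including a trailing newline.
        match = pattern.fullmatch(uri)
        if match:
            return resource_type, match.groups()
    raise ValueError(f"Invalid resource URI: {uri}")
```

A caller would dispatch on the returned resource type and pass the slugs to the matching resource function; unknown schemes and extra path segments fail early with `ValueError`.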
PR Review: MCP Interface Proposal (#726)
Overall Assessment
This is an excellent, well-thought-out proposal for adding MCP (Model Context Protocol) interface to OpenContracts. The design is elegant, follows established patterns from the codebase, and demonstrates deep understanding of the permissioning model. The documentation is comprehensive and production-ready.
Recommendation: Approve with minor suggestions
Strengths
1. Architecture & Design Excellence
2. Performance Awareness
3. Security & Permissioning
4. Code Quality
Issues & Suggestions
Critical Issues
None found - This is a proposal document, not production code, so no blocking issues.
Medium Priority Suggestions
1. Sync vs Async Pattern Consistency (Lines 337-363, 945-1150)
The proposal uses
Suggestion: Add note about Django ORM thread-safety:
    # Note: Django ORM is not thread-safe across async contexts.
    # Always use sync_to_async for ORM operations in async handlers.
    # By default (thread_sensitive=True), wrapped calls run in one shared thread.
2. Error Response Standardization (Throughout tool implementations)
Different error types (DoesNotExist, ValueError) are raised but there's no unified error response format for MCP clients.
Suggestion: Add error handling wrapper:
    import functools
    from django.core.exceptions import ObjectDoesNotExist, PermissionDenied

    def mcp_error_handler(func):
        """Wrap tool functions to return consistent error responses."""
        @functools.wraps(func)  # keep the wrapped tool's name and docstring
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except ObjectDoesNotExist as e:  # base class of every model's DoesNotExist
                return {"error": "not_found", "message": str(e)}
            except ValueError as e:
                return {"error": "invalid_input", "message": str(e)}
            except PermissionDenied:
                return {"error": "forbidden", "message": "Access denied"}
        return wrapper
3. Missing Model Method Documentation (Lines 372, 651)
The proposal assumes methods like
Recommendation:
4. Message Prefetch Depth (Line 299, 879)
    .prefetch_related('replies__replies')  # Prefetch 2 levels
Suggestion: Document the depth limitation more clearly and consider making it configurable:
    # Prefetch depth = 2 levels (root -> reply -> nested reply)
    # For deeper threads, replies at depth > 2 will trigger additional queries
    # Adjust PREFETCH_DEPTH in config.py if needed
    PREFETCH_DEPTH = getattr(settings, 'MCP_MESSAGE_PREFETCH_DEPTH', 2)
    prefetch_string = '__'.join(['replies'] * PREFETCH_DEPTH)
Low Priority / Nice-to-Have
5. Bounding Box Format (Lines 165-169)
    "bounding_box": {
        "top": 120,
        "left": 50,
        "right": 450,
        "bottom": 145
    }
Question: Does this match the PAWLs format used elsewhere in OpenContracts? Consider documenting the coordinate system (pixels, points, normalized 0-1, etc.)
6. Thread Conversation Type (Lines 756-759)
    conversation_type=ConversationTypeChoices.THREAD,
Suggestion: Verify
7. Testing Coverage (Lines 1308-1396)
Excellent test examples, but consider adding:
Code Examples Review
search_corpus Implementation (Lines 635-677)
Excellent, but one potential issue:
    embedder_path, query_vector = corpus.embed_text(query)
    if not query_vector:
        # Fallback to text search if embeddings unavailable
        return text_search_fallback(corpus, query, limit, anonymous)
Question: Does
    # Verify this exists in opencontractserver/corpuses/models.py
    if hasattr(corpus, 'embedder') and corpus.embedder:
        query_vector = corpus.embedder.embed_text(query)
    else:
        return text_search_fallback(...)
format_message_with_replies (Lines 901-924)
Excellent recursion handling with depth limits and prefetch awareness. One suggestion:
    if current_depth >= max_depth:
        # Stop recursion at max depth
        formatted["replies"] = []
        formatted["has_more_replies"] = message.replies.exists() if hasattr(message, 'replies') else False
        return formatted
The
    formatted["has_more_replies"] = len(list(message.replies.all())) > 0 if hasattr(message, 'replies') else False
Or better, just check the prefetched cache:
    formatted["has_more_replies"] = bool(getattr(message, '_prefetched_objects_cache', {}).get('replies'))
Alignment with CLAUDE.md Guidelines
✅ Follows Guidelines
✅ Code Quality Standards
|
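The prefetch depth and depth-limited recursion discussed in this review can be shown together in a small sketch. It runs on plain dataclasses, not on the ORM: `Message`, `build_prefetch_path`, and `format_with_replies` are hypothetical names, and the in-memory `replies` list stands in for replies that were already prefetched. The point is that the recursion limit and the prefetch depth should agree, so that formatting never walks past what was loaded.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical stand-in for a ChatMessage with already-loaded replies."""
    text: str
    replies: list["Message"] = field(default_factory=list)


def build_prefetch_path(depth: int, relation: str = "replies") -> str:
    """Build a lookup such as 'replies__replies' for the given depth."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return "__".join([relation] * depth)


def format_with_replies(message: Message, max_depth: int, current_depth: int = 0) -> dict:
    """Format a message and its replies, stopping once max_depth is reached."""
    formatted = {"text": message.text}
    if current_depth >= max_depth:
        # Do not descend further; only report whether more replies exist.
        formatted["replies"] = []
        formatted["has_more_replies"] = bool(message.replies)
        return formatted
    formatted["replies"] = [
        format_with_replies(reply, max_depth, current_depth + 1)
        for reply in message.replies
    ]
    formatted["has_more_replies"] = False
    return formatted
```

With `max_depth=2` the formatter descends root -> reply -> nested reply, which matches a prefetch path of `replies__replies`. In ORM code the final `bool(message.replies)` check would need to read the prefetched cache, as the review suggests, to avoid an extra query.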
Add Model Context Protocol (MCP) server module that provides AI assistants with structured access to public OpenContracts data:
- Resources: corpus, document, annotation, thread (via custom URI schemes)
- Tools: list_public_corpuses, list_documents, get_document_text, list_annotations, search_corpus, list_threads, get_thread_messages
- Anonymous user model ensures only public resources are accessible
- Uses existing visible_to_user() and AnnotationQueryOptimizer patterns
- Rate limiting via Django cache
- Comprehensive unit tests (19 tests)
Closes #726
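The commit message mentions cache-backed rate limiting without showing it on this page. One common approach is a fixed-window counter, sketched below under stated assumptions: the class name, the key format, and the limits are hypothetical, and a plain dict stands in for the shared cache. This is not the RateLimiter class from the PR.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most max_requests per window_seconds for each client key."""

    def __init__(self, max_requests: int, window_seconds: int, clock=time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        # Plain dict standing in for a shared cache (an assumption).
        self._counts: dict[str, int] = {}

    def allow(self, client_id: str) -> bool:
        # All requests in the same time window share one counter key.
        window = int(self._clock() // self.window_seconds)
        key = f"mcp:ratelimit:{client_id}:{window}"
        count = self._counts.get(key, 0)
        if count >= self.max_requests:
            return False
        self._counts[key] = count + 1
        return True
```

With a real shared cache, two details would need attention: the read-then-write above is not atomic across processes, so an atomic increment would be preferable, and counter keys should carry a timeout so old windows expire instead of accumulating.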
Code Review: MCP Interface Implementation
This is an excellent, well-architected proposal for adding MCP support to OpenContracts.
Strengths
CRITICAL Issues
1. Rate Limiter Bug (permissions.py:75-89)
2. Missing Dependency Version (requirements/base.txt:81)
HIGH Priority Issues
3. Exception Handling Too Broad (tools.py:131-135, 228-251)
4. Missing Input Validation (tools.py:144-198)
5. embed_text() Return Assumption (tools.py:228)
MEDIUM Priority Issues
Security Considerations
Recommendations
Before Merge (Critical)
Follow-up (Separate PRs)
Overall Assessment
Quality Score: 8.5/10
High-quality work with excellent architecture and documentation. Issues are mostly minor refinements. Permission model is sound, performance optimizations appropriate, follows project conventions.
Recommendation: Approve with minor changes - Fix critical issues (rate limiter, dependency), then merge.
Great work! The one-corpus-at-a-time model is elegant and the anonymous user permission strategy is exactly right.
- Add ASGI routing for /mcp/* endpoints in config/asgi.py
- Implement SSE transport handlers (GET /mcp/sse/, POST /mcp/messages/)
- Refactor server.py to use factory pattern for cleaner initialization
- Apply code formatting fixes from pre-commit hooks
Code Review - PR #726: MCP Interface Implementation
This is an impressive and comprehensive implementation of an MCP (Model Context Protocol) interface for OpenContracts. The design is elegant, well-documented, and follows the project's architecture patterns. Below is my detailed feedback.
🎯 Overall Assessment
Summary: High-quality implementation with excellent documentation. The code follows OpenContracts conventions and implements a clean read-only public API. A few minor issues to address before merging.
Recommendation: Approve with minor revisions
✅ Strengths
1. Excellent Documentation
2. Security-First Design
3. Performance Optimization
4. Clean Architecture
5. Test Coverage
- Add docs/mcp/README.md with getting started guide, tool/resource reference, transport options, and architecture overview
- Remove proposal document (replaced by actual documentation)
- Switch to Streamable HTTP transport (stateless mode) for reliability
- Use DocumentPath for corpus document membership (source of truth)
- Add CorpusSettings embed model configuration UI
PR Review: MCP Interface Implementation
This is a well-designed implementation of a Model Context Protocol server for OpenContracts. The code quality is high, security considerations are solid, and the architecture aligns well with project conventions.
Strengths
Architecture & Design
Security Model
Performance
Test Coverage
Critical Issues
1. Missing MCP Dependency (BLOCKING)
requirements/base.txt adds mcp>=1.0.0, but this package does not exist on PyPI yet. ACTION REQUIRED: Update requirements/base.txt with working dependency source or wait for PyPI release.
2. Frontend Permission Logic Complexity (HIGH)
CorpusSettings.tsx (lines 645-668) has complex ownership detection with two fallback mechanisms that is fragile. RECOMMENDATION: Add isOwner boolean field to GraphQL CorpusType instead of inferring from permissions.
3. Rate Limiting Not Enforced (MEDIUM)
RateLimiter class exists but is not used in MCP server handlers. RECOMMENDATION: Add rate limiting to call_tool and read_resource handlers.
4. Missing creator.id Field (MEDIUM)
CorpusSettings.tsx checks corpus.creator.id - verify GraphQL fragment includes this field.
5. Potential N+1 in Document Search (LOW)
tools.py:78 fetches corpus document IDs then filters - use subquery instead for large corpora.
Recommendations Summary
Must Fix Before Merge:
Should Fix (High Priority):
Nice to Have:
Verdict
Overall Assessment: High-quality work with excellent architecture and security design.
Recommendation: Approve with required changes
Great work! The code demonstrates strong understanding of both MCP protocol and OpenContracts architecture. With the dependency issue resolved, this will be a valuable addition.
Files Reviewed: 16 | Lines Added: 1959 | Lines Removed: 22
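Item 5 of this review contrasts fetching document IDs first with letting the database resolve membership through a subquery. The difference can be sketched in plain SQL using Python's built-in sqlite3 module as a stand-in for the ORM. The table names, columns, and sample rows are hypothetical and do not reflect the actual OpenContracts schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE document_path (corpus_id INTEGER, document_id INTEGER);
    INSERT INTO document VALUES (1, 'NDA'), (2, 'Lease'), (3, 'NDA addendum');
    INSERT INTO document_path VALUES (10, 1), (10, 3), (20, 2);
    """
)


def search_with_id_list(corpus_id: int, term: str) -> list[str]:
    """Two round trips: load every document ID, then filter by that list."""
    ids = [row[0] for row in conn.execute(
        "SELECT document_id FROM document_path WHERE corpus_id = ?", (corpus_id,)
    )]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT title FROM document WHERE id IN ({placeholders}) "
        "AND title LIKE ? ORDER BY id",
        (*ids, f"%{term}%"),
    )
    return [row[0] for row in rows]


def search_with_subquery(corpus_id: int, term: str) -> list[str]:
    """One round trip: the database resolves corpus membership itself."""
    rows = conn.execute(
        "SELECT title FROM document WHERE id IN ("
        "SELECT document_id FROM document_path WHERE corpus_id = ?) "
        "AND title LIKE ? ORDER BY id",
        (corpus_id, f"%{term}%"),
    )
    return [row[0] for row in rows]
```

Both functions return the same rows; the second avoids pulling the ID list into Python and passing it back as parameters, which matters as a corpus grows. In Django, passing an unevaluated queryset to an `__in` lookup generally produces the subquery form.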
- Add tests for tools.py: list_documents, get_document_text, list_annotations, search_corpus, list_threads, get_thread_messages
- Add tests for resources.py: document, annotation, thread resources
- Add tests for formatters.py: all formatter functions now at 100%
- Add tests for server.py components: lifespan manager, URI parser
Coverage improvements:
- tools.py: 18% → 92%
- resources.py: 22% → 98%
- formatters.py: 37% → 100%
- Overall MCP module: 59.76% → 94%
Code Review - PR #726: MCP Interface Implementation
This is an excellent implementation of the Model Context Protocol (MCP) server for OpenContracts. The code quality is high, architecture is sound, and testing is comprehensive. Below are my detailed findings:
✅ Strengths
1. Architecture & Design
2. Security
3. Performance
4. Test Coverage
5. Code Quality