-
Notifications
You must be signed in to change notification settings - Fork 484
353 add GitHub service caching only #432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
353 add GitHub service caching only #432
Conversation
- Add subscribe_to_repository_webhook method to automatically subscribe to webhooks when caching - Add clear_repository_cache method for cache invalidation - Update check_public_repo to subscribe to webhooks after caching - Use potpie.ai as webhook URL base
WalkthroughThe changes add a Redis-based caching mechanism and webhook subscription process to the GitHub repository visibility check in the Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GithubService
participant Redis
participant GitHubAPI
Client->>+GithubService: check_public_repo(repo_name)
GithubService->>Redis: Get visibility status from cache
alt Cache hit
Redis-->>GithubService: Return cached status
GithubService-->>Client: Return status
else Cache miss
GithubService->>GitHubAPI: Fetch repository visibility
GitHubAPI-->>GithubService: Return visibility
GithubService->>Redis: Store visibility status (TTL 1 week)
GithubService->>GithubService: subscribe_to_repository_webhook(repo_name)
GithubService-->>Client: Return status
end
Possibly related issues
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
@dhirenmathur please review this pr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
app/modules/code_provider/github/github_service.py (1)
772-890: Consider adding rate limiting and monitoringThe caching implementation looks good overall. For production readiness, consider:
- Rate limiting: Add rate limiting for webhook subscription calls to avoid hitting GitHub API limits
- Monitoring: Add metrics for cache hit/miss rates to monitor cache effectiveness
- Cache warming: Consider implementing a cache warming strategy for frequently accessed repositories
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/modules/code_provider/github/github_service.py(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
app/modules/code_provider/github/github_service.py (6)
app/modules/intelligence/tools/web_tools/github_update_branch.py (1)
get_public_github_instance(62-66)app/modules/intelligence/tools/web_tools/github_add_pr_comment.py (1)
get_public_github_instance(91-95)app/modules/intelligence/tools/web_tools/github_create_branch.py (1)
get_public_github_instance(59-63)app/modules/intelligence/tools/web_tools/github_create_pr.py (1)
get_public_github_instance(71-75)app/modules/intelligence/tools/web_tools/github_tool.py (1)
get_public_github_instance(105-109)app/modules/code_provider/code_provider_service.py (1)
get_repo(19-20)
🪛 Ruff (0.12.2)
app/modules/code_provider/github/github_service.py
838-838: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (5)
app/modules/code_provider/github/github_service.py (5)
2-2: LGTM!The new imports are appropriate for the caching functionality.
Also applies to: 8-8
32-35: LGTM!The cache configuration constants are well-defined with an appropriate TTL.
810-826: LGTM!The synchronous helper methods are correctly implemented for use with the async executor pattern.
879-890: LGTM!The cache clearing method is correctly implemented with proper error handling.
772-809: Optimize webhook subscription to avoid redundant callsThe webhook subscription is called on every request, even when data is served from cache. This could lead to unnecessary API calls and potential rate limiting issues.
Move the webhook subscription inside the cache miss block:
# Call GitHub API if not cached is_public = await asyncio.get_event_loop().run_in_executor( self.executor, self._fetch_repository_visibility_sync, repo_name ) # Cache result with repository name as key cache_data = { "is_public": is_public, "cached_at": datetime.utcnow().isoformat(), "repo_name": repo_name } await asyncio.get_event_loop().run_in_executor( self.executor, self._store_cache_value_sync, cache_key, json.dumps(cache_data) ) - # Subscribe to webhook for this repository - await self.subscribe_to_repository_webhook(repo_name) + # Subscribe to webhook for this repository only on cache miss + await self.subscribe_to_repository_webhook(repo_name) return is_publicLikely an incorrect or invalid review comment.
| async def subscribe_to_repository_webhook(self, repo_name: str): | ||
| """Subscribe to repository webhook for visibility change notifications""" | ||
| try: | ||
| # Extract owner and repo from repo_name | ||
| if '/' not in repo_name: | ||
| logger.error(f"Invalid repository name format: {repo_name}") | ||
| return False | ||
|
|
||
| owner, repo = repo_name.split('/', 1) | ||
|
|
||
| # Webhook configuration | ||
| webhook_url = f"https://potpie.ai/api/v1/github/webhook" | ||
| webhook_config = { | ||
| "name": "web", | ||
| "active": True, | ||
| "events": ["public", "repository"], | ||
| "config": { | ||
| "url": webhook_url, | ||
| "content_type": "json", | ||
| "insecure_ssl": "0" | ||
| } | ||
| } | ||
|
|
||
| # Get GitHub token for API access | ||
| github = self.get_public_github_instance() | ||
|
|
||
| # Create webhook via GitHub API | ||
| api_url = f"https://api.github.com/repos/{owner}/{repo}/hooks" | ||
| headers = { | ||
| "Authorization": f"token {github._Github__requester._Requester__auth.token}", | ||
| "Accept": "application/vnd.github.v3+json", | ||
| "Content-Type": "application/json" | ||
| } | ||
|
|
||
| response = requests.post(api_url, json=webhook_config, headers=headers) | ||
|
|
||
| if response.status_code == 201: | ||
| webhook_data = response.json() | ||
| logger.info(f"Successfully subscribed to webhook for {repo_name}. Webhook ID: {webhook_data.get('id')}") | ||
| return True | ||
| elif response.status_code == 422: | ||
| # Webhook already exists | ||
| logger.info(f"Webhook already exists for {repo_name}") | ||
| return True | ||
| else: | ||
| logger.error(f"Failed to subscribe to webhook for {repo_name}. Status: {response.status_code}, Response: {response.text}") | ||
| return False | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Error subscribing to webhook for {repo_name}: {e}") | ||
| return False | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Consider making webhook URL configurable and avoid accessing private attributes
- The webhook URL is hardcoded, which could cause issues in different environments (dev/staging/prod).
- Accessing private GitHub library attributes (
github._Github__requester._Requester__auth.token) is fragile and could break with library updates.
Consider these improvements:
+ # Get webhook URL from configuration
+ webhook_url = config_provider.get_github_webhook_url() or "https://potpie.ai/api/v1/github/webhook"
webhook_config = {
"name": "web",
"active": True,
"events": ["public", "repository"],
"config": {
"url": webhook_url,
"content_type": "json",
"insecure_ssl": "0"
}
}For the token access, check if there's a public API method available:
🏁 Script executed:
#!/bin/bash
# Description: Search for public methods to get the GitHub token from the PyGithub library
# Search for token-related methods in the codebase
rg -A 5 "get.*token|auth.*token" --type py
# Search for how GitHub authentication is used elsewhere in the codebase
ast-grep --pattern 'Github($$$)'Length of output: 18660
🏁 Script executed:
#!/bin/bash
rg "def get_public_github_instance" -n app/modules/code_provider/github/github_service.py -A5Length of output: 309
Refactor webhook URL and token handling
The hardcoded webhook URL and private-token access are brittle. Please:
• Make the webhook URL configurable (e.g. via environment or injected settings).
• Avoid poking into github._Github__requester._Requester__auth.token. Instead, expose the token alongside the Github instance or provide a helper:
# e.g. in GitHubService
@classmethod
def get_public_github_instance_and_token(cls) -> tuple[Github, str]:
if not cls.gh_token_list:
cls.initialize_tokens()
token = random.choice(cls.gh_token_list)
return Github(token), tokenThen in subscribe_to_repository_webhook:
- github = self.get_public_github_instance()
- headers = {
- "Authorization": f"token {github._Github__requester._Requester__auth.token}",
+ github, token = self.get_public_github_instance_and_token()
+ headers = {
+ "Authorization": f"token {token}",
"Accept": "application/vnd.github.v3+json",
"Content-Type": "application/json"
}And load the webhook URL from your configuration provider, falling back to the existing default.
🧰 Tools
🪛 Ruff (0.12.2)
838-838: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In app/modules/code_provider/github/github_service.py around lines 827 to 878,
the webhook URL is hardcoded and the GitHub token is accessed via private
attributes, which is fragile. Refactor by making the webhook URL configurable
through environment variables or injected settings with a fallback default.
Modify the class to provide a method that returns both the Github instance and
its token together, avoiding direct access to private attributes. Update
subscribe_to_repository_webhook to use this new method to get the token and
Github instance, and use the configurable webhook URL instead of the hardcoded
one.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
|
@dhirenmathur please check |




Add Redis Caching and GitHub Webhook Subscription for Repository Visibility
Overview
This PR implements Redis caching for repository visibility checks and automatic GitHub webhook subscription to handle repository visibility changes.
Changes Made
Enhanced
check_public_repomethodrepo_visibility:{repo_name}New Methods Added
_fetch_repository_visibility_sync_get_cache_value_syncand_store_cache_value_syncsubscribe_to_repository_webhookpublicandrepositoryeventshttps://potpie.ai/api/v1/github/webhookclear_repository_cacheTechnical Details
Flow
Testing
Next Steps
Follow-up PR will implement webhook event handler to process incoming visibility change notifications and invalidate cache accordingly.
Summary by CodeRabbit