@uphargaur uphargaur commented Jul 19, 2025

Add Redis Caching and GitHub Webhook Subscription for Repository Visibility

Overview

This PR implements Redis caching for repository visibility checks and automatic GitHub webhook subscription to handle repository visibility changes.

Changes Made

Enhanced check_public_repo method

  • Added Redis caching layer to reduce GitHub API calls
  • Cache key format: repo_visibility:{repo_name}
  • Stores cached data with timestamp and repo metadata
  • Falls back to GitHub API on cache miss
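A minimal sketch of the cache entry described above, assuming the payload fields match the ones quoted later in the review (`is_public`, `cached_at`, `repo_name`). The helper names `build_cache_key`, `build_cache_payload`, and `parse_cache_payload` are hypothetical and are not taken from the PR.

```python
import json
from datetime import datetime, timezone

CACHE_KEY_PREFIX = "repo_visibility"  # prefix from the PR description


def build_cache_key(repo_name: str) -> str:
    """Return the Redis key for a repository, e.g. repo_visibility:owner/repo."""
    return f"{CACHE_KEY_PREFIX}:{repo_name}"


def build_cache_payload(repo_name: str, is_public: bool) -> str:
    """Serialize the visibility flag with a timestamp and the repo name."""
    return json.dumps(
        {
            "is_public": is_public,
            "cached_at": datetime.now(timezone.utc).isoformat(),
            "repo_name": repo_name,
        }
    )


def parse_cache_payload(raw):
    """Return the cached flag, or None for a miss or an unreadable entry."""
    if raw is None:
        return None
    try:
        return bool(json.loads(raw)["is_public"])
    except (ValueError, KeyError, TypeError):
        # Treat a corrupt entry as a miss so the caller falls back to the API.
        return None
```

Treating an unreadable entry as a miss keeps the fallback path to the GitHub API simple: the caller only has to distinguish `None` from a boolean.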

New Methods Added

_fetch_repository_visibility_sync

  • Synchronous GitHub API call for repository visibility check
  • Returns boolean indicating if repository is public

_get_cache_value_sync and _store_cache_value_sync

  • Redis cache operations using synchronous methods
  • Integrated with async executor pattern

subscribe_to_repository_webhook

  • Automatically subscribes to GitHub webhooks for repository visibility changes
  • Listens to public and repository events
  • Webhook endpoint: https://potpie.ai/api/v1/github/webhook
  • Handles duplicate webhook scenarios
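The request body and the status handling can be sketched as two pure functions. The payload mirrors the configuration quoted in the review below. The function names are hypothetical, and the check for an "already exists" message inside a 422 response is an assumption about the error body, added because a 422 can also signal other validation failures.

```python
def build_webhook_payload(webhook_url: str) -> dict:
    """Request body for creating a repository webhook, as configured in the PR."""
    return {
        "name": "web",
        "active": True,
        "events": ["public", "repository"],
        "config": {
            "url": webhook_url,
            "content_type": "json",
            "insecure_ssl": "0",
        },
    }


def classify_hook_response(status_code: int, body: dict) -> str:
    """Map a create-hook response to 'created', 'exists', or 'failed'."""
    if status_code == 201:
        return "created"
    if status_code == 422:
        # Assumption: a duplicate hook is reported in the 'errors' list with
        # a message containing "already exists".
        errors = body.get("errors") or []
        messages = [str(e.get("message", "")) for e in errors if isinstance(e, dict)]
        if any("already exists" in m.lower() for m in messages):
            return "exists"
    return "failed"
```

Compared with treating every 422 as a duplicate, this variant reports an invalid URL or a malformed payload as a failure instead of silently logging success.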

clear_repository_cache

  • Method to clear cache for specific repository
  • Used for cache invalidation scenarios

Technical Details

  • Uses thread executor for Redis operations in async context
  • JSON serialization for cache data structure
  • Comprehensive error handling and logging
  • Cache TTL configuration support
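The executor pattern mentioned above might look roughly like this. `InMemoryCache` is a stand-in for a synchronous Redis client and only imitates `get` and `setex`; the wrapper names `cache_get` and `cache_set` are hypothetical. The one-week TTL comes from the sequence diagram in the review.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

ONE_WEEK_SECONDS = 7 * 24 * 3600  # TTL shown in the sequence diagram


class InMemoryCache:
    """Stand-in for a synchronous Redis client (get/setex only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def setex(self, key, ttl_seconds, value):
        # The TTL is ignored here; a real Redis client would expire the key.
        self._data[key] = value


executor = ThreadPoolExecutor(max_workers=4)


async def cache_get(cache, key):
    """Run the blocking read in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, cache.get, key)


async def cache_set(cache, key, value, ttl_seconds=ONE_WEEK_SECONDS):
    """Run the blocking write in a worker thread."""
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(executor, cache.setex, key, ttl_seconds, value)


async def demo():
    cache = InMemoryCache()
    await cache_set(cache, "repo_visibility:octo/demo", '{"is_public": true}')
    return await cache_get(cache, "repo_visibility:octo/demo")
```

Running `asyncio.run(demo())` stores one entry and reads it back through the thread pool.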

Flow

  1. Check Redis cache for repository visibility
  2. If cache miss, call GitHub API
  3. Store result in Redis with TTL
  4. Subscribe to repository webhook
  5. Return visibility status
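The five steps can be sketched as a cache-aside function. This is an illustration under stated assumptions, not the PR's code: `TTLCache` replaces Redis, and the `fetch_visibility` and `subscribe` callables are injected so the example runs without network access. The subscription is placed on the cache-miss path, as in the sequence diagram from the review.

```python
import asyncio
import json
import time


class TTLCache:
    """Tiny in-memory cache with per-key expiry (stand-in for Redis)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


async def check_public_repo(repo_name, cache, fetch_visibility, subscribe,
                            ttl_seconds=7 * 24 * 3600):
    key = f"repo_visibility:{repo_name}"
    raw = cache.get(key)                                # 1. check the cache
    if raw is not None:
        return json.loads(raw)["is_public"]             #    cache hit
    is_public = await fetch_visibility(repo_name)       # 2. call the API on a miss
    payload = json.dumps({"is_public": is_public, "repo_name": repo_name})
    cache.setex(key, ttl_seconds, payload)              # 3. store with a TTL
    await subscribe(repo_name)                          # 4. subscribe to the webhook
    return is_public                                    # 5. return the status


async def demo():
    calls = {"api": 0, "hooks": 0}

    async def fake_fetch(repo_name):
        calls["api"] += 1
        return True

    async def fake_subscribe(repo_name):
        calls["hooks"] += 1
        return True

    cache = TTLCache()
    first = await check_public_repo("octo/demo", cache, fake_fetch, fake_subscribe)
    second = await check_public_repo("octo/demo", cache, fake_fetch, fake_subscribe)
    return first, second, calls
```

In the demo, the second call is served from the cache, so the fake API and the fake subscription are each invoked once.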

Testing

  • Tested with LocalTunnel for webhook endpoint accessibility
  • Verified cache operations and webhook subscription

Next Steps

A follow-up PR will implement a webhook event handler that processes incoming visibility-change notifications and invalidates the cache accordingly.
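One possible shape for that handler, offered only as a sketch since the follow-up PR is not part of this conversation. It assumes the event name is read from the `X-GitHub-Event` header, that the payload carries `repository.full_name`, and that the `repository` event uses the `publicized` and `privatized` actions for visibility changes. A plain dict stands in for Redis, and both function names are hypothetical.

```python
VISIBILITY_ACTIONS = {"publicized", "privatized"}


def cache_key_to_invalidate(event_name: str, payload: dict):
    """Return the cache key affected by this event, or None if unaffected."""
    repository = payload.get("repository") or {}
    full_name = repository.get("full_name")
    if not full_name:
        return None
    if event_name == "public":
        return f"repo_visibility:{full_name}"
    if event_name == "repository" and payload.get("action") in VISIBILITY_ACTIONS:
        return f"repo_visibility:{full_name}"
    return None


def handle_event(event_name: str, payload: dict, cache: dict) -> bool:
    """Delete the cached entry if the event changes visibility."""
    key = cache_key_to_invalidate(event_name, payload)
    if key is None:
        return False
    cache.pop(key, None)
    return True
```

Events such as a rename or a transfer would leave a stale key under the old name; whether to handle those is a design decision for the follow-up work.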

Summary by CodeRabbit

  • New Features
    • Improved efficiency of GitHub repository visibility checks by introducing caching, reducing repeated API calls.
    • Repository visibility status is now kept up-to-date through automatic webhook subscriptions, enabling event-driven updates.
    • Added ability to clear cached repository visibility information for more accurate results.

- Add subscribe_to_repository_webhook method to automatically subscribe to webhooks when caching
- Add clear_repository_cache method for cache invalidation
- Update check_public_repo to subscribe to webhooks after caching
- Use potpie.ai as webhook URL base

coderabbitai bot commented Jul 19, 2025

Walkthrough

The changes add a Redis-based caching mechanism and webhook subscription process to the GitHub repository visibility check in the GithubService class. The check_public_repo method now caches visibility status, subscribes to webhooks for visibility changes, and provides methods to manage cache and webhook integration.

Changes

File(s) | Change Summary
app/modules/code_provider/github/github_service.py | Added async caching to check_public_repo, webhook subscription for repo visibility, cache helpers, and cache clearing method.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant GithubService
    participant Redis
    participant GitHubAPI

    Client->>+GithubService: check_public_repo(repo_name)
    GithubService->>Redis: Get visibility status from cache
    alt Cache hit
        Redis-->>GithubService: Return cached status
        GithubService-->>Client: Return status
    else Cache miss
        GithubService->>GitHubAPI: Fetch repository visibility
        GitHubAPI-->>GithubService: Return visibility
        GithubService->>Redis: Store visibility status (TTL 1 week)
        GithubService->>GithubService: subscribe_to_repository_webhook(repo_name)
        GithubService-->>Client: Return status
    end

Poem

A cache now guards the repo’s face,
With Redis speed, it finds its place.
Webhooks listen, quick to act,
On public flips, they stay intact.
Code hops forward, swift and bright—
A rabbit’s joy at features right!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6323b2b and 273ae00.

📒 Files selected for processing (1)
  • app/modules/code_provider/github/github_service.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/modules/code_provider/github/github_service.py

@uphargaur
Author

@dhirenmathur please review this pr

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
app/modules/code_provider/github/github_service.py (1)

772-890: Consider adding rate limiting and monitoring

The caching implementation looks good overall. For production readiness, consider:

  1. Rate limiting: Add rate limiting for webhook subscription calls to avoid hitting GitHub API limits
  2. Monitoring: Add metrics for cache hit/miss rates to monitor cache effectiveness
  3. Cache warming: Consider implementing a cache warming strategy for frequently accessed repositories
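For the monitoring suggestion, a hit/miss counter is probably the smallest useful starting point. The `CacheStats` class below is hypothetical and keeps its counts in process memory; a production setup would more likely export them to a metrics system.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """In-process counters for cache effectiveness."""

    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Calling `stats.record(raw is not None)` right after the cache lookup is enough to track the ratio over time.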
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 213eaf0 and 6323b2b.

📒 Files selected for processing (1)
  • app/modules/code_provider/github/github_service.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
app/modules/code_provider/github/github_service.py (6)
app/modules/intelligence/tools/web_tools/github_update_branch.py (1)
  • get_public_github_instance (62-66)
app/modules/intelligence/tools/web_tools/github_add_pr_comment.py (1)
  • get_public_github_instance (91-95)
app/modules/intelligence/tools/web_tools/github_create_branch.py (1)
  • get_public_github_instance (59-63)
app/modules/intelligence/tools/web_tools/github_create_pr.py (1)
  • get_public_github_instance (71-75)
app/modules/intelligence/tools/web_tools/github_tool.py (1)
  • get_public_github_instance (105-109)
app/modules/code_provider/code_provider_service.py (1)
  • get_repo (19-20)
🪛 Ruff (0.12.2)
app/modules/code_provider/github/github_service.py

838-838: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (5)
app/modules/code_provider/github/github_service.py (5)

2-2: LGTM!

The new imports are appropriate for the caching functionality.

Also applies to: 8-8


32-35: LGTM!

The cache configuration constants are well-defined with an appropriate TTL.


810-826: LGTM!

The synchronous helper methods are correctly implemented for use with the async executor pattern.


879-890: LGTM!

The cache clearing method is correctly implemented with proper error handling.


772-809: Optimize webhook subscription to avoid redundant calls

The webhook subscription is called on every request, even when data is served from cache. This could lead to unnecessary API calls and potential rate limiting issues.

Move the webhook subscription inside the cache miss block:

            # Call GitHub API if not cached
            is_public = await asyncio.get_event_loop().run_in_executor(
                self.executor, self._fetch_repository_visibility_sync, repo_name
            )
            
            # Cache result with repository name as key
            cache_data = {
                "is_public": is_public,
                "cached_at": datetime.utcnow().isoformat(),
                "repo_name": repo_name
            }
            await asyncio.get_event_loop().run_in_executor(
                self.executor, self._store_cache_value_sync, cache_key, json.dumps(cache_data)
            )
            
-            # Subscribe to webhook for this repository
-            await self.subscribe_to_repository_webhook(repo_name)
+            # Subscribe to webhook for this repository only on cache miss
+            await self.subscribe_to_repository_webhook(repo_name)
            
            return is_public

Likely an incorrect or invalid review comment.

Comment on lines 827 to 878
async def subscribe_to_repository_webhook(self, repo_name: str):
    """Subscribe to repository webhook for visibility change notifications"""
    try:
        # Extract owner and repo from repo_name
        if '/' not in repo_name:
            logger.error(f"Invalid repository name format: {repo_name}")
            return False

        owner, repo = repo_name.split('/', 1)

        # Webhook configuration
        webhook_url = f"https://potpie.ai/api/v1/github/webhook"
        webhook_config = {
            "name": "web",
            "active": True,
            "events": ["public", "repository"],
            "config": {
                "url": webhook_url,
                "content_type": "json",
                "insecure_ssl": "0"
            }
        }

        # Get GitHub token for API access
        github = self.get_public_github_instance()

        # Create webhook via GitHub API
        api_url = f"https://api.github.com/repos/{owner}/{repo}/hooks"
        headers = {
            "Authorization": f"token {github._Github__requester._Requester__auth.token}",
            "Accept": "application/vnd.github.v3+json",
            "Content-Type": "application/json"
        }

        response = requests.post(api_url, json=webhook_config, headers=headers)

        if response.status_code == 201:
            webhook_data = response.json()
            logger.info(f"Successfully subscribed to webhook for {repo_name}. Webhook ID: {webhook_data.get('id')}")
            return True
        elif response.status_code == 422:
            # Webhook already exists
            logger.info(f"Webhook already exists for {repo_name}")
            return True
        else:
            logger.error(f"Failed to subscribe to webhook for {repo_name}. Status: {response.status_code}, Response: {response.text}")
            return False

    except Exception as e:
        logger.error(f"Error subscribing to webhook for {repo_name}: {e}")
        return False

💡 Verification agent

🧩 Analysis chain

Consider making webhook URL configurable and avoid accessing private attributes

  1. The webhook URL is hardcoded, which could cause issues in different environments (dev/staging/prod).
  2. Accessing private GitHub library attributes (github._Github__requester._Requester__auth.token) is fragile and could break with library updates.

Consider these improvements:

+            # Get webhook URL from configuration
+            webhook_url = config_provider.get_github_webhook_url() or "https://potpie.ai/api/v1/github/webhook"
             webhook_config = {
                 "name": "web",
                 "active": True,
                 "events": ["public", "repository"],
                 "config": {
                     "url": webhook_url,
                     "content_type": "json",
                     "insecure_ssl": "0"
                 }
             }

For the token access, check if there's a public API method available:


🏁 Script executed:

#!/bin/bash
# Description: Search for public methods to get the GitHub token from the PyGithub library

# Search for token-related methods in the codebase
rg -A 5 "get.*token|auth.*token" --type py

# Search for how GitHub authentication is used elsewhere in the codebase
ast-grep --pattern 'Github($$$)'

Length of output: 18660


🏁 Script executed:

#!/bin/bash
rg "def get_public_github_instance" -n app/modules/code_provider/github/github_service.py -A5

Length of output: 309


Refactor webhook URL and token handling
The hardcoded webhook URL and private-token access are brittle. Please:

• Make the webhook URL configurable (e.g. via environment or injected settings).
• Avoid poking into github._Github__requester._Requester__auth.token. Instead, expose the token alongside the Github instance or provide a helper:

# e.g. in GitHubService
@classmethod
def get_public_github_instance_and_token(cls) -> tuple[Github, str]:
    if not cls.gh_token_list:
        cls.initialize_tokens()
    token = random.choice(cls.gh_token_list)
    return Github(token), token

Then in subscribe_to_repository_webhook:

- github = self.get_public_github_instance()
- headers = {
-     "Authorization": f"token {github._Github__requester._Requester__auth.token}",
+ github, token = self.get_public_github_instance_and_token()
+ headers = {
+     "Authorization": f"token {token}",
      "Accept": "application/vnd.github.v3+json",
      "Content-Type": "application/json"
  }

And load the webhook URL from your configuration provider, falling back to the existing default.
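A sketch of the configurable URL with a fallback, assuming an environment variable is the configuration source. The variable name `GITHUB_WEBHOOK_URL` and both helper names are hypothetical; the default is the URL from the PR. The header builder takes the token as an argument, so no private PyGithub attributes are read.

```python
import os

DEFAULT_WEBHOOK_URL = "https://potpie.ai/api/v1/github/webhook"  # default from the PR


def get_webhook_url(env=None) -> str:
    """Read the webhook URL from the environment, falling back to the default."""
    env = os.environ if env is None else env
    # GITHUB_WEBHOOK_URL is a hypothetical variable name chosen for this sketch.
    return env.get("GITHUB_WEBHOOK_URL") or DEFAULT_WEBHOOK_URL


def build_auth_headers(token: str) -> dict:
    """Build request headers from an explicitly passed token."""
    return {
        "Authorization": f"token {token}",
        "Accept": "application/vnd.github.v3+json",
        "Content-Type": "application/json",
    }
```

Passing `env` explicitly makes the lookup easy to exercise in tests without modifying the process environment.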

🤖 Prompt for AI Agents
In app/modules/code_provider/github/github_service.py around lines 827 to 878,
the webhook URL is hardcoded and the GitHub token is accessed via private
attributes, which is fragile. Refactor by making the webhook URL configurable
through environment variables or injected settings with a fallback default.
Modify the class to provide a method that returns both the Github instance and
its token together, avoiding direct access to private attributes. Update
subscribe_to_repository_webhook to use this new method to get the token and
Github instance, and use the configurable webhook URL instead of the hardcoded
one.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@sonarqubecloud

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

@uphargaur
Author

@dhirenmathur please check
