-
Notifications
You must be signed in to change notification settings - Fork 484
fix: Validate Repository Name Format in get_branch_list() #394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughA new private method Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GithubService
Client->>GithubService: get_branch_list(repo_name)
GithubService->>GithubService: _validate_repo_name(repo_name)
alt Validation passes
GithubService-->>Client: Proceed with branch retrieval
else Validation fails
GithubService-->>Client: Exception propagated
end
Client->>GithubService: check_public_repo(repo_name)
GithubService->>GithubService: _validate_repo_name(repo_name)
alt Validation passes
GithubService-->>Client: Proceed with repo check
else Validation fails
GithubService->>GithubService: Log error
GithubService-->>Client: HTTP 400 error with message
end
Assessment against linked issues
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/modules/code_provider/github/github_service.py (1)
781-789: Improve exception handling with exception chaining.The exception handling is good, but would benefit from using exception chaining to preserve the original exception context.
except ValueError as ve: logger.error(f"Invalid repository name format: {str(ve)}") - raise HTTPException(status_code=400, detail=str(ve)) + raise HTTPException(status_code=400, detail=str(ve)) from veThis preserves the original exception traceback which helps with debugging.
🧰 Tools
🪛 Ruff (0.8.2)
789-789: Within an
exceptclause, raise exceptions withraise ... from errorraise ... from Noneto distinguish them from errors in exception handling(B904)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/modules/code_provider/github/github_service.py(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
app/modules/code_provider/github/github_service.py (2)
app/modules/intelligence/tools/web_tools/github_add_pr_comment.py (1)
get_public_github_instance(91-95)app/modules/code_provider/local_repo/local_repo_service.py (1)
get_repo(26-31)
🪛 Ruff (0.8.2)
app/modules/code_provider/github/github_service.py
789-789: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (2)
app/modules/code_provider/github/github_service.py (2)
54-83: Well-designed repository name validation implementation.The new
_validate_repo_namemethod is well-structured and comprehensive. It properly handles both local paths and GitHub repositories, with appropriate validation of the owner/repo format according to GitHub conventions.The validation checks for:
- Non-empty string inputs
- Local directory paths
- Proper "owner/repo" format for GitHub repositories
- Character restrictions and length limits for each segment
- Proper formatting (no leading/trailing dashes, no consecutive dashes)
This is a robust implementation that will help prevent issues caused by malformed repository names.
501-503: LGTM - Proper validation placement.Adding validation at the beginning of the method ensures invalid inputs are rejected early before any resource-intensive operations are performed.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/modules/code_provider/tests/github_servicet_test_.py (2)
1-1: Remove unused imports.The
osandunittest.mock.patchimports are not being used in this test file. Whileosmight be indirectly used throughtmp_pathin pytest, there's no direct usage in your code.-import os import pytest -from unittest.mock import Mock, patch +from unittest.mock import Mock from fastapi import HTTPExceptionAlso applies to: 3-3
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
osimported but unusedRemove unused import:
os(F401)
55-61: Clarify local repository path validation bypass.The test suggests that local repository paths bypass validation entirely. Consider adding comments in both the test and the implementation to explain this design decision, as it might not be immediately obvious why local paths are exempt from the format validation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/modules/code_provider/tests/github_servicet_test_.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/modules/code_provider/tests/github_servicet_test_.py
1-1: os imported but unused
Remove unused import: os
(F401)
3-3: unittest.mock.patch imported but unused
Remove unused import: unittest.mock.patch
(F401)
🔇 Additional comments (1)
app/modules/code_provider/tests/github_servicet_test_.py (1)
17-86: Extensive test coverage for repository name validation.The test cases for the repository name validation are comprehensive, covering various validation scenarios:
- Valid repository formats (owner/repo with alphanumeric characters, dashes, and underscores)
- Invalid formats (empty strings, malformed paths, invalid characters)
- Edge cases (dash placement, consecutive dashes, length constraints)
- Type validation (non-string inputs)
- Integration with public methods and proper exception handling
This thorough approach helps ensure the validation logic is robust and correctly integrated.
| @pytest.mark.asyncio | ||
| async def test_get_branch_list_validation(self, github_service): | ||
| """Test that get_branch_list properly handles validation errors""" | ||
| with pytest.raises(HTTPException) as exc_info: | ||
| await github_service.get_branch_list("invalid-repo") | ||
|
|
||
| assert exc_info.value.status_code == 404 | ||
| assert "Repository not found or error fetching branches" in str(exc_info.value.detail) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Inconsistent error handling between validation methods.
There's an inconsistency between the error handling in check_public_repo (400 status code with validation error message) and get_branch_list (404 status code with "Repository not found" message). If both methods use the same validation logic, consider standardizing the error responses for consistency unless there's a specific reason for this difference.
🏁 Script executed:
#!/bin/bash
# Let's check the implementation of both methods to understand the inconsistency
echo "Checking implementation of check_public_repo and get_branch_list methods:"
rg -A 15 -B 5 "async def (check_public_repo|get_branch_list)" --type py app/modules/code_provider/github/github_service.pyLength of output: 1760
🏁 Script executed:
#!/bin/bash
# Extract the full implementation of get_branch_list
echo "===== get_branch_list implementation ====="
rg -A 200 "async def get_branch_list" --type py app/modules/code_provider/github/github_service.py
# Extract the implementation of _validate_repo_name
echo -e "\n===== _validate_repo_name implementation ====="
rg -A 5 -B 5 "def _validate_repo_name" --type py app/modules/code_provider/github/github_service.pyLength of output: 8500
Standardize validation error handling in get_branch_list
Both check_public_repo and get_branch_list use the same _validate_repo_name logic, but invalid repo names currently yield a 400 in one method and a 404 in the other. Let’s align their behavior by explicitly catching ValueError in get_branch_list and returning a 400 for validation failures:
• File: app/modules/code_provider/github/github_service.py
Update get_branch_list to handle ValueError before the generic exception:
async def get_branch_list(self, repo_name: str):
try:
self._validate_repo_name(repo_name)
…
except HTTPException as he:
raise he
+ except ValueError as ve:
+ # Invalid repo-name format should be a 400, not a 404
+ raise HTTPException(status_code=400, detail=str(ve))
except Exception as e:
logger.error(
f"Error fetching branches for repo {repo_name}: {str(e)}", exc_info=True
)
raise HTTPException(
status_code=404,
detail=f"Repository not found or error fetching branches: {str(e)}",
)This ensures malformed names consistently return HTTP 400, while missing or inaccessible repos continue to return HTTP 404.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @pytest.mark.asyncio | |
| async def test_get_branch_list_validation(self, github_service): | |
| """Test that get_branch_list properly handles validation errors""" | |
| with pytest.raises(HTTPException) as exc_info: | |
| await github_service.get_branch_list("invalid-repo") | |
| assert exc_info.value.status_code == 404 | |
| assert "Repository not found or error fetching branches" in str(exc_info.value.detail) | |
| async def get_branch_list(self, repo_name: str): | |
| try: | |
| self._validate_repo_name(repo_name) | |
| # … existing logic to fetch/return branches … | |
| except HTTPException as he: | |
| raise he | |
| except ValueError as ve: | |
| # Invalid repo-name format should be a 400, not a 404 | |
| raise HTTPException(status_code=400, detail=str(ve)) | |
| except Exception as e: | |
| logger.error( | |
| f"Error fetching branches for repo {repo_name}: {str(e)}", exc_info=True | |
| ) | |
| raise HTTPException( | |
| status_code=404, | |
| detail=f"Repository not found or error fetching branches: {str(e)}", | |
| ) |
dhirenmathur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your PR!
This only handles local repos not remote ones, please update the logic



Description
get_branch_list()andcheck_public_repo()Summary by CodeRabbit