fix: Validate Repository Name Format in get_branch_list() #394

harshit078 · 2025-04-26T21:02:33Z

Description

This PR fixes "Validate Repository Name Format in get_branch_list() and check_public_repo()" (#354).
Added a validation pattern for the repo_name parameter in get_branch_list() and check_public_repo().
Added better error handling for validation pattern checks.

Summary by CodeRabbit

Bug Fixes
- Improved validation of GitHub repository names to prevent errors with invalid formats.
- Enhanced error messages for invalid repository names, providing clearer feedback when input does not meet required criteria.
Tests
- Added comprehensive tests to verify repository name validation and error handling for GitHub service methods.

coderabbitai · 2025-04-26T21:02:39Z

Walkthrough

A new private method _validate_repo_name has been introduced to the GithubService class to enforce repository name validation according to GitHub's conventions. This method checks if the repository name is a non-empty string and matches the required "owner/repo" format, applying additional constraints to each segment. The validation is now invoked at the start of both get_branch_list and check_public_repo methods. In check_public_repo, validation errors are caught and returned as HTTP 400 errors, while in get_branch_list, exceptions are propagated normally. No other logic or control flow changes were made.

Changes

File(s)	Change Summary
app/modules/code_provider/github/github_service.py	Added `_validate_repo_name` method to validate repository names. Integrated validation into `get_branch_list` and `check_public_repo` methods, with error handling in `check_public_repo`.
app/modules/code_provider/tests/github_servicet_test_.py	Added `TestGithubServiceValidation` test class with tests for valid/invalid repo names, local path bypass, error handling in async methods, and non-string input validation. Added pytest fixture for `GithubService`.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant GithubService

    Client->>GithubService: get_branch_list(repo_name)
    GithubService->>GithubService: _validate_repo_name(repo_name)
    alt Validation passes
        GithubService-->>Client: Proceed with branch retrieval
    else Validation fails
        GithubService-->>Client: Exception propagated
    end

    Client->>GithubService: check_public_repo(repo_name)
    GithubService->>GithubService: _validate_repo_name(repo_name)
    alt Validation passes
        GithubService-->>Client: Proceed with repo check
    else Validation fails
        GithubService->>GithubService: Log error
        GithubService-->>Client: HTTP 400 error with message
    end

Assessment against linked issues

Objective	Addressed	Explanation
Implement validation in get_branch_list() and check_public_repo() (#354)	✅
Raise ValueError with descriptive message for invalid repo_name (#354)	✅
Error handling: HTTP 400 for invalid repo_name in check_public_repo() (#354)	✅
Unit tests for valid/invalid repo names (#354)	✅
Documentation/docstring updates for validation logic (#354)	❓	No mention of documentation or docstring changes in the provided summary.

Poem

In the codey warren, I hop with glee,
Validating repos—just as they should be!
No dashes astray, no slashes amiss,
"owner/repo" is the format, I insist!
If you wander from the rules, I’ll kindly say,
“Bad Request, dear coder, please try the right way!”
🐇✨

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

app/modules/code_provider/github/github_service.py (1)
781-789: Improve exception handling with exception chaining.

The exception handling is good, but would benefit from using exception chaining to preserve the original exception context.
        except ValueError as ve:
            logger.error(f"Invalid repository name format: {str(ve)}")
-           raise HTTPException(status_code=400, detail=str(ve))
+           raise HTTPException(status_code=400, detail=str(ve)) from ve
This preserves the original exception traceback which helps with debugging.
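To illustrate what the chaining buys, here is a small self-contained sketch. The `HTTPException` class is a stand-in so the example runs without FastAPI, and `check_repo` is a hypothetical function, not code from the PR.

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption for this sketch)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def check_repo(repo_name: str) -> None:
    """Hypothetical caller that converts a validation failure into a 400."""
    try:
        if "/" not in repo_name:
            raise ValueError("Repository name must be in the format 'owner/repo'")
    except ValueError as ve:
        # "from ve" stores the ValueError on the new exception's __cause__,
        # so the traceback shows it as the direct cause.
        raise HTTPException(status_code=400, detail=str(ve)) from ve


try:
    check_repo("invalid-repo")
except HTTPException as exc:
    caught = exc
```

With `from ve`, the traceback reads "The above exception was the direct cause of the following exception" rather than "During handling of the above exception, another exception occurred", which is the distinction Ruff's B904 rule asks for.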

🧰 Tools

🪛 Ruff (0.8.2)

789-789: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de99181 and 81482af.

📒 Files selected for processing (1)

app/modules/code_provider/github/github_service.py (3 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (1)

app/modules/code_provider/github/github_service.py (2)

app/modules/intelligence/tools/web_tools/github_add_pr_comment.py (1)

get_public_github_instance (91-95)

app/modules/code_provider/local_repo/local_repo_service.py (1)

get_repo (26-31)

🪛 Ruff (0.8.2)

app/modules/code_provider/github/github_service.py

789-789: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (2)

app/modules/code_provider/github/github_service.py (2)

54-83: Well-designed repository name validation implementation.

The new _validate_repo_name method is well-structured and comprehensive. It properly handles both local paths and GitHub repositories, with appropriate validation of the owner/repo format according to GitHub conventions.

The validation checks for:
- Non-empty string inputs
- Local directory paths
- Proper "owner/repo" format for GitHub repositories
- Character restrictions and length limits for each segment
- Proper formatting (no leading/trailing dashes, no consecutive dashes)

This is a robust implementation that will help prevent issues caused by malformed repository names.

501-503: LGTM - Proper validation placement.

Adding validation at the beginning of the method ensures invalid inputs are rejected early before any resource-intensive operations are performed.

sonarqubecloud · 2025-04-26T22:02:14Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

app/modules/code_provider/tests/github_servicet_test_.py (2)
1-1: Remove unused imports.

The os and unittest.mock.patch imports are not being used in this test file. While os might be indirectly used through tmp_path in pytest, there's no direct usage in your code.
-import os
import pytest
-from unittest.mock import Mock, patch
+from unittest.mock import Mock
from fastapi import HTTPException
Also applies to: 3-3

🧰 Tools

🪛 Ruff (0.8.2)

1-1: os imported but unused

Remove unused import: os

(F401)

55-61: Clarify local repository path validation bypass.

The test suggests that local repository paths bypass validation entirely. Consider adding comments in both the test and the implementation to explain this design decision, as it might not be immediately obvious why local paths are exempt from the format validation.
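One way the bypass and its explanatory comment might look is sketched below. How the PR actually detects a local repository is not shown in this thread, so the `os.path.isdir` check and the function name `validate_repo_name_or_local_path` are assumptions made for illustration.

```python
import os


def validate_repo_name_or_local_path(repo_name: str) -> None:
    """Validate "owner/repo" names, letting local directories through."""
    # Assumption: an existing directory is a local repository. Filesystem
    # paths cannot satisfy the "owner/repo" format, so they skip that check.
    if os.path.isdir(repo_name):
        return

    owner, sep, repo = repo_name.partition("/")
    if not (owner and sep and repo) or "/" in repo:
        raise ValueError("Repository name must be in the format 'owner/repo'")
```

In a pytest test, the `tmp_path` fixture supplies a real directory, which is enough to exercise the bypass branch without mocking.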

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81482af and b3794b5.

📒 Files selected for processing (1)

app/modules/code_provider/tests/github_servicet_test_.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.8.2)

app/modules/code_provider/tests/github_servicet_test_.py

1-1: os imported but unused

Remove unused import: os

(F401)

3-3: unittest.mock.patch imported but unused

Remove unused import: unittest.mock.patch

(F401)

🔇 Additional comments (1)

app/modules/code_provider/tests/github_servicet_test_.py (1)

17-86: Extensive test coverage for repository name validation.

The test cases for the repository name validation are comprehensive, covering various validation scenarios:
- Valid repository formats (owner/repo with alphanumeric characters, dashes, and underscores)
- Invalid formats (empty strings, malformed paths, invalid characters)
- Edge cases (dash placement, consecutive dashes, length constraints)
- Type validation (non-string inputs)
- Integration with public methods and proper exception handling

This thorough approach helps ensure the validation logic is robust and correctly integrated.

coderabbitai · 2025-04-26T22:03:51Z

app/modules/code_provider/tests/github_servicet_test_.py

+    @pytest.mark.asyncio
+    async def test_get_branch_list_validation(self, github_service):
+        """Test that get_branch_list properly handles validation errors"""
+        with pytest.raises(HTTPException) as exc_info:
+            await github_service.get_branch_list("invalid-repo")
+
+        assert exc_info.value.status_code == 404
+        assert "Repository not found or error fetching branches" in str(exc_info.value.detail)
+


💡 Verification agent

🧩 Analysis chain

Inconsistent error handling between validation methods.

There's an inconsistency between the error handling in check_public_repo (400 status code with validation error message) and get_branch_list (404 status code with "Repository not found" message). If both methods use the same validation logic, consider standardizing the error responses for consistency unless there's a specific reason for this difference.

🏁 Script executed:

#!/bin/bash
# Let's check the implementation of both methods to understand the inconsistency
echo "Checking implementation of check_public_repo and get_branch_list methods:"
rg -A 15 -B 5 "async def (check_public_repo|get_branch_list)" --type py app/modules/code_provider/github/github_service.py

Length of output: 1760

🏁 Script executed:

#!/bin/bash
# Extract the full implementation of get_branch_list
echo "===== get_branch_list implementation ====="
rg -A 200 "async def get_branch_list" --type py app/modules/code_provider/github/github_service.py

# Extract the implementation of _validate_repo_name
echo -e "\n===== _validate_repo_name implementation ====="
rg -A 5 -B 5 "def _validate_repo_name" --type py app/modules/code_provider/github/github_service.py

Length of output: 8500

Standardize validation error handling in get_branch_list

Both check_public_repo and get_branch_list use the same _validate_repo_name logic, but invalid repo names currently yield a 400 in one method and a 404 in the other. Let’s align their behavior by explicitly catching ValueError in get_branch_list and returning a 400 for validation failures:

• File: app/modules/code_provider/github/github_service.py
Update get_branch_list to handle ValueError before the generic exception:

 async def get_branch_list(self, repo_name: str):
     try:
         self._validate_repo_name(repo_name)
         …
     except HTTPException as he:
         raise he
+    except ValueError as ve:
+        # Invalid repo-name format should be a 400, not a 404
+        raise HTTPException(status_code=400, detail=str(ve))
     except Exception as e:
         logger.error(
             f"Error fetching branches for repo {repo_name}: {str(e)}", exc_info=True
         )
         raise HTTPException(
             status_code=404,
             detail=f"Repository not found or error fetching branches: {str(e)}",
         )

This ensures malformed names consistently return HTTP 400, while missing or inaccessible repos continue to return HTTP 404.
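The 400-versus-404 split can be exercised end to end with a runnable sketch. Everything here is illustrative: `HTTPException` is a stand-in for FastAPI's class, and `_validate` and `fetch_branches` are hypothetical helpers replacing the real validator and GitHub lookup.

```python
import asyncio


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption for this sketch)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def _validate(repo_name: str) -> None:
    # Simplified format check standing in for _validate_repo_name.
    owner, sep, repo = repo_name.partition("/")
    if not (owner and sep and repo):
        raise ValueError("Repository name must be in the format 'owner/repo'")


async def fetch_branches(repo_name: str) -> list:
    # Hypothetical lookup that always fails, simulating a missing repository.
    raise LookupError(f"{repo_name} does not exist")


async def get_branch_list(repo_name: str) -> list:
    try:
        _validate(repo_name)
        return await fetch_branches(repo_name)
    except HTTPException:
        raise
    except ValueError as ve:
        # Malformed name: client error, reported as 400.
        raise HTTPException(status_code=400, detail=str(ve)) from ve
    except Exception as e:
        # Anything else is reported as "not found", matching the existing code.
        raise HTTPException(
            status_code=404,
            detail=f"Repository not found or error fetching branches: {e}",
        ) from e


def status_for(repo_name: str) -> int:
    """Run get_branch_list and return the resulting HTTP status code."""
    try:
        asyncio.run(get_branch_list(repo_name))
    except HTTPException as exc:
        return exc.status_code
    return 200
```

One caveat with this layout: a `ValueError` raised inside the fetch logic would also be reported as 400. Calling the validator in its own `try` block before the fetch would keep the two failure modes separate.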

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    @pytest.mark.asyncio
-    async def test_get_branch_list_validation(self, github_service):
-        """Test that get_branch_list properly handles validation errors"""
-        with pytest.raises(HTTPException) as exc_info:
-            await github_service.get_branch_list("invalid-repo")
-        assert exc_info.value.status_code == 404
-        assert "Repository not found or error fetching branches" in str(exc_info.value.detail)
+ async def get_branch_list(self, repo_name: str):
+     try:
+         self._validate_repo_name(repo_name)
+         # … existing logic to fetch/return branches …
+     except HTTPException as he:
+         raise he
+     except ValueError as ve:
+         # Invalid repo-name format should be a 400, not a 404
+         raise HTTPException(status_code=400, detail=str(ve))
+     except Exception as e:
+         logger.error(
+             f"Error fetching branches for repo {repo_name}: {str(e)}", exc_info=True
+         )
+         raise HTTPException(
+             status_code=404,
+             detail=f"Repository not found or error fetching branches: {str(e)}",
+         )

dhirenmathur

Thanks for your PR!
This only handles local repos, not remote ones. Please update the logic.

Added validation in repo_name parameter and improved error handling

81482af

coderabbitai bot reviewed Apr 26, 2025

Added unit test

b3794b5

coderabbitai bot reviewed Apr 26, 2025

dhirenmathur requested changes May 21, 2025
