Skip to content

Conversation

@DecentMakeover
Copy link

@DecentMakeover DecentMakeover commented Oct 1, 2025

Summary

This PR fixes #347 by enforcing email validation across all Shared Access APIs and adding comprehensive test coverage.

Changes

  • Add email validation to ShareChatService.remove_access()
  • Add 18 unit tests for ShareChatService, covering:
    • Valid and invalid email formats in share_chat and remove_access
    • Edge cases: empty emails, whitespace, malformed formats
    • Error handling: non-existent conversations and unauthorized access
  • Add 30+ unit tests for email_helper.is_valid_email()
  • Ensure consistent validation across all sharing endpoints

Test Coverage

The tests include:

  • Valid email formats (dots, plus signs, numbers, subdomains, etc.)
  • Invalid formats (missing @, domain, TLD, spaces, invalid characters, etc.)
  • Edge cases (very long emails, empty strings, whitespace-only, etc.)

Summary by CodeRabbit

  • Bug Fixes

    • Stricter email validation for sharing and access removal; invalid addresses now return clear 400 errors before any changes.
    • Ensures atomic operations: if any email is invalid, no updates are made.
  • Documentation

    • Expanded endpoint docs detailing email rules, error cases, and atomic behavior.
  • Tests

    • Comprehensive tests for valid/invalid email formats and share/remove access flows to ensure correct behavior and prevent partial updates.

This commit fixes potpie-ai#347 by enforcing email validation across Shared Access APIs
and adding comprehensive unit tests.

Changes:
- Add email validation to ShareChatService.remove_access()
- Add unit tests for ShareChatService (18 cases covering valid/invalid emails,
  empty/whitespace, malformed formats, non-existent conversations, and
  unauthorized access)
- Add unit tests for email_helper.is_valid_email() (30+ cases)
- Ensure consistent validation across all sharing endpoints
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 1, 2025

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Linked Issues Check ⚠️ Warning The pull request successfully adds a robust email validation function, integrates it into the remove_access method, updates API documentation, and provides comprehensive unit tests, but it does not modify the share_chat implementation to enforce validation as required by the linked issue. Please add pre-validation of email addresses to the share_chat method in ShareChatService so that share_chat also raises a 400 error on invalid inputs in accordance with the linked issue’s requirements.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title concisely describes the core code addition—email validation in remove_access—and notes the broadened test coverage without extraneous detail or ambiguity, making the change’s primary intent clear to reviewers.
Out of Scope Changes Check ✅ Passed All of the code changes directly relate to enforcing email validation, enhancing the validation utility, updating documentation, and expanding test coverage as specified in the issue, with no unrelated modifications introduced.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
app/modules/utils/tests/email_helper_test.py (1)

127-130: Consider validating None explicitly rather than relying on AttributeError.

The test expects AttributeError when None is passed, which tests an implementation detail (regex behavior on None) rather than the validation contract.

Consider either:

  • Adding explicit None handling in is_valid_email to return False
  • Documenting that is_valid_email expects a string and will raise AttributeError for None

If you want to handle None gracefully, apply this diff to app/modules/utils/email_helper.py:

 def is_valid_email(email: str) -> bool:
     """Simple regex-based email validation."""
+    if email is None:
+        return False
     pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
     return re.match(pattern, email) is not None

And update the test:

     def test_none_value(self):
         """Test None value handling."""
-        with pytest.raises(AttributeError):
-            is_valid_email(None)
+        assert is_valid_email(None) is False
app/modules/conversations/access/tests/access_service_test.py (1)

46-46: Consider using more explicit mock setup for query chains.

The pattern mock_db.query().filter_by().first.return_value = mock_conversation relies on MagicMock's auto-chaining. While this works, it doesn't verify that query(Conversation) receives the correct model class. For stricter unit testing, you could set up the chain explicitly:

mock_query = MagicMock()
mock_filter = MagicMock()
mock_query.filter_by.return_value = mock_filter
mock_filter.first.return_value = mock_conversation
mock_db.query.return_value = mock_query

However, the current approach is acceptable for unit tests focused on service logic rather than ORM interactions.

Also applies to: 161-161, 207-207

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26cb74b and 20da8e1.

📒 Files selected for processing (3)
  • app/modules/conversations/access/access_service.py (1 hunks)
  • app/modules/conversations/access/tests/access_service_test.py (1 hunks)
  • app/modules/utils/tests/email_helper_test.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
app/modules/utils/tests/email_helper_test.py (1)
app/modules/utils/email_helper.py (1)
  • is_valid_email (54-57)
app/modules/conversations/access/tests/access_service_test.py (2)
app/modules/conversations/access/access_service.py (4)
  • ShareChatService (18-139)
  • ShareChatServiceError (14-15)
  • share_chat (22-78)
  • remove_access (93-139)
app/modules/conversations/conversation/conversation_model.py (2)
  • Conversation (24-61)
  • Visibility (19-21)
app/modules/conversations/access/access_service.py (1)
app/modules/utils/email_helper.py (1)
  • is_valid_email (54-57)
🔇 Additional comments (8)
app/modules/utils/tests/email_helper_test.py (3)

8-46: LGTM!

The valid email test cases are comprehensive and cover common formats including special characters, subdomains, and various domain configurations.


50-86: LGTM!

The invalid email test cases correctly cover missing components, malformed syntax, and improper spacing.

Also applies to: 99-110


114-125: LGTM!

The edge case tests appropriately cover boundary conditions including very long emails, minimum valid format, comprehensive special characters, and invalid numeric/localhost patterns.

Also applies to: 132-138

app/modules/conversations/access/access_service.py (2)

108-113: LGTM! Email validation correctly prevents invalid input.

The validation loop properly checks each email before proceeding with removal logic and provides clear error messages.


93-113: Verify the validation order is intentional.

Email validation occurs after fetching the conversation (lines 97-106), meaning a 404 error for non-existent conversations takes precedence over a 400 error for invalid emails. This is consistent with share_chat (lines 29-37 fetch, then 52-57 validate), but you may want to validate emails first for faster failure on invalid input.

If you prefer to validate emails before the database query, apply this diff:

     async def remove_access(
         self, conversation_id: str, user_id: str, emails_to_remove: List[str]
     ) -> bool:
         """Remove access for specified emails from a conversation."""
+        # Validate all emails first
+        for email in emails_to_remove:
+            if not is_valid_email(email):
+                raise HTTPException(
+                    status_code=400, detail=f"Invalid email address: {email}"
+                )
+
         chat = (
             self.db.query(Conversation)
             .filter_by(id=conversation_id, user_id=user_id)
             .first()
         )
         if not chat:
             raise HTTPException(
                 status_code=404,
                 detail="Chat does not exist or you are not authorized to access it.",
             )
 
-        # Validate all emails first
-        for email in emails_to_remove:
-            if not is_valid_email(email):
-                raise HTTPException(
-                    status_code=400, detail=f"Invalid email address: {email}"
-                )
-

Note: This would require updating tests that expect 404 to be raised before 400.

app/modules/conversations/access/tests/access_service_test.py (3)

18-36: LGTM!

The fixtures are well-structured and provide appropriate mocks for testing the service layer in isolation.


38-194: LGTM!

The share_chat tests comprehensively cover valid scenarios, invalid email formats, edge cases (empty/whitespace), mixed inputs, public visibility, error cases (404), and duplicate handling. The assertion commit.assert_not_called() in line 136 correctly verifies atomic validation behavior.


195-342: LGTM!

The remove_access tests thoroughly cover valid removal, invalid email formats, edge cases (empty/whitespace), mixed inputs, error cases (404, no shared emails, email not in list), and atomic validation behavior (line 291: commit.assert_not_called()).

Comment on lines +87 to +97
def test_double_dots_in_domain(self):
"""Test email with consecutive dots in domain."""
assert is_valid_email("[email protected]") is False

def test_starting_with_dot(self):
"""Test email starting with dot."""
assert is_valid_email("[email protected]") is False

def test_ending_with_dot(self):
"""Test email ending with dot before @."""
assert is_valid_email("[email protected]") is False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify test expectations against the regex pattern.

The is_valid_email regex is ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$. This pattern:

  • Allows consecutive dots in the domain part (line 89: [email protected] may pass validation)
  • Allows a trailing dot before @ in the username (line 97: [email protected] may pass validation)

These tests may fail because the regex does not explicitly reject these patterns.

Run the following script to verify the actual behavior:


🏁 Script executed:

#!/bin/bash
# Test the actual behavior of is_valid_email with edge cases

python3 << 'EOF'
import re

def is_valid_email(email: str) -> bool:
    """Simple regex-based email validation."""
    pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    return re.match(pattern, email) is not None

# Test cases that may not behave as expected
test_cases = [
    ("[email protected]", "consecutive dots in domain"),
    ("[email protected]", "leading dot in username"),
    ("[email protected]", "trailing dot in username"),
]

for email, description in test_cases:
    result = is_valid_email(email)
    print(f"{email:30} ({description:35}) -> {result}")
EOF

Length of output: 230


Fix email validation or tests for dot-placement edge cases
The regex ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ currently accepts

  • [email protected]
  • [email protected]
  • [email protected]
    so the tests asserting these should be invalid will fail. Decide on the intended behavior and either:
  • Update the regex (e.g., add negative lookaheads to forbid consecutive dots in the domain and leading/trailing dots in the local part), or
  • Change these tests to assert True instead of False.
🤖 Prompt for AI Agents
In app/modules/utils/tests/email_helper_test.py around lines 87 to 97, the tests
expect emails with consecutive dots in the domain and leading/trailing dots in
the local part to be invalid but the current regex allows them; update the email
validation (not the tests) so it rejects these cases by changing the regex to
disallow consecutive dots anywhere in the domain and to forbid a local part that
starts or ends with a dot (e.g., add negative lookaheads to prevent "\.\." and
require the local part to not begin or end with a dot), apply that regex in the
is_valid_email function, and rerun tests to confirm the three failing cases are
now False.

Address CodeRabbit review feedback:
- Fixed regex to reject consecutive dots in domain (e.g., [email protected])
- Fixed regex to reject leading/trailing dots in local part (e.g., [email protected], [email protected])
- Added graceful None handling (returns False instead of raising AttributeError)
- Added comprehensive API documentation for share_chat and remove_access endpoints
- Updated tests to expect False for None input instead of AttributeError

Result:
✅ All 30 email validation tests now pass successfully.
@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
B Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20da8e1 and d907233.

📒 Files selected for processing (3)
  • app/modules/conversations/conversations_router.py (2 hunks)
  • app/modules/utils/email_helper.py (1 hunks)
  • app/modules/utils/tests/email_helper_test.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • app/modules/conversations/conversations_router.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/modules/utils/tests/email_helper_test.py
🔇 Additional comments (2)
app/modules/utils/email_helper.py (2)

54-61: Helpful docstring, but consider noting validation limitations.

The docstring clearly documents the validation rules. However, it might be worth noting that this is a simplified validation (not full RFC 5322 compliance) to set appropriate expectations for callers.


62-63: Good defensive check for None input.

The None check prevents potential AttributeError exceptions and makes the function more robust.

Comment on lines +65 to +77
# Regex pattern that:
# - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.])
# - Ensures domain doesn't have consecutive dots: checked via negative lookahead
# - Requires proper TLD of at least 2 characters
pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

if not re.match(pattern, email):
return False

# Additional check: domain part should not contain consecutive dots
domain = email.split('@')[1] if '@' in email else ''
if '..' in domain:
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Regex pattern allows invalid email formats.

The current pattern has several issues that allow invalid emails:

  1. Local part consecutive dots: The pattern prevents leading/trailing dots but allows consecutive dots in the middle (e.g., [email protected]), which violates RFC 5321.

  2. Domain label validation: The domain pattern [a-zA-Z0-9.-]+ allows domains to start/end with hyphens or dots (e.g., [email protected], [email protected]), which are invalid per DNS rules.

  3. Inconsistent consecutive dots check: Line 76 checks for consecutive dots in the domain but not in the local part.

Apply this diff to fix the validation:

 def is_valid_email(email: str) -> bool:
     """Simple regex-based email validation.
 
     Validates email format ensuring:
     - Local part (before @) doesn't start or end with a dot
+    - Local part doesn't contain consecutive dots
     - Domain part doesn't contain consecutive dots
     - Proper email structure with @ symbol and valid TLD
+    - Domain labels don't start or end with hyphens
     """
     if email is None:
         return False
 
-    # Regex pattern that:
-    # - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.])
-    # - Ensures domain doesn't have consecutive dots: checked via negative lookahead
-    # - Requires proper TLD of at least 2 characters
-    pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
+    # Stricter pattern ensuring:
+    # - Local part: valid chars with optional dots between segments (no consecutive dots)
+    # - Domain: labels start/end with alphanumeric, hyphens only in middle
+    # - TLD: at least 2 alpha characters
+    local_part = r"[a-zA-Z0-9_%+-]+(?:\.[a-zA-Z0-9_%+-]+)*"
+    domain_label = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?"
+    domain_part = f"{domain_label}(?:\\.{domain_label})*"
+    tld_part = r"\.[a-zA-Z]{2,}"
+    pattern = f"^{local_part}@{domain_part}{tld_part}$"
 
     if not re.match(pattern, email):
         return False
 
-    # Additional check: domain part should not contain consecutive dots
-    domain = email.split('@')[1] if '@' in email else ''
-    if '..' in domain:
-        return False
-
     return True

This improved pattern:

  • Prevents consecutive dots in both local and domain parts
  • Ensures domain labels start and end with alphanumeric characters
  • Allows hyphens only in the middle of domain labels
  • Eliminates the need for the post-regex consecutive dots check
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Regex pattern that:
# - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.])
# - Ensures domain doesn't have consecutive dots: checked via negative lookahead
# - Requires proper TLD of at least 2 characters
pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
if not re.match(pattern, email):
return False
# Additional check: domain part should not contain consecutive dots
domain = email.split('@')[1] if '@' in email else ''
if '..' in domain:
return False
def is_valid_email(email: str) -> bool:
"""Simple regex-based email validation.
Validates email format ensuring:
- Local part (before @) doesn't start or end with a dot
- Local part doesn't contain consecutive dots
- Domain part doesn't contain consecutive dots
- Proper email structure with @ symbol and valid TLD
- Domain labels don't start or end with hyphens
"""
if email is None:
return False
# Stricter pattern ensuring:
# - Local part: valid chars with optional dots between segments (no consecutive dots)
# - Domain: labels start/end with alphanumeric, hyphens only in middle
# - TLD: at least 2 alpha characters
local_part = r"[a-zA-Z0-9_%+-]+(?:\.[a-zA-Z0-9_%+-]+)*"
domain_label = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?"
domain_part = f"{domain_label}(?:\\.{domain_label})*"
tld_part = r"\.[a-zA-Z]{2,}"
pattern = f"^{local_part}@{domain_part}{tld_part}$"
if not re.match(pattern, email):
return False
return True
🤖 Prompt for AI Agents
In app/modules/utils/email_helper.py around lines 65 to 77, the current regex
allows consecutive dots in the local part and domain labels that start/end with
hyphens or dots and relies on a separate '..' check for the domain; replace the
pattern with a stricter full-match regex that (1) forbids consecutive dots in
the local part, (2) enforces domain labels to start and end with alphanumeric
characters and only allow internal hyphens, and (3) requires a TLD of at least
two letters; also use re.fullmatch instead of re.match and remove the separate
post-regex consecutive-dots domain check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enforce email validation for Shared Access apis

1 participant