-
Notifications
You must be signed in to change notification settings - Fork 484
Add email validation to remove_access and improve test coverage #453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add email validation to remove_access and improve test coverage #453
Conversation
This commit fixes potpie-ai#347 by enforcing email validation across Shared Access APIs and adding comprehensive unit tests. Changes: - Add email validation to ShareChatService.remove_access() - Add unit tests for ShareChatService (18 cases covering valid/invalid emails, empty/whitespace, malformed formats, non-existent conversations, and unauthorized access) - Add unit tests for email_helper.is_valid_email() (30+ cases) - Ensure consistent validation across all sharing endpoints
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/modules/utils/tests/email_helper_test.py (1)
127-130: Consider validating None explicitly rather than relying on AttributeError.The test expects
AttributeErrorwhenNoneis passed, which tests an implementation detail (regex behavior on None) rather than the validation contract.Consider either:
- Adding explicit None handling in
is_valid_emailto returnFalse- Documenting that
is_valid_emailexpects a string and will raiseAttributeErrorfor NoneIf you want to handle None gracefully, apply this diff to
app/modules/utils/email_helper.py:def is_valid_email(email: str) -> bool: """Simple regex-based email validation.""" + if email is None: + return False pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" return re.match(pattern, email) is not NoneAnd update the test:
def test_none_value(self): """Test None value handling.""" - with pytest.raises(AttributeError): - is_valid_email(None) + assert is_valid_email(None) is Falseapp/modules/conversations/access/tests/access_service_test.py (1)
46-46: Consider using more explicit mock setup for query chains.The pattern
mock_db.query().filter_by().first.return_value = mock_conversationrelies on MagicMock's auto-chaining. While this works, it doesn't verify thatquery(Conversation)receives the correct model class. For stricter unit testing, you could set up the chain explicitly:mock_query = MagicMock() mock_filter = MagicMock() mock_query.filter_by.return_value = mock_filter mock_filter.first.return_value = mock_conversation mock_db.query.return_value = mock_queryHowever, the current approach is acceptable for unit tests focused on service logic rather than ORM interactions.
Also applies to: 161-161, 207-207
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/modules/conversations/access/access_service.py(1 hunks)app/modules/conversations/access/tests/access_service_test.py(1 hunks)app/modules/utils/tests/email_helper_test.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
app/modules/utils/tests/email_helper_test.py (1)
app/modules/utils/email_helper.py (1)
is_valid_email(54-57)
app/modules/conversations/access/tests/access_service_test.py (2)
app/modules/conversations/access/access_service.py (4)
ShareChatService(18-139)ShareChatServiceError(14-15)share_chat(22-78)remove_access(93-139)app/modules/conversations/conversation/conversation_model.py (2)
Conversation(24-61)Visibility(19-21)
app/modules/conversations/access/access_service.py (1)
app/modules/utils/email_helper.py (1)
is_valid_email(54-57)
🔇 Additional comments (8)
app/modules/utils/tests/email_helper_test.py (3)
8-46: LGTM!The valid email test cases are comprehensive and cover common formats including special characters, subdomains, and various domain configurations.
50-86: LGTM!The invalid email test cases correctly cover missing components, malformed syntax, and improper spacing.
Also applies to: 99-110
114-125: LGTM!The edge case tests appropriately cover boundary conditions including very long emails, minimum valid format, comprehensive special characters, and invalid numeric/localhost patterns.
Also applies to: 132-138
app/modules/conversations/access/access_service.py (2)
108-113: LGTM! Email validation correctly prevents invalid input.The validation loop properly checks each email before proceeding with removal logic and provides clear error messages.
93-113: Verify the validation order is intentional.Email validation occurs after fetching the conversation (lines 97-106), meaning a 404 error for non-existent conversations takes precedence over a 400 error for invalid emails. This is consistent with
share_chat(lines 29-37 fetch, then 52-57 validate), but you may want to validate emails first for faster failure on invalid input.If you prefer to validate emails before the database query, apply this diff:
async def remove_access( self, conversation_id: str, user_id: str, emails_to_remove: List[str] ) -> bool: """Remove access for specified emails from a conversation.""" + # Validate all emails first + for email in emails_to_remove: + if not is_valid_email(email): + raise HTTPException( + status_code=400, detail=f"Invalid email address: {email}" + ) + chat = ( self.db.query(Conversation) .filter_by(id=conversation_id, user_id=user_id) .first() ) if not chat: raise HTTPException( status_code=404, detail="Chat does not exist or you are not authorized to access it.", ) - # Validate all emails first - for email in emails_to_remove: - if not is_valid_email(email): - raise HTTPException( - status_code=400, detail=f"Invalid email address: {email}" - ) -Note: This would require updating tests that expect 404 to be raised before 400.
app/modules/conversations/access/tests/access_service_test.py (3)
18-36: LGTM!The fixtures are well-structured and provide appropriate mocks for testing the service layer in isolation.
38-194: LGTM!The
share_chattests comprehensively cover valid scenarios, invalid email formats, edge cases (empty/whitespace), mixed inputs, public visibility, error cases (404), and duplicate handling. The assertioncommit.assert_not_called()in line 136 correctly verifies atomic validation behavior.
195-342: LGTM!The
remove_accesstests thoroughly cover valid removal, invalid email formats, edge cases (empty/whitespace), mixed inputs, error cases (404, no shared emails, email not in list), and atomic validation behavior (line 291:commit.assert_not_called()).
| def test_double_dots_in_domain(self): | ||
| """Test email with consecutive dots in domain.""" | ||
| assert is_valid_email("[email protected]") is False | ||
|
|
||
| def test_starting_with_dot(self): | ||
| """Test email starting with dot.""" | ||
| assert is_valid_email("[email protected]") is False | ||
|
|
||
| def test_ending_with_dot(self): | ||
| """Test email ending with dot before @.""" | ||
| assert is_valid_email("[email protected]") is False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify test expectations against the regex pattern.
The is_valid_email regex is ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$. This pattern:
- Allows consecutive dots in the domain part (line 89:
[email protected]may pass validation) - Allows a trailing dot before
@in the username (line 97:[email protected]may pass validation)
These tests may fail because the regex does not explicitly reject these patterns.
Run the following script to verify the actual behavior:
🏁 Script executed:
#!/bin/bash
# Test the actual behavior of is_valid_email with edge cases
python3 << 'EOF'
import re
def is_valid_email(email: str) -> bool:
"""Simple regex-based email validation."""
pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
return re.match(pattern, email) is not None
# Test cases that may not behave as expected
test_cases = [
("[email protected]", "consecutive dots in domain"),
("[email protected]", "leading dot in username"),
("[email protected]", "trailing dot in username"),
]
for email, description in test_cases:
result = is_valid_email(email)
print(f"{email:30} ({description:35}) -> {result}")
EOFLength of output: 230
Fix email validation or tests for dot-placement edge cases
The regex ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ currently accepts
[email protected][email protected][email protected]
so the tests asserting these should be invalid will fail. Decide on the intended behavior and either:- Update the regex (e.g., add negative lookaheads to forbid consecutive dots in the domain and leading/trailing dots in the local part), or
- Change these tests to assert
Trueinstead ofFalse.
🤖 Prompt for AI Agents
In app/modules/utils/tests/email_helper_test.py around lines 87 to 97, the tests
expect emails with consecutive dots in the domain and leading/trailing dots in
the local part to be invalid but the current regex allows them; update the email
validation (not the tests) so it rejects these cases by changing the regex to
disallow consecutive dots anywhere in the domain and to forbid a local part that
starts or ends with a dot (e.g., add negative lookaheads to prevent "\.\." and
require the local part to not begin or end with a dot), apply that regex in the
is_valid_email function, and rerun tests to confirm the three failing cases are
now False.
Address CodeRabbit review feedback: - Fixed regex to reject consecutive dots in domain (e.g., [email protected]) - Fixed regex to reject leading/trailing dots in local part (e.g., [email protected], [email protected]) - Added graceful None handling (returns False instead of raising AttributeError) - Added comprehensive API documentation for share_chat and remove_access endpoints - Updated tests to expect False for None input instead of AttributeError Result: ✅ All 30 email validation tests now pass successfully.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
app/modules/conversations/conversations_router.py(2 hunks)app/modules/utils/email_helper.py(1 hunks)app/modules/utils/tests/email_helper_test.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- app/modules/conversations/conversations_router.py
🚧 Files skipped from review as they are similar to previous changes (1)
- app/modules/utils/tests/email_helper_test.py
🔇 Additional comments (2)
app/modules/utils/email_helper.py (2)
54-61: Helpful docstring, but consider noting validation limitations.The docstring clearly documents the validation rules. However, it might be worth noting that this is a simplified validation (not full RFC 5322 compliance) to set appropriate expectations for callers.
62-63: Good defensive check for None input.The None check prevents potential AttributeError exceptions and makes the function more robust.
| # Regex pattern that: | ||
| # - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.]) | ||
| # - Ensures domain doesn't have consecutive dots: checked via negative lookahead | ||
| # - Requires proper TLD of at least 2 characters | ||
| pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" | ||
|
|
||
| if not re.match(pattern, email): | ||
| return False | ||
|
|
||
| # Additional check: domain part should not contain consecutive dots | ||
| domain = email.split('@')[1] if '@' in email else '' | ||
| if '..' in domain: | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regex pattern allows invalid email formats.
The current pattern has several issues that allow invalid emails:
-
Local part consecutive dots: The pattern prevents leading/trailing dots but allows consecutive dots in the middle (e.g.,
[email protected]), which violates RFC 5321. -
Domain label validation: The domain pattern
[a-zA-Z0-9.-]+allows domains to start/end with hyphens or dots (e.g.,[email protected],[email protected]), which are invalid per DNS rules. -
Inconsistent consecutive dots check: Line 76 checks for consecutive dots in the domain but not in the local part.
Apply this diff to fix the validation:
def is_valid_email(email: str) -> bool:
"""Simple regex-based email validation.
Validates email format ensuring:
- Local part (before @) doesn't start or end with a dot
+ - Local part doesn't contain consecutive dots
- Domain part doesn't contain consecutive dots
- Proper email structure with @ symbol and valid TLD
+ - Domain labels don't start or end with hyphens
"""
if email is None:
return False
- # Regex pattern that:
- # - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.])
- # - Ensures domain doesn't have consecutive dots: checked via negative lookahead
- # - Requires proper TLD of at least 2 characters
- pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
+ # Stricter pattern ensuring:
+ # - Local part: valid chars with optional dots between segments (no consecutive dots)
+ # - Domain: labels start/end with alphanumeric, hyphens only in middle
+ # - TLD: at least 2 alpha characters
+ local_part = r"[a-zA-Z0-9_%+-]+(?:\.[a-zA-Z0-9_%+-]+)*"
+ domain_label = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?"
+ domain_part = f"{domain_label}(?:\\.{domain_label})*"
+ tld_part = r"\.[a-zA-Z]{2,}"
+ pattern = f"^{local_part}@{domain_part}{tld_part}$"
if not re.match(pattern, email):
return False
- # Additional check: domain part should not contain consecutive dots
- domain = email.split('@')[1] if '@' in email else ''
- if '..' in domain:
- return False
-
return TrueThis improved pattern:
- Prevents consecutive dots in both local and domain parts
- Ensures domain labels start and end with alphanumeric characters
- Allows hyphens only in the middle of domain labels
- Eliminates the need for the post-regex consecutive dots check
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Regex pattern that: | |
| # - Ensures local part doesn't start/end with dot: (?![.])[a-zA-Z0-9._%+-]+(?<![.]) | |
| # - Ensures domain doesn't have consecutive dots: checked via negative lookahead | |
| # - Requires proper TLD of at least 2 characters | |
| pattern = r"^(?![.])[a-zA-Z0-9._%+-]+(?<![.])@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" | |
| if not re.match(pattern, email): | |
| return False | |
| # Additional check: domain part should not contain consecutive dots | |
| domain = email.split('@')[1] if '@' in email else '' | |
| if '..' in domain: | |
| return False | |
| def is_valid_email(email: str) -> bool: | |
| """Simple regex-based email validation. | |
| Validates email format ensuring: | |
| - Local part (before @) doesn't start or end with a dot | |
| - Local part doesn't contain consecutive dots | |
| - Domain part doesn't contain consecutive dots | |
| - Proper email structure with @ symbol and valid TLD | |
| - Domain labels don't start or end with hyphens | |
| """ | |
| if email is None: | |
| return False | |
| # Stricter pattern ensuring: | |
| # - Local part: valid chars with optional dots between segments (no consecutive dots) | |
| # - Domain: labels start/end with alphanumeric, hyphens only in middle | |
| # - TLD: at least 2 alpha characters | |
| local_part = r"[a-zA-Z0-9_%+-]+(?:\.[a-zA-Z0-9_%+-]+)*" | |
| domain_label = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?" | |
| domain_part = f"{domain_label}(?:\\.{domain_label})*" | |
| tld_part = r"\.[a-zA-Z]{2,}" | |
| pattern = f"^{local_part}@{domain_part}{tld_part}$" | |
| if not re.match(pattern, email): | |
| return False | |
| return True |
🤖 Prompt for AI Agents
In app/modules/utils/email_helper.py around lines 65 to 77, the current regex
allows consecutive dots in the local part and domain labels that start/end with
hyphens or dots and relies on a separate '..' check for the domain; replace the
pattern with a stricter full-match regex that (1) forbids consecutive dots in
the local part, (2) enforces domain labels to start and end with alphanumeric
characters and only allow internal hyphens, and (3) requires a TLD of at least
two letters; also use re.fullmatch instead of re.match and remove the separate
post-regex consecutive-dots domain check.




Summary
This PR fixes #347 by enforcing email validation across all Shared Access APIs and adding comprehensive test coverage.
Changes
ShareChatService.remove_access()ShareChatService, covering:share_chatandremove_accessemail_helper.is_valid_email()Test Coverage
The tests include:
Summary by CodeRabbit
Bug Fixes
Documentation
Tests