fix(utils): consistently use regex for user provided expressions #18524
nijel merged 1 commit into WeblateOrg:main
Pull request overview
This PR standardizes handling of user-provided regular expressions by routing evaluation through the third-party regex engine with an explicit timeout, aiming to prevent long-running matches and provide consistent behavior across the app.
Changes:
- Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) and a shared REGEX_TIMEOUT.
- Updated multiple call sites (validators, discovery, component/unit variant handling, key filtering) to use timeout-bounded regex evaluation and handle TimeoutError.
- Added tests to cover timeout behavior in validators, component list auto-assignment, and discovery.
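
As a rough illustration of what such helpers could look like, here is a minimal sketch built on the third-party regex package, whose matching functions accept a timeout argument (in seconds) and raise TimeoutError when it is exceeded. The helper names follow the list above, but the bodies, the signatures, and the REGEX_TIMEOUT value are assumptions for illustration and are not taken from the PR.

```python
import regex  # third-party "regex" package, not the standard-library "re"

# Assumed value for illustration only; the actual limit is not stated here.
REGEX_TIMEOUT = 0.2  # seconds


def compile_regex(pattern: str, flags: int = 0):
    """Compile a user-provided expression with the regex engine."""
    return regex.compile(pattern, flags)


def regex_match(pattern: str, string: str, timeout: float = REGEX_TIMEOUT):
    """Match at the start of string; TimeoutError is raised if the limit is hit."""
    return regex.match(pattern, string, timeout=timeout)


def regex_findall(pattern: str, string: str, timeout: float = REGEX_TIMEOUT):
    """Return all non-overlapping matches, bounded by the timeout."""
    return regex.findall(pattern, string, timeout=timeout)


def regex_sub(pattern: str, repl, string: str, timeout: float = REGEX_TIMEOUT):
    """Substitute matches of pattern in string, bounded by the timeout."""
    return regex.sub(pattern, repl, string, timeout=timeout)
```

Callers would then wrap each call in try/except TimeoutError and decide locally whether to skip the item, log a warning, or surface a validation error. This sketch accepts only string patterns to stay short.
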
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| weblate/utils/validators.py | Switch regex validation to compilation with the regex module and enforce a timeout when evaluating empty-string matches. |
| weblate/utils/tests/test_validators.py | Adds a unit test ensuring validator surfaces a clear error on timeout. |
| weblate/utils/regex.py | New shared utility module for compiling and running regex operations with a timeout. |
| weblate/trans/models/unit.py | Wraps variant-regex evaluation with timeout handling during unit variant updates. |
| weblate/trans/models/translation.py | Applies timeout-bounded matching for key_filter_re during sync and logs on timeout. |
| weblate/trans/models/componentlist.py | Uses timeout-bounded matching for auto component list assignment. |
| weblate/trans/models/component.py | Uses timeout-bounded matching for language regex and variant regex processing; key filter compilation uses compile_regex. |
| weblate/trans/discovery.py | Uses timeout-bounded regex matching during component discovery and logs on timeouts. |
| weblate/trans/tests/test_models.py | Adds test coverage for timeout behavior during auto component list assignment. |
| weblate/trans/tests/test_discovery.py | Adds test coverage for timeout behavior during discovery matching. |
Pull request overview
This PR standardizes handling of user-provided regular expressions by routing matching through the third-party regex engine and enforcing a time limit to prevent long-running evaluations across validators and several translation/component workflows.
Changes:
- Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) with a shared REGEX_TIMEOUT.
- Updated multiple call sites (validators, component discovery, component lists, translation sync, variant linking) to use timeout-aware regex matching and handle TimeoutError.
- Added tests covering timeout behavior across validators, discovery, component validation, and component list auto-assignment.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| weblate/utils/validators.py | Uses compile_regex + timeout-limited evaluation for user regex validation and raises a clearer error on timeouts. |
| weblate/utils/tests/test_validators.py | Adds unit test ensuring regex timeout surfaces as a ValidationError. |
| weblate/utils/regex.py | Introduces centralized regex helpers and a shared timeout constant. |
| weblate/trans/models/unit.py | Uses timeout-limited findall for variant regex matching and logs on timeout. |
| weblate/trans/models/translation.py | Uses timeout-limited matching for key_filter and skips units on timeout with a warning. |
| weblate/trans/models/componentlist.py | Uses timeout-limited matching for auto component list assignment; logs and skips on timeout. |
| weblate/trans/models/component.py | Adds timeout-aware language regex matching; switches some regex compilation to the regex module; extends validation to handle timeouts. |
| weblate/trans/discovery.py | Uses timeout-limited matching for discovery path/language regexes; logs and skips on timeout. |
| weblate/trans/tests/test_models.py | Adds integration test ensuring component auto-assignment is skipped on regex timeout. |
| weblate/trans/tests/test_discovery.py | Adds test ensuring discovery returns no matches on regex timeout. |
| weblate/trans/tests/test_component.py | Adds test ensuring component validation fails with the expected message on regex timeout. |
5941621 to 01c174b
Pull request overview
This PR standardizes handling of user-provided regular expressions by routing matching/compilation through a shared weblate.utils.regex helper (using the regex module) and enforcing a per-match timeout to mitigate expensive evaluations. It also surfaces regex-timeout errors in discovery preview and adds tests around timeout behavior.
Changes:
- Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) with a shared REGEX_TIMEOUT.
- Updated validators, component discovery, component list auto-matching, key filtering, and variant matching to use the timeout-aware helpers and handle TimeoutError.
- Added/updated tests and UI preview output to cover and display regex timeout errors.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| weblate/utils/validators.py | Use timeout-aware compilation/matching for regex validation and raise a clearer ValidationError on timeout. |
| weblate/utils/regex.py | Introduces centralized regex helper functions and shared timeout constant. |
| weblate/utils/tests/test_validators.py | Adds a unit test ensuring regex validation handles TimeoutError as a ValidationError. |
| weblate/trans/models/component.py | Uses timeout-aware regex matching for language filtering, key filtering, and variant processing; adds timeout-to-ValidationError path in clean(). |
| weblate/trans/models/translation.py | Applies timeout-aware key filter matching during sync and handles timeouts by skipping units. |
| weblate/trans/models/unit.py | Uses timeout-aware findall when deciding whether to trigger variant updates from a unit. |
| weblate/trans/models/componentlist.py | Uses timeout-aware matching for auto component list assignment and logs and reports errors on timeout. |
| weblate/trans/discovery.py | Uses timeout-aware matching for discovery path/language regexes and records deduplicated preview errors. |
| weblate/addons/forms.py | Uses centralized regex compilation for discovery form preview rendering and passes discovery errors to template context. |
| weblate/templates/addons/discovery_preview.html | Displays discovery preview errors and makes slug/mask display conditional. |
| weblate/addons/tests.py | Adds an integration test asserting discovery preview shows timeout error messages. |
| weblate/trans/tests/test_models.py | Adds a test ensuring auto component list assignment does not add components on regex timeout. |
| weblate/trans/tests/test_discovery.py | Adds a test ensuring discovery timeout yields a recorded error and no matches. |
| weblate/trans/tests/test_component.py | Adds a test ensuring component validation fails with a clear message on language regex timeout. |

    try:
        key_filter_match = regex_match(
            self.component.key_filter_re, unit.context
        )
    except TimeoutError:
        report_error(
            "Component key filter regex timed out",
            project=self.component.project,
        )
        self.component.log_warning(
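
The skip-on-timeout behavior described for key filtering can be sketched in isolation as follows. This is a simplified, hypothetical stand-in rather than the code from the PR: filter_contexts, matcher, and warn are names invented for the example, and the matcher is passed in so the timeout path can be exercised without a slow expression.

```python
from collections.abc import Callable, Iterable


def filter_contexts(
    contexts: Iterable[str],
    matcher: Callable[[str], object],
    warn: Callable[[str], None],
) -> list[str]:
    """Keep contexts accepted by matcher; skip and report any that time out."""
    kept = []
    for context in contexts:
        try:
            matched = matcher(context)
        except TimeoutError:
            # Log a warning and skip this item instead of aborting the loop.
            warn(f"Key filter regex timed out for context: {context}")
            continue
        if matched:
            kept.append(context)
    return kept


def fake_matcher(context: str) -> bool:
    """Hypothetical matcher that simulates a timeout for one input."""
    if context == "slow":
        raise TimeoutError("regex timed out")
    return context.startswith("app.")


warnings: list[str] = []
result = filter_contexts(["app.title", "slow", "other"], fake_matcher, warnings.append)
```

Here result contains only "app.title", and warnings holds a single message for the context that timed out.
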
weblate/trans/discovery.py

    try:
        matches = regex_match(self.path_match, path)
    except TimeoutError:
        report_error(
            "Component discovery path regex timed out",
            project=self.component.project if self.component else None,
        )
        self.add_error(
            gettext(
                "The regular expression used to match discovered files is too complex and took too long to evaluate."
            ),
            mask=self.match,
        )
        LOGGER.warning(
            "Regex matching timed out for discovery path: %s", path
        )
        continue
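
Because the snippet above runs once per discovered path, the same timeout message could be recorded many times for a single mask. One way the deduplicated preview errors mentioned in the review summary might be kept is sketched below; the class name, attributes, and error shape are hypothetical and only the add_error(message, mask=...) call shape is taken from the snippet.

```python
class DiscoveryErrorLog:
    """Collects preview errors, keeping one entry per (message, mask) pair."""

    def __init__(self) -> None:
        self.errors: list[dict[str, str]] = []
        self._seen: set[tuple[str, str]] = set()

    def add_error(self, message: str, mask: str) -> None:
        key = (message, mask)
        if key in self._seen:
            # Already recorded for this mask; avoid repeating it per file.
            return
        self._seen.add(key)
        self.errors.append({"message": message, "mask": mask})


log = DiscoveryErrorLog()
for _path in ("po/cs.po", "po/de.po", "po/fr.po"):
    # Simulate the same timeout being hit for every discovered path.
    log.add_error("Expression took too long to evaluate.", mask="po/*.po")
```

After the loop, log.errors holds a single entry, so a preview template would show the message once.
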
This provides a consistent experience and allows us to limit the time used to evaluate the expressions.