fix(utils): consistently use regex for user provided expressions #18524

Merged
nijel merged 1 commit into WeblateOrg:main from nijel:re-validator on Mar 19, 2026

nijel (Member) commented Mar 18, 2026

This provides a consistent experience and allows us to limit the time spent evaluating user-provided expressions.

nijel added this to the 5.17 milestone Mar 18, 2026
nijel requested a review from Copilot March 18, 2026 18:40
nijel self-assigned this Mar 18, 2026

Copilot AI (Contributor) left a comment

Pull request overview

This PR standardizes handling of user-provided regular expressions by routing evaluation through the third-party regex engine with an explicit timeout, aiming to prevent long-running matches and provide consistent behavior across the app.

Changes:

  • Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) and a shared REGEX_TIMEOUT.
  • Updated multiple call sites (validators, discovery, component/unit variant handling, key filtering) to use timeout-bounded regex evaluation and handle TimeoutError.
  • Added tests to cover timeout behavior in validators, component list auto-assignment, and discovery.
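
As a rough illustration of the helpers listed above, a timeout-bounded wrapper around the third-party `regex` package might look like the following. Only the names `compile_regex`, `regex_match`, `regex_findall`, `regex_sub`, and `REGEX_TIMEOUT` come from this PR; the signatures, the timeout value, and the choice of `regex.match` rather than `regex.search` are assumptions, not the merged implementation.

```python
import regex  # third-party "regex" package (pip install regex), not stdlib "re"

# Assumed value for illustration; the PR summary does not state the real limit.
REGEX_TIMEOUT = 0.2  # seconds


def compile_regex(pattern, flags=0):
    """Compile a user-provided pattern with the regex engine."""
    return regex.compile(pattern, flags)


def regex_match(pattern, string, flags=0):
    """Match at the start of the string; raises TimeoutError past the limit."""
    return regex.match(pattern, string, flags=flags, timeout=REGEX_TIMEOUT)


def regex_findall(pattern, string, flags=0):
    """Return all non-overlapping matches; raises TimeoutError past the limit."""
    return regex.findall(pattern, string, flags=flags, timeout=REGEX_TIMEOUT)


def regex_sub(pattern, repl, string, flags=0):
    """Substitute matches with repl; raises TimeoutError past the limit."""
    return regex.sub(pattern, repl, string, flags=flags, timeout=REGEX_TIMEOUT)
```

The `regex` package signals an exceeded limit with the built-in `TimeoutError`, so call sites can catch it without importing anything engine-specific.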

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| weblate/utils/validators.py | Switch regex validation to regex compilation and enforce timeout when evaluating empty-string matches. |
| weblate/utils/tests/test_validators.py | Adds a unit test ensuring validator surfaces a clear error on timeout. |
| weblate/utils/regex.py | New shared utility module for compiling and running regex operations with a timeout. |
| weblate/trans/models/unit.py | Wraps variant-regex evaluation with timeout handling during unit variant updates. |
| weblate/trans/models/translation.py | Applies timeout-bounded matching for key_filter_re during sync and logs on timeout. |
| weblate/trans/models/componentlist.py | Uses timeout-bounded matching for auto component list assignment. |
| weblate/trans/models/component.py | Uses timeout-bounded matching for language regex and variant regex processing; key filter compilation uses compile_regex. |
| weblate/trans/discovery.py | Uses timeout-bounded regex matching during component discovery and logs on timeouts. |
| weblate/trans/tests/test_models.py | Adds test coverage for timeout behavior during auto component list assignment. |
| weblate/trans/tests/test_discovery.py | Adds test coverage for timeout behavior during discovery matching. |

codecov bot commented Mar 18, 2026

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

nijel requested a review from Copilot March 19, 2026 09:05

Copilot AI (Contributor) left a comment

Pull request overview

This PR standardizes handling of user-provided regular expressions by routing matching through the third-party regex engine and enforcing a time limit to prevent long-running evaluations across validators and several translation/component workflows.

Changes:

  • Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) with a shared REGEX_TIMEOUT.
  • Updated multiple call sites (validators, component discovery, component lists, translation sync, variant linking) to use timeout-aware regex matching and handle TimeoutError.
  • Added tests covering timeout behavior across validators, discovery, component validation, and component list auto-assignment.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| weblate/utils/validators.py | Uses compile_regex + timeout-limited evaluation for user regex validation and raises a clearer error on timeouts. |
| weblate/utils/tests/test_validators.py | Adds unit test ensuring regex timeout surfaces as a ValidationError. |
| weblate/utils/regex.py | Introduces centralized regex helpers and a shared timeout constant. |
| weblate/trans/models/unit.py | Uses timeout-limited findall for variant regex matching and logs on timeout. |
| weblate/trans/models/translation.py | Uses timeout-limited matching for key_filter and skips units on timeout with a warning. |
| weblate/trans/models/componentlist.py | Uses timeout-limited matching for auto component list assignment; logs and skips on timeout. |
| weblate/trans/models/component.py | Adds timeout-aware language regex matching; changes some regex compilation to regex; extends validation to handle timeouts. |
| weblate/trans/discovery.py | Uses timeout-limited matching for discovery path/language regexes; logs and skips on timeout. |
| weblate/trans/tests/test_models.py | Adds integration test ensuring component auto-assignment is skipped on regex timeout. |
| weblate/trans/tests/test_discovery.py | Adds test ensuring discovery returns no matches on regex timeout. |
| weblate/trans/tests/test_component.py | Adds test ensuring component validation fails with the expected message on regex timeout. |
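
The "logs and skips on timeout" behavior described in the summary above can be sketched with a small loop. This is a minimal illustration, not code from the PR: `match_paths`, `flaky_matcher`, and the sample paths are hypothetical, and the matcher is passed in as a callable so the sketch runs with the standard library alone.

```python
import logging
import re

LOGGER = logging.getLogger(__name__)


def match_paths(pattern, paths, matcher):
    """Return paths matched by pattern, skipping any whose match times out.

    matcher is any callable (pattern, string) -> match or None that may
    raise TimeoutError, such as a timeout-bounded regex helper.
    """
    matched = []
    for path in paths:
        try:
            result = matcher(pattern, path)
        except TimeoutError:
            # Log and move on instead of failing the whole operation.
            LOGGER.warning("Regex matching timed out for path: %s", path)
            continue
        if result is not None:
            matched.append(path)
    return matched


def flaky_matcher(pattern, string):
    """Hypothetical stand-in that simulates a timeout for one input."""
    if string == "slow/path.po":
        raise TimeoutError("regex timed out")
    return re.match(pattern, string)
```

Calling `match_paths(r"po/.*\.po$", ["po/cs.po", "slow/path.po", "README"], flaky_matcher)` keeps only `po/cs.po`: the simulated timeout is logged and skipped, and `README` simply does not match.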

nijel force-pushed the re-validator branch 2 times, most recently from 5941621 to 01c174b March 19, 2026 09:55
nijel requested a review from Copilot March 19, 2026 09:55

Copilot AI (Contributor) left a comment

Pull request overview

This PR standardizes handling of user-provided regular expressions by routing matching/compilation through a shared weblate.utils.regex helper (using the regex module) and enforcing a per-match timeout to mitigate expensive evaluations. It also surfaces regex-timeout errors in discovery preview and adds tests around timeout behavior.

Changes:

  • Added weblate.utils.regex helpers (compile_regex, regex_match, regex_findall, regex_sub) with a shared REGEX_TIMEOUT.
  • Updated validators, component discovery, component list auto-matching, key filtering, and variant matching to use the timeout-aware helpers and handle TimeoutError.
  • Added/updated tests and UI preview output to cover and display regex timeout errors.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| weblate/utils/validators.py | Use timeout-aware compilation/matching for regex validation and raise a clearer ValidationError on timeout. |
| weblate/utils/regex.py | Introduces centralized regex helper functions and shared timeout constant. |
| weblate/utils/tests/test_validators.py | Adds a unit test ensuring regex validation handles TimeoutError as a ValidationError. |
| weblate/trans/models/component.py | Uses timeout-aware regex matching for language filtering, key filtering, and variant processing; adds timeout-to-ValidationError path in clean(). |
| weblate/trans/models/translation.py | Applies timeout-aware key filter matching during sync and handles timeouts by skipping units. |
| weblate/trans/models/unit.py | Uses timeout-aware findall when deciding whether to trigger variant updates from a unit. |
| weblate/trans/models/componentlist.py | Uses timeout-aware matching for auto component list assignment and logs/reports errors on timeout. |
| weblate/trans/discovery.py | Uses timeout-aware matching for discovery path/language regexes and records deduplicated preview errors. |
| weblate/addons/forms.py | Uses centralized regex compilation for discovery form preview rendering and passes discovery errors to template context. |
| weblate/templates/addons/discovery_preview.html | Displays discovery preview errors and makes slug/mask display conditional. |
| weblate/addons/tests.py | Adds an integration test asserting discovery preview shows timeout error messages. |
| weblate/trans/tests/test_models.py | Adds a test ensuring auto component list assignment does not add components on regex timeout. |
| weblate/trans/tests/test_discovery.py | Adds a test ensuring discovery timeout yields a recorded error and no matches. |
| weblate/trans/tests/test_component.py | Adds a test ensuring component validation fails with a clear message on language regex timeout. |
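
The validator path summarized above (a timeout surfacing as a `ValidationError`) might be sketched as follows. Everything here is illustrative: the local `ValidationError` class stands in for Django's, `validate_user_regex` and `always_times_out` are hypothetical names, the message wording is invented, and rejecting patterns that match the empty string is an assumption about what the empty-string check is for.

```python
import re


class ValidationError(Exception):
    """Local stand-in for django.core.exceptions.ValidationError."""


def validate_user_regex(value, matcher):
    """Validate a user-provided pattern using a timeout-bounded matcher.

    matcher is any callable (pattern, string) -> match or None that may
    raise TimeoutError when evaluation exceeds its time limit.
    """
    try:
        matches_empty = matcher(value, "") is not None
    except TimeoutError as error:
        # Turn the engine-level timeout into a user-facing validation error.
        raise ValidationError(
            "The regular expression is too complex and took too long to evaluate."
        ) from error
    if matches_empty:
        # Assumed rule: a pattern matching "" would match every string.
        raise ValidationError("The regular expression must not match an empty string.")


def always_times_out(pattern, string):
    """Hypothetical matcher that simulates an exceeded time limit."""
    raise TimeoutError("regex timed out")
```

With `re.match` as the matcher, `validate_user_regex(r"\d+", re.match)` passes silently, while `validate_user_regex("(a+)+$", always_times_out)` raises `ValidationError` with the timeout message.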

Comment on lines +524 to +533
try:
    key_filter_match = regex_match(
        self.component.key_filter_re, unit.context
    )
except TimeoutError:
    report_error(
        "Component key filter regex timed out",
        project=self.component.project,
    )
    self.component.log_warning(

Comment on lines +158 to +174
try:
    matches = regex_match(self.path_match, path)
except TimeoutError:
    report_error(
        "Component discovery path regex timed out",
        project=self.component.project if self.component else None,
    )
    self.add_error(
        gettext(
            "The regular expression used to match discovered files is too complex and took too long to evaluate."
        ),
        mask=self.match,
    )
    LOGGER.warning(
        "Regex matching timed out for discovery path: %s", path
    )
    continue
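
The `add_error(..., mask=...)` call in the excerpt above, together with the "deduplicated preview errors" noted in the file summary, suggests a collector along these lines. This is a hypothetical sketch: the `DiscoveryErrors` class, its storage layout, and the deduplication key are assumptions; only the `add_error` name and the `mask` keyword appear in the diff excerpt.

```python
class DiscoveryErrors:
    """Collects preview errors, keeping one entry per (message, mask) pair."""

    def __init__(self):
        self.errors = []    # ordered list of unique errors for display
        self._seen = set()  # (message, mask) pairs already recorded

    def add_error(self, message, mask):
        key = (message, mask)
        if key in self._seen:
            # The same timeout can fire once per scanned path; record it once.
            return
        self._seen.add(key)
        self.errors.append({"message": message, "mask": mask})
```

Because the timeout handler runs inside a loop over paths, deduplicating by message and mask keeps the preview from repeating one error for every file scanned.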

nijel enabled auto-merge (rebase) March 19, 2026 11:56
nijel merged commit 4dfdf69 into WeblateOrg:main Mar 19, 2026 (47 checks passed)
nijel deleted the re-validator branch March 19, 2026 12:51