fix(ext/web): handle Windows file paths in URL parsing #33097
Open
renezander030 wants to merge 2 commits into denoland:main from
Implement WHATWG URL spec change (url#874) to detect Windows drive letter patterns (e.g., C:\path\file.txt) in the URL parser's scheme start state and automatically convert them to file:/// URLs (file:///C:/path/file.txt). The spec adds a check: when parsing encounters a single ASCII alpha letter as the scheme buffer, followed by ':' and '\', it recognizes this as a Windows drive path rather than a URL scheme. The parser then sets the scheme to "file", the host to empty string, and transitions to path state with backslashes normalized to forward slashes. This is implemented as a preprocessing step in the Rust parse_url function, before the input reaches the rust-url crate parser.
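The detection condition described above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name `looks_like_windows_drive_path` is hypothetical and not taken from the PR, and the sketch ignores the leading whitespace and control-character stripping that a full URL parser performs before the scheme start state.

```rust
/// Hypothetical helper (not the PR's code): returns true when `input`
/// begins with exactly one ASCII letter, then ':', then '\'.
/// Example of a matching prefix: `C:\`.
fn looks_like_windows_drive_path(input: &str) -> bool {
    let bytes = input.as_bytes();
    bytes.len() >= 3
        // A single ASCII alpha character would be the whole scheme buffer.
        && bytes[0].is_ascii_alphabetic()
        // The character that would normally terminate a scheme.
        && bytes[1] == b':'
        // A backslash here marks a drive path rather than a URL scheme.
        && bytes[2] == b'\\'
}
```

Inputs such as `https://example.com/` or `C:relative` fail the check, so under this sketch they would continue through normal scheme parsing.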
Contributor
Please check this comment:
Author
Thanks for the pointer. I'm aware of that comment. The difference here is that this implements the WHATWG URL spec change (url#874), which is being tracked by Chromium, Gecko, and WebKit as well. The upstream Rust crate (rust-url) has not picked up this change yet. The implementation is intentionally minimal and isolated (one function), so it is easy to remove once the upstream crate picks it up. Happy to hear from the maintainers on whether they'd prefer to wait for upstream or ship early.
Summary
Implements the WHATWG URL spec change (whatwg/url#874) to handle Windows-style file paths in URL parsing.
When new URL("C:\\path\\file.txt") is called, the parser now detects the Windows drive letter pattern (single ASCII alpha + : + \) and converts it to a file:/// URL with normalized forward slashes: file:///C:/path/file.txt.
Changes
- ext/web/url.rs: Added maybe_convert_windows_path_to_file_url() preprocessing function that detects Windows drive letter patterns before the input reaches rust-url's parser. Called from parse_url() for both op_url_parse and op_url_parse_with_base code paths.
- tests/unit/url_test.ts: Added tests covering basic paths, different drive letters, lowercase drives, mixed separators, paths with base URL, and URL.parse().
Spec details
The WHATWG URL spec change adds a check in the "scheme start state": when the parser encounters a single ASCII alpha character as a potential scheme, followed by : and \, it recognizes this as a Windows drive path rather than a URL scheme. It then sets scheme to file, host to empty string, and transitions to path state.
Test plan
tests/unit/url_test.ts
Fixes #30363
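For readers who want a concrete picture of the preprocessing step, here is a minimal sketch of a conversion along the lines described in this PR. It is not the PR's actual code: the name `sketch_convert_windows_path` and its `Option<String>` return type are assumptions, the drive letter's case is preserved as an assumption, and the sketch does no percent-encoding, which a real parser's path state would handle.

```rust
/// Hypothetical stand-in for the preprocessing step (assumed shape).
/// Returns `Some(file URL)` when `input` starts with `<letter>:\`,
/// and `None` otherwise so the caller can parse the input unchanged.
fn sketch_convert_windows_path(input: &str) -> Option<String> {
    let mut chars = input.chars();
    // First character must be a single ASCII letter (the drive letter).
    let drive = chars.next().filter(|c| c.is_ascii_alphabetic())?;
    // It must be followed by ':' and then a backslash.
    if chars.next() != Some(':') || chars.next() != Some('\\') {
        return None;
    }
    // Normalize the remaining backslashes to forward slashes.
    let rest: String = chars
        .map(|c| if c == '\\' { '/' } else { c })
        .collect();
    // Scheme "file", empty host, then the drive and the normalized path.
    Some(format!("file:///{drive}:/{rest}"))
}
```

With this sketch, `C:\path\file.txt` maps to `file:///C:/path/file.txt` and a mixed-separator input such as `d:\a/b\c` maps to `file:///d:/a/b/c`, while `https://example.com/` yields `None` and would be handed to the regular parser untouched.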