Skip to content

fix: reject Windows file paths in can_parse #957

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions include/ada/helpers.h
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,21 @@ inline int fast_digit_count(uint32_t x) noexcept {
42949672960, 42949672960};
return int((x + table[int_log2(x)]) >> 32);
}

/**
* Checks if the given input string is a Windows file path.
* A Windows file path starts with a drive letter followed by a colon and a
* backslash or forward slash. Example: "C:\path\file.node" or
* "D:/folder/file.exe"
*
* @param input The string to check
* @return true if the input is a Windows file path, false otherwise
*/
inline bool is_windows_file_path(std::string_view input) noexcept {
return input.length() >= 3 && std::isalpha(input[0]) && input[1] == ':' &&
(input[2] == '\\' || input[2] == '/');
}

} // namespace ada::helpers

#endif // ADA_HELPERS_H
5 changes: 5 additions & 0 deletions src/implementation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include "ada/parser.h"
#include "ada/url.h"
#include "ada/url_aggregator.h"
#include "ada/helpers.h"

namespace ada {

Expand Down Expand Up @@ -49,6 +50,10 @@ std::string href_from_file(std::string_view input) {
}

bool can_parse(std::string_view input, const std::string_view* base_input) {
if (helpers::is_windows_file_path(input)) {
return false;
}
Comment on lines +53 to +55
Copy link
Member

@anonrig anonrig Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point us to where in the URL spec this is returning false? Chrome and Safari returns true...

> URL.canParse("file:///C:/path/file.node")
true
> URL.canParse("C:\\path\\file.node")
true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually allowed. https://url.spec.whatwg.org/#file-host-state

If state override is not given and buffer is a Windows drive letter, file-invalid-Windows-drive-letter-host validation error, set state to path state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you apply the same change to URL parser, not the can_parse() method, you'll see that it breaks web-platform tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, instead of rejecting Windows file paths
Is it a healthy method to normalize them with the file:/// protocol ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What the library provides is WHATWG URL support. If the result disagrees with your expectations make sure first to check the WHATWG URL specification. If you disagree with it, you should take your concerns with WHATWG.

We may have bugs but a bug is a disagreement with WHATWG.


ada::url_aggregator base_aggregator;
ada::url_aggregator* base_pointer = nullptr;

Expand Down
2 changes: 2 additions & 0 deletions tests/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ else()
add_executable(url_components url_components.cpp)
add_executable(basic_tests basic_tests.cpp)
add_executable(from_file_tests from_file_tests.cpp)
add_executable(helpers_tests helpers_tests.cpp)
add_executable(ada_c ada_c.cpp)
add_executable(url_search_params url_search_params.cpp)

Expand All @@ -71,6 +72,7 @@ else()
target_link_libraries(url_components PRIVATE simdjson GTest::gtest_main)
target_link_libraries(basic_tests PRIVATE simdjson GTest::gtest_main)
target_link_libraries(from_file_tests PRIVATE simdjson GTest::gtest_main)
target_link_libraries(helpers_tests PRIVATE simdjson GTest::gtest_main)
target_link_libraries(ada_c PRIVATE simdjson GTest::gtest_main)
target_link_libraries(url_search_params PRIVATE simdjson GTest::gtest_main)

Expand Down
7 changes: 7 additions & 0 deletions tests/basic_tests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -388,6 +388,13 @@ TEST(basic_tests, can_parse) {
std::string_view invalid_base = "!!!!!!!1";
ASSERT_FALSE(ada::can_parse("/hello", &invalid_base));
ASSERT_FALSE(ada::can_parse("!!!"));

ASSERT_FALSE(ada::can_parse("C:\\path\\file.node"));
ASSERT_FALSE(ada::can_parse("D:\\folder\\file.exe"));
ASSERT_FALSE(ada::can_parse("C:/path/file.node"));
ASSERT_TRUE(ada::can_parse("file:///C:/path/file.node"));
ASSERT_TRUE(ada::can_parse("https://example.com"));

SUCCEED();
}

Expand Down
31 changes: 31 additions & 0 deletions tests/helpers_tests.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
#include "ada.h"
#include "gtest/gtest.h"

TEST(helpers_tests, is_windows_file_path) {
// Valid Windows paths
ASSERT_TRUE(ada::helpers::is_windows_file_path("C:\\path\\file.node"));
ASSERT_TRUE(ada::helpers::is_windows_file_path("D:\\folder\\file.exe"));
ASSERT_TRUE(ada::helpers::is_windows_file_path("C:/path/file.node"));
ASSERT_TRUE(
ada::helpers::is_windows_file_path("Z:\\deep\\nested\\path\\file.txt"));
ASSERT_TRUE(ada::helpers::is_windows_file_path("A:\\file"));

// Invalid paths
ASSERT_FALSE(ada::helpers::is_windows_file_path("")); // Empty string
ASSERT_FALSE(ada::helpers::is_windows_file_path("C")); // Too short
ASSERT_FALSE(ada::helpers::is_windows_file_path("C:")); // Missing slash
ASSERT_FALSE(
ada::helpers::is_windows_file_path("file:///C:/path/file.node")); // URL
ASSERT_FALSE(
ada::helpers::is_windows_file_path("https://example.com")); // URL
ASSERT_FALSE(
ada::helpers::is_windows_file_path("/path/to/file")); // Unix path
ASSERT_FALSE(
ada::helpers::is_windows_file_path("relative/path")); // Relative path
ASSERT_FALSE(
ada::helpers::is_windows_file_path("1:\\path")); // Invalid drive letter
ASSERT_FALSE(
ada::helpers::is_windows_file_path("C|\\path")); // Invalid separator
ASSERT_FALSE(ada::helpers::is_windows_file_path(
"C:\\")); // Just drive letter and slash
}
Loading