[Data Liberation] wp_rewrite_urls() #1893
Merged
32 commits
| Commit | Message | Author |
|---|---|---|
| 1ef710f | Data liberation: Kickoff the project | adamziel |
| 234a8bf | Port the URL rewriters from adamziel/site-transfer-protocol | adamziel |
| 819febd | Port WP_HTML_Processor et al. from WordPress | adamziel |
| 0a6167b | Move WordPress core files | adamziel |
| 826fe75 | Outline the next steps | adamziel |
| 0633e6f | Add PHPCS and CBF | adamziel |
| 4406fcf | Update HTML API, fix unit tests | adamziel |
| 0cfd334 | Merge branch 'trunk' into data-liberation-bring-in-php-parsers | adamziel |
| b90a9d6 | Bump CI PHP version to 8.1 | adamziel |
| 081535b | Adjust the CI setup for PHP | adamziel |
| aca88fe | Run npm instlal insteaf of installing just nx | adamziel |
| 897af50 | Use the correct nx project name | adamziel |
| f7679b0 | Remove the network functions and only lint the src directory | adamziel |
| 5b9ec7d | Remove special casing for direct matching pathname prefixes | adamziel |
| 97fed71 | Fix linting errors | adamziel |
| 96c1ce4 | Move the additional functions to pbpcbf.php | adamziel |
| e15408a | Replace iterate_urls with url_matches | adamziel |
| b788eea | Lint PHP | adamziel |
| b83933c | Thoroughly test WP_URL_In_Text_Processor | adamziel |
| fb0204c | Enable tests for WP_Block_Markup_Processor | adamziel |
| b1ea8dc | Enable all PHPUnit tests | adamziel |
| 4335044 | Enable URLParserWHATWGComplianceTests | adamziel |
| 91863ca | move $is_relative declaration clsoer to where it's used | adamziel |
| d2aeea4 | Add a single tricky test case for wp_rewrite_urls() | adamziel |
| 60db1e1 | Preserve urlencoded data in the rewritten path | adamziel |
| 2da0386 | Unit test urldecoding UTF-8 data | adamziel |
| 54bea02 | Lint | adamziel |
| 54c901d | Remove messing with private WP_HTML_Tag_Processor attributes | adamziel |
| a62532b | Remove the commented out dead code from WP_URL_In_Text_Processor | adamziel |
| 238decd | Uncomment the public suffix list verification | adamziel |
| 37622ab | PHP 8.1 compat | adamziel |
| e12190f | PHP 8.1 compliance | adamziel |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,7 @@ | ||
| [submodule "isomorphic-git"] | ||
| path="isomorphic-git" | ||
| url=https://github.com/adamziel/isomorphic-git.git | ||
| [submodule "wp-html-api"] | ||
| path="wp-html-api" | ||
| url=https://github.com/WordPress/wordpress-develop | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -33,5 +33,6 @@ | ||
| "C_Cpp.errorSquiggles": "disabled", | ||
| "git.branchProtection": [ | ||
| "trunk" | ||
| ] | ||
| ], | ||
| "php.version": "7.2" | ||
| } | ||
41 changes: 41 additions & 0 deletions
packages/playground/data-liberation/bin/regenerate_public_suffix_list.php
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| <?php | ||
| /** | ||
| * This script regenerates the public suffix list from the publicsuffix.org website. | ||
| */ | ||
| | ||
| $suffixes = file_get_contents('https://publicsuffix.org/list/public_suffix_list.dat'); | ||
| $lines = explode("\n", $suffixes); | ||
| $tlds = array(); | ||
| foreach ($lines as $line) { | ||
| if ( empty( $line ) || $line[0] === '/' ) { | ||
| continue; | ||
| } | ||
| if ( strpos( $line, '.' ) !== false ) { | ||
| continue; | ||
| } | ||
| $tlds[] = $line; | ||
| } | ||
| | ||
| $php_file_path = __DIR__ . '/../src/public_suffix_list.php'; | ||
| | ||
| $new_php_file_path = $php_file_path.'.swp'; | ||
| $fp = fopen($new_php_file_path, 'w'); | ||
| fwrite($fp, "<?php\n\n"); | ||
| fwrite($fp, "/**"); | ||
| fwrite($fp, "\n * Public suffix list for detecting URLs with known domains within text."); | ||
| fwrite($fp, "\n * This file is automatically generated by regenerate_public_suffix_list.php."); | ||
| fwrite($fp, "\n * Do not edit it directly."); | ||
| fwrite($fp, "\n * @TODO: Process wildcards and exceptions, not just raw TLDs."); | ||
| fwrite($fp, "\n */\n\n"); | ||
| fwrite($fp, "return array(\n"); | ||
| foreach($tlds as $tld) { | ||
| fwrite($fp, "\t'".$tld."' => 1,\n"); | ||
| } | ||
| | ||
| fwrite($fp, ");\n"); | ||
| fclose($fp); | ||
| | ||
| if(file_exists($php_file_path)) { | ||
| unlink($php_file_path); | ||
| } | ||
| rename($new_php_file_path, $php_file_path); | ||
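
The filtering loop in this script keeps only single-label entries (no dot) and skips blank lines and `//` comments. Below is a minimal sketch of the same logic as a pure function, so it can be exercised without fetching the list over the network. The function name `extract_bare_tlds` and the sample input are hypothetical and not part of this PR; the `trim()` call is an added assumption to tolerate stray whitespace.

```php
<?php
/**
 * Hypothetical helper (not part of the PR): extracts dot-free suffixes
 * from the text of a public suffix list.
 */
function extract_bare_tlds( string $list_contents ): array {
	$tlds = array();
	foreach ( explode( "\n", $list_contents ) as $line ) {
		// Assumption: tolerate surrounding whitespace and "\r" line endings.
		$line = trim( $line );
		// Skip blank lines and "//" comments.
		if ( '' === $line || '/' === $line[0] ) {
			continue;
		}
		// Skip multi-label suffixes ("co.uk") and wildcard rules ("*.ck").
		if ( false !== strpos( $line, '.' ) ) {
			continue;
		}
		$tlds[] = $line;
	}
	return $tlds;
}

// Made-up sample input in the public suffix list text format.
$sample = "// comment\ncom\nco.uk\n\n*.ck\norg\n";
$tlds   = extract_bare_tlds( $sample );

// Same shape as the generated file: each TLD maps to 1.
$lookup = array_fill_keys( $tlds, 1 );
var_export( isset( $lookup['org'] ) );
```

Mapping each TLD to `1`, as the generated file does, allows membership checks with `isset()` instead of a linear `in_array()` scan.
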
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| <?php | ||
| | ||
| require_once __DIR__ . "/../bootstrap.php"; | ||
| | ||
| if ( $argc < 2 ) { | ||
| echo "Usage: php script.php <command> --file <input-file> --current-site-url <current site url> --new-site-url <target url>\n"; | ||
| echo "Commands:\n"; | ||
| echo " list_urls: List all the URLs found in the input file.\n"; | ||
| echo " migrate_urls: Migrate all the URLs found in the input file from the current site to the target site.\n"; | ||
| exit( 1 ); | ||
| } | ||
| | ||
| $command = $argv[1]; | ||
| $options = []; | ||
| | ||
| for ( $i = 2; $i < $argc; $i ++ ) { | ||
| if ( str_starts_with( $argv[ $i ], '--' ) && isset( $argv[ $i + 1 ] ) ) { | ||
| $options[ substr( $argv[ $i ], 2 ) ] = $argv[ $i + 1 ]; | ||
| $i ++; | ||
| } | ||
| } | ||
| | ||
| if ( ! isset( $options['file'] ) ) { | ||
| echo "The file option is required.\n"; | ||
| exit( 1 ); | ||
| } | ||
| | ||
| $inputFile = $options['file']; | ||
| if ( ! file_exists( $inputFile ) ) { | ||
| echo "The file $inputFile does not exist.\n"; | ||
| exit( 1 ); | ||
| } | ||
| $block_markup = file_get_contents( $inputFile ); | ||
| | ||
| // @TODO: Decide – should the current site URL be always required to | ||
| // populate $base_url? | ||
| $base_url = $options['current-site-url'] ?? 'https://playground.internal'; | ||
| $p = new WP_Block_Markup_Url_Processor( $block_markup, $base_url ); | ||
| | ||
| switch ( $command ) { | ||
| case 'list_urls': | ||
| echo "URLs found in the markup:\n\n"; | ||
| wp_list_urls_in_block_markup( [ 'block_markup' => $block_markup, 'base_url' => $base_url ]); | ||
| echo "\n"; | ||
| break; | ||
| case 'migrate_urls': | ||
| if ( ! isset( $options['current-site-url'] ) ) { | ||
| echo "The --current-site-url option is required for the migrate_urls command.\n"; | ||
| exit( 1 ); | ||
| } | ||
| if ( ! isset( $options['new-site-url'] ) ) { | ||
| echo "The --new-site-url option is required for the migrate_urls command.\n"; | ||
| exit( 1 ); | ||
| } | ||
| | ||
| echo "Replacing $base_url with " . $options['new-site-url'] . " in the input.\n\n"; | ||
| if (!is_dir('./assets')) { | ||
| mkdir('./assets/', 0777, true); | ||
| } | ||
| $result = wp_rewrite_urls( array( | ||
| 'block_markup' => $block_markup, | ||
| 'base_url' => $base_url, | ||
| 'current-site-url' => $options['current-site-url'], | ||
| 'new-site-url' => $options['new-site-url'], | ||
| ) ); | ||
| if(!is_string($result)) { | ||
| echo "Error! \n"; | ||
| print_r($result); | ||
| exit( 1 ); | ||
| } | ||
| echo $result; | ||
| break; | ||
| } | ||
| | ||
| function wp_list_urls_in_block_markup( $options ) { | ||
| $block_markup = $options['block_markup']; | ||
| $base_url = $options['base_url'] ?? 'https://playground.internal'; | ||
| $p = new WP_Block_Markup_Url_Processor( $block_markup, $base_url ); | ||
| while ( $p->next_url() ) { | ||
| // Skip empty relative URLs. | ||
| if ( ! trim( $p->get_raw_url() ) ) { | ||
| continue; | ||
| } | ||
| echo '* '; | ||
| switch ( $p->get_token_type() ) { | ||
| case '#tag': | ||
| echo 'In <' . $p->get_tag() . '> tag attribute "' . $p->get_inspected_attribute_name() . '": '; | ||
| break; | ||
| case '#block-comment': | ||
| echo 'In a ' . $p->get_block_name() . ' block attribute "' . $p->get_block_attribute_key() . '": '; | ||
| break; | ||
| case '#text': | ||
| echo 'In #text: '; | ||
| break; | ||
| } | ||
| echo $p->get_raw_url() . "\n"; | ||
| } | ||
| } |
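
The option loop in this script pairs each `--name` argument with the value that follows it. Here is a minimal sketch of that loop as a standalone function. The name `parse_cli_options` and the example URLs are hypothetical, and `strncmp()` replaces `str_starts_with()` only so that the sketch also runs on PHP 7.

```php
<?php
/**
 * Hypothetical helper (not part of the PR): collects "--name value" pairs
 * from an argv-style array, skipping the script name and the command.
 */
function parse_cli_options( array $argv ): array {
	$options = array();
	$argc    = count( $argv );
	for ( $i = 2; $i < $argc; $i++ ) {
		// A flag is only recorded when a value follows it.
		if ( 0 === strncmp( $argv[ $i ], '--', 2 ) && isset( $argv[ $i + 1 ] ) ) {
			$options[ substr( $argv[ $i ], 2 ) ] = $argv[ $i + 1 ];
			$i++; // Skip the value that was just consumed.
		}
	}
	return $options;
}

$options = parse_cli_options(
	array(
		'script.php',
		'migrate_urls',
		'--file',
		'post.html',
		'--current-site-url',
		'https://old.example',
		'--new-site-url',
		'https://new.example',
	)
);
var_export( $options );
```

A trailing flag with no value is silently dropped by this approach, which is why the script checks `isset( $options['file'] )` and the other required keys afterwards.
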
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,67 @@ | ||
| <?php | ||
| | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-token.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-span.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-text-replacement.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-decoder.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-attribute-token.php"; | ||
| | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-decoder.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-tag-processor.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-open-elements.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-token-map.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/html5-named-character-references.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-active-formatting-elements.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-processor-state.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-unsupported-exception.php"; | ||
| require_once __DIR__ . "/src/wordpress-core-html-api/class-wp-html-processor.php"; | ||
| | ||
| require_once __DIR__ . '/src/WP_Block_Markup_Processor.php'; | ||
| require_once __DIR__ . '/src/WP_Block_Markup_Url_Processor.php'; | ||
| require_once __DIR__ . '/src/WP_URL_In_Text_Processor.php'; | ||
| require_once __DIR__ . '/src/WP_URL.php'; | ||
| require_once __DIR__ . '/vendor/autoload.php'; | ||
| | ||
| // Polyfill WordPress core functions | ||
| function _doing_it_wrong() { | ||
| | ||
| } | ||
| | ||
| function __($input) { | ||
| return $input; | ||
| } | ||
| | ||
| function esc_attr($input) { | ||
| return htmlspecialchars($input); | ||
| } | ||
| | ||
| function esc_html($input) { | ||
| return htmlspecialchars($input); | ||
| } | ||
| | ||
| function esc_url($url) { | ||
| return htmlspecialchars($url); | ||
| } | ||
| | ||
| function wp_kses_uri_attributes() { | ||
| return array( | ||
| 'action', | ||
| 'archive', | ||
| 'background', | ||
| 'cite', | ||
| 'classid', | ||
| 'codebase', | ||
| 'data', | ||
| 'formaction', | ||
| 'href', | ||
| 'icon', | ||
| 'longdesc', | ||
| 'manifest', | ||
| 'poster', | ||
| 'profile', | ||
| 'src', | ||
| 'usemap', | ||
| 'xmlns', | ||
| ); | ||
| } |
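
The polyfills in this bootstrap file are declared unconditionally. PHP raises a fatal "Cannot redeclare" error when a function is defined twice, so loading the file alongside WordPress itself would fail. One possible guard is sketched below for `esc_attr()` only. The `function_exists()` check and the explicit `htmlspecialchars()` flags are assumptions for illustration, not something this PR does.

```php
<?php
// Declare the fallback only if nothing else has defined the function yet.
if ( ! function_exists( 'esc_attr' ) ) {
	function esc_attr( $input ) {
		// Explicit flags keep quote handling identical across PHP versions:
		// the default flags changed from ENT_COMPAT to ENT_QUOTES in PHP 8.1.
		return htmlspecialchars( $input, ENT_QUOTES, 'UTF-8' );
	}
}

echo esc_attr( 'a "quoted" <value>' ), "\n";
```

The same pattern would apply to `__()`, `esc_html()`, `esc_url()`, and the other polyfills if the bootstrap ever needs to coexist with a full WordPress load.
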
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| { | ||
| "name": "wordpress/data-liberation", | ||
| "prefer-stable": true, | ||
| "require": { | ||
| "ext-json": "*", | ||
| "php": ">=7.2", | ||
| "rowbot/url": "^4.0" | ||
| }, | ||
| "require-dev": { | ||
| "yoast/phpunit-polyfills": "2.0.0", | ||
| "squizlabs/php_codesniffer": "3.*", | ||
| "wp-coding-standards/wpcs": "3.1.0", | ||
| "phpcompatibility/php-compatibility": "*" | ||
| }, | ||
| "config": { | ||
| "optimize-autoloader": true, | ||
| "preferred-install": "dist", | ||
| "allow-plugins": { | ||
| "dealerdirect/phpcodesniffer-composer-installer": true | ||
| } | ||
| }, | ||
| "autoload": { | ||
| "classmap": [ | ||
| "src/" | ||
| ], | ||
| "psr-4": { | ||
| "WordPress\\DataLiberation\\": "src/WordPress" | ||
| }, | ||
| "files": [ | ||
| "src/functions.php" | ||
| ] | ||
| }, | ||
| "autoload-dev": { | ||
| "classmap": [ | ||
| "tests/" | ||
| ] | ||
| }, | ||
| "authors": [ | ||
| { | ||
| "name": "WordPress Contributors", | ||
| "email": "contributors@wordpress.org" | ||
| } | ||
| ] | ||
| } |