Skip to content

Conversation

@SethFalco
Copy link
Member

@SethFalco SethFalco commented Aug 25, 2025

Splits our regression tests into separate processes. This was already 2 steps before, but now it'll be 3 steps.

Step Description Reporting
1 Extract files from SVGO Test Suite. N/A
2 Optimize all SVGs. Metrics like bytes saved, time taken, and memory allocation.
3 Compare all SVGs. Test Results like matches and mismatches.

This will use more disk space on both the development environment and on CI. In theory, this will use a little more time overall as well as we now need to write the optimized vectors to disk, whereas before they were handled in memory only.

Motivation

To have more focused metrics. When we compare metrics like time taken and peak memory allocation between runs, there will be less noise/overhead from Playwright or any other libraries used when during the comparison step.

Results

I've done two test runs in a row, and the results are certainly much more consistent.

Test 1

SVGO Test Suite Version: 30ba6479ef7570d3748a30bbbd39f8c7

▶ Test Results
              Match: 4,626 / 4,626
  Expected Mismatch: 126 / 155
            Ignored: 32 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 16m21s
  Peak Memory Alloc: 3.604 GiB

Test 2

SVGO Test Suite Version: 30ba6479ef7570d3748a30bbbd39f8c7

▶ Test Results
              Match: 4,626 / 4,626
  Expected Mismatch: 126 / 155
            Ignored: 29 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 16m31s
  Peak Memory Alloc: 3.605 GiB

Caching SVGO Test Suite

We use the ETag header to determine if we really need to redownload SVGO Test Suite locally. We're currently not using this on CI, but I may consider adding it in future.

On local, this reuses the already extracted SVGs if the ETag of the local copy and the version currently hosted on GitHub Pages matches.

Temporary File API

In most projects I've worked on recently, I favor using the temporary file API if the standard library supports it, rather than creating disposable/git-ignored files in the workspace.

In this PR, I use a temp directory to write the test report. I've also moved all regression fixtures, diffs, etc. to the temp directory as well.

I've confirmed that this works as expected on both Linux and Windows.

We log the location of the temporary directory we use to STDOUT, so that it's easy for a contributor to open this location and review the contents.

File Protocol in Playwright

Before, we used the http protocol because we needed to optimize SVGs during the comparison, and this seemed worthwhile over saving the optimized SVG to disk.

I had explored using the file protocol for the raw, and then using a data URI for the optimized SVG, however many of our SVGs are far too large to fit in the browsers URL bar.

However, the context has changed, as recently we decided to save the optimized SVG to disk before we invoke compare.js. As we have both the raw and optimized SVG on disk, we might as well use file:// instead of http:// now.

This doesn't actually net a significant speed benefit, but it does allow us to remove a lot of web server cruft from compare.js which simplifies the script a lot.

Fixing Runs on Windows

In the last iteration, regression test results did not look as expected on Windows, because we specify POSIX style paths in our file lists, but then compare them to Windows styled paths at runtime.

This also updates it so that we map paths to POSIX style for:

  • Comparisons with file paths we've read from disk.
  • Any persisted data, such as the checksums in svgo-test-report.json.

Remove Timeout on Playwright

I've removed the timeout given to Playwright because we no longer call #optimize behind Playwright.

We invoke #optimize in one process, and use Playwright to render the SVG in another. Given we disable JavaScript, animations, etc., and Playwright has significantly less work to do since it doesn't have to wait on the optimization of SVG anymore, there seems to be no need for this.

We can consider adding a timeout for the invocations of #optimize in future. This may still be warranted, but I will not do this eagerly.

Related

@SethFalco SethFalco merged commit 10350e2 into svg:main Aug 25, 2025
14 checks passed
@SethFalco SethFalco deleted the 2160-ii branch August 25, 2025 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant