Conversation

@Lorfdail
Contributor

A small group of changes that, at least locally, reduce the regression test runtime by about 14%, measured over a small set of 3 runs before and after.

As noted in regression.js, this may lead to a failure of the test pipeline if pixelmatch is picky about the data sizes, but it should otherwise be a safe change, even for future tests. The particular motivation for this change was the large PNGs usually allocated for even the smallest SVGs in the test data, which increases both the time spent allocating and the time pixelmatch takes for the comparison.
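
For context, a minimal sketch of the comparison step under discussion, assuming the two renders are decoded with pngjs; `countDiffPixels` is a hypothetical helper for illustration, not code from this PR:

```js
const { PNG } = require('pngjs');
const pixelmatch = require('pixelmatch');

// Count differing pixels between two rendered PNGs. pixelmatch requires both
// buffers to share the exact same dimensions, which is the "picky about the
// data sizes" caveat mentioned above.
function countDiffPixels(pngBufferA, pngBufferB) {
  const a = PNG.sync.read(pngBufferA);
  const b = PNG.sync.read(pngBufferB);
  if (a.width !== b.width || a.height !== b.height) {
    // Dimensions diverged between the renders; report a full mismatch
    // rather than letting pixelmatch throw.
    return Infinity;
  }
  return pixelmatch(a.data, b.data, null, a.width, a.height);
}
```

Rendering at a smaller canvas size shrinks both buffers, so the allocation and the per-pixel comparison get cheaper together.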

The other changes improve not only the test runtime but also normal SVG processing, provided the input data has cases caught by removeHiddenElems, which also contains by far the most significant of the three changes. The baseline performance optimizations are very unlikely to cause any harm; it would take a specific set of input SVGs, consisting mostly of visibility checks on the element style, to impact performance negatively. In my local tests these changes gave an additional ~7-8% time reduction with the SVGO data set.
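
To illustrate the shape of such an optimization (a sketch only, assuming the cost lies in repeated style resolution; `makeHiddenCheck` and `resolveStyle` are hypothetical names, not SVGO's actual API):

```js
// Illustrative sketch: memoize per-element style resolution so repeated
// visibility checks don't redo the expensive work. `resolveStyle` stands in
// for whatever style computation the plugin actually performs.
function makeHiddenCheck(resolveStyle) {
  const cache = new WeakMap();
  return function isElementHidden(node) {
    let style = cache.get(node);
    if (style === undefined) {
      style = resolveStyle(node); // assumed expensive: walks ancestors, parses styles
      cache.set(node, style);
    }
    return (
      style.display === 'none' ||
      style.visibility === 'hidden' ||
      Number(style.opacity) === 0
    );
  };
}
```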

P.S. Your mileage may of course vary; I'm curious how the pipeline, as the intended target, will respond to this.

@SethFalco
Member

Just to be sure that this affects SVGO and the regression test results the way we expect, we'll review this after we start creating detailed regression test reports:

That way, we can be sure this is indeed just a performance optimization, and isn't affecting the output in any way.

@SethFalco
Member

SethFalco commented Aug 18, 2025

I've rebased this to resolve merge conflicts, since this also modified the regression test pipeline. I'd like to see how the proposal performs with my recent changes.

Just in case I did a botched job, I'm just noting the last commit reference of the original author, so we can always reset back: db93624

Some regression tests were failing before, though I'll have to separately check which ones. I would expect this to persist in the rebase, but we can investigate that separately.

For a baseline, this is the last report on main:

SVGO Test Suite Version: 30ba6479ef7570d3748a30bbbd39f8c7

▶ Test Results
              Match: 4,626 / 4,626
  Expected Mismatch: 155 / 155
            Ignored: 39 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 26m31s
  Peak Memory Alloc: 3.857 GiB

PR report:

SVGO Test Suite Version: 30ba6479ef7570d3748a30bbbd39f8c7

▶ Test Results
              Match: 4,624 / 4,626
  Expected Mismatch: 155 / 155
            Ignored: 37 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 22m41s
  Peak Memory Alloc: 3.913 GiB

▶ Expected match, but actually mismatched:
✖ wikimedia-commons/Aegean_sea_Anatolia_and_Armenian_highlands_regions_large_topographic_basemap.svg
✖ wikimedia-commons/Spain_languages-de.svg

@SethFalco
Member

SethFalco commented Aug 25, 2025

I've just merged the second iteration, #2160, and rebased this PR again.

Reference:

For a baseline, this is the last report on main (28m32s):

SVGO Test Suite Version: d962dcd84733a663afcd55bf0b55483b

▶ Test Results
              Match: 4,626 / 4,626
  Expected Mismatch: 155 / 155
            Ignored: 39 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 16m40s
  Peak Memory Alloc: 3.49 GiB

PR report (22m07s):

SVGO Test Suite Version: d962dcd84733a663afcd55bf0b55483b

▶ Test Results
              Match: 4,624 / 4,626
  Expected Mismatch: 155 / 155
            Ignored: 39 / 81
            Skipped: 1

▶ Metrics
        Bytes Saved: 746.644 MiB
         Time Taken: 15m37s
  Peak Memory Alloc: 3.488 GiB

▶ Expected match, but actually mismatched:
✖ wikimedia-commons/Aegean_sea_Anatolia_and_Armenian_highlands_regions_large_topographic_basemap.svg
✖ wikimedia-commons/Spain_languages-de.svg

The difference is much less pronounced in the test report, as this largely optimized the comparison step, while the metrics now only measure the optimization step. Refer to the test report headings, where I note the CI runtime duration, which includes the comparison step as well.

@valadaptive

How important are the performance metrics here?

Running a 15-minute regression test every time I want to make a change to svgo is intimidating. I have a work-in-progress PR to parallelize the optimization step in worker threads, but the performance metrics will be less accurate. I feel like "performance testing" and "regression testing" should be two separate things, since the former is a nice-to-have and the latter is a matter of correctness.
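
For reference, a minimal sketch of the worker-pool shape such a change could take, assuming a hypothetical `./optimize-worker.js` that optimizes one file per message and posts a reply when done; this is not the actual implementation from that work-in-progress PR:

```js
const { Worker } = require('node:worker_threads');
const os = require('node:os');

// Fan the files out over a small pool of workers, each pulling the next file
// from a shared queue as soon as it finishes the previous one.
function optimizeInWorkers(files) {
  const poolSize = Math.min(os.cpus().length, files.length);
  const queue = [...files];
  return Promise.all(
    Array.from({ length: poolSize }, () =>
      new Promise((resolve, reject) => {
        const worker = new Worker('./optimize-worker.js'); // hypothetical worker script
        const next = () => {
          const file = queue.shift();
          if (file === undefined) {
            worker.terminate().then(resolve, reject);
            return;
          }
          worker.postMessage(file);
        };
        worker.on('message', next); // worker replies once a file is done
        worker.on('error', reject);
        next();
      }),
    ),
  );
}
```

One caveat, matching the concern above: per-file timing measured inside workers contends for CPU with the other workers, so the aggregate time metric becomes noisier.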

@SethFalco
Member

@valadaptive

Thank you very much for considering to contribute to SVGO! 💪(•ᴗ•💪)

How important are the performance metrics here?

They are kind of important (from my perspective anyway!), but mostly in an "Oh shi~… Why does optimization suddenly take an hour?" kind of way. The revamp of our regression tests is primarily to be more informed of changes to files in the test suite.

In other words, while I do treat their secondary purpose as a bit of an abstract real-world performance test, I concede that implementation details or host environment/availability will reduce accuracy, even for the same implementation.

I actually wanted to track time taken for the contributor experience too! Of course, if this results in real-world SVGO performance improvements, that's a huge bonus. I 💯% agree that 15 minutes is pretty long, though, so I would like to get it down before expanding the test suite further.

It's completely acceptable if the number loses accuracy, imo! I'd just still expect it to raise an eyebrow if it ever jumps from 15 minutes to 40 minutes from a minor change, for example.
