Minor optimizations to reduce regression test runtime #2135
Conversation
Just to be sure that this affects SVGO and the regression test results the way we expect, we'll review this after we start creating detailed regression test reports. That way, we can be sure this is indeed just a performance optimization and isn't affecting the output in any way.
Force-pushed from db93624 to 6a01878
I've rebased this to resolve merge conflicts, since this PR also modified the regression test pipeline. I'd like to see how the proposal performs with my recent changes. In case I did a botched job, I'm noting the original author's last commit reference so we can always reset back: db93624. Some regression tests were failing before, though I'll have to check separately which ones; I'd expect those failures to persist through the rebase, but we can investigate that separately. For a baseline, this is the last report on this PR:
Force-pushed from 6a01878 to d523322
…imply limiting the screenshot area of the output to the rendered svg
…ributes to check for once
…nd has been matched successfully
… have not caught on
Force-pushed from d523322 to b0d1d01
I've just merged the second iteration #2160, and have rebased this PR again. Reference: For a baseline, this is the last report on this PR (22m07s): The difference is much less pronounced in the test report, since this PR largely optimized the comparison step, while the metrics now only measure the optimization step. Refer to the test report headings, where I note the CI runtime duration, which includes the comparison step as well.
How important are the performance metrics here? Running a 15-minute regression test every time I want to make a change to svgo is intimidating. I have a work-in-progress PR to parallelize the optimization step in worker threads, but it will make the performance metrics less accurate. I feel like "performance testing" and "regression testing" should be two separate things, since the former is a nice-to-have and the latter is a matter of correctness.
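For illustration, here is a minimal sketch of what moving the optimization step into worker threads could look like. This is not the actual work-in-progress PR: the file name, the `optimizeInParallel` helper, and the round-robin chunking are all assumptions layered on Node's `node:worker_threads` and svgo's `optimize()`.

```js
// parallel-optimize.js — an illustrative sketch, not the actual WIP PR.
// Splits the regression fixtures across worker threads, each running
// svgo's optimize() on its share of the files.
const { Worker, isMainThread, parentPort, workerData } = require('node:worker_threads');

if (isMainThread) {
  const os = require('node:os');

  function optimizeInParallel(files) {
    const threads = Math.min(os.cpus().length, files.length);
    // Round-robin: one chunk of files per worker.
    const chunks = Array.from({ length: threads }, (_, i) =>
      files.filter((_, j) => j % threads === i),
    );
    return Promise.all(
      chunks.map(
        (chunk) =>
          new Promise((resolve, reject) => {
            const worker = new Worker(__filename, { workerData: chunk });
            worker.once('message', resolve);
            worker.once('error', reject);
          }),
      ),
    ).then((parts) => parts.flat());
  }

  module.exports = { optimizeInParallel };
} else {
  const fs = require('node:fs');
  const { optimize } = require('svgo');

  // Each worker optimizes its chunk and posts the results back.
  parentPort.postMessage(
    workerData.map((file) => ({
      file,
      data: optimize(fs.readFileSync(file, 'utf8'), { path: file }).data,
    })),
  );
}
```

The accuracy trade-off mentioned above follows from this shape: wall-clock time for the whole run shrinks, but per-file timings measured inside busy workers pick up scheduler noise.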
Thank you very much for considering contributing to SVGO! 💪(•ᴗ•💪)
They are kind of important (from my perspective anyway!), but mostly in an "Oh shi~… Why does optimization suddenly take an hour?" kind of way. The revamp of our regression tests is primarily to be better informed of changes to files in the test suite. In other words, while I do treat their secondary purpose as a bit of an abstract real-world performance test, I concede that implementation details or host environment/availability will reduce accuracy even for the same implementation. I actually wanted to track the time taken for contributor experience too! Of course, if this results in real-world SVGO performance improvements, that's a huge bonus. I 💯% agree that 15 minutes is pretty long, so I would like to get it down before expanding the test suite further. It's completely acceptable if the number loses accuracy, imo! I'd still expect it to raise an eyebrow if it ever jumps from 15 minutes to 40 minutes from a minor change, for example.
A small group of changes that, at least locally, reduce the regression test runtime by about 14%, measured over a small set of 3 runs before and after.
As noted in regression.js, this may lead to a failure of the test pipeline if pixelmatch is picky about the data sizes, but otherwise it should be a safe change, even for future tests. The motivation for this change in particular was the large PNGs usually allocated for even the smallest SVGs in the test data, which increases both the time spent allocating and the time pixelmatch takes for the comparison.
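As a sketch of the screenshot-area idea, assuming a Playwright-style page API for the rendering step and pngjs for decoding (the `compare` helper and its URL parameters are hypothetical; only pixelmatch is named in this PR):

```js
// Sketch: clip the screenshot to the rendered <svg> instead of capturing the
// full default viewport, so fewer pixels are allocated and compared.
const pixelmatch = require('pixelmatch');
const { PNG } = require('pngjs');

async function compare(page, originalUrl, optimizedUrl) {
  // Measure the rendered SVG once and reuse the clip for both screenshots.
  await page.goto(originalUrl);
  const box = await page.locator('svg').boundingBox();
  const clip = { x: 0, y: 0, width: Math.ceil(box.width), height: Math.ceil(box.height) };

  const before = PNG.sync.read(await page.screenshot({ clip }));
  await page.goto(optimizedUrl);
  const after = PNG.sync.read(await page.screenshot({ clip }));

  // pixelmatch throws if either buffer disagrees with width * height * 4,
  // which is the "picky about the data sizes" caveat noted above.
  return pixelmatch(before.data, after.data, null, clip.width, clip.height);
}
```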
The other changes improve not only the test runtime but also normal SVG processing, given input data with cases caught by removeHiddenElems, which also accounts for by far the most significant of the three changes. All the changes made for baseline performance optimization are very unlikely to cause any harm; it would take specific sets of input SVGs, consisting mostly of visibility checks on element style, to impact performance negatively. In my local tests these changes gave an additional ~7-8% time reduction with the svgo data set.
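To make the removeHiddenElems point concrete, here is a hypothetical illustration of the caching idea: resolve each element's computed style once and reuse it across visibility checks. The `cachedStyle`/`isHidden` wrappers and the WeakMap are my own sketch, assuming svgo's internal `lib/style.js` helpers (`stylesheet` would come from `collectStylesheet(root)`, built once per document); this is not the actual diff.

```js
// Sketch: memoize computeStyle() per element so repeated visibility checks
// (display, visibility, …) don't re-resolve the stylesheet every time.
const { computeStyle } = require('../lib/style.js'); // path as used inside the svgo repo

const styleCache = new WeakMap();

function cachedStyle(stylesheet, node) {
  let computed = styleCache.get(node);
  if (computed == null) {
    computed = computeStyle(stylesheet, node);
    styleCache.set(node, computed);
  }
  return computed;
}

// In the spirit of removeHiddenElems: an element counts as hidden when its
// resolved style is statically display:none or visibility:hidden.
function isHidden(stylesheet, node) {
  const { display, visibility } = cachedStyle(stylesheet, node);
  return (
    (display?.type === 'static' && display.value === 'none') ||
    (visibility?.type === 'static' && visibility.value === 'hidden')
  );
}
```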
P.S. Your mileage may of course vary; I'm curious how the pipeline will respond to this, as it's the intended target.