Skip to content

feat(tests): Update benchmark test to query newly introduced max_code_size() from fork config #1649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jochem-brouwer
Copy link
Member

@jochem-brouwer jochem-brouwer commented May 25, 2025

🗒️ Description

This PR expands the benchmark test which queries as many other accounts (EXTCODE* and CALL* operations) by also providing a max code size setting in the fork.

  • Adds max_code_size() to forks
  • Adds max_initcode_size() to forks
  • Edits benchmark test to calculate a correct gas limit w.r.t. deploying the to-be-attacked contracts vs. the actual contract

Note that the deposit costs for the EIP-7907 are much higher (relative) than the original costs. This includes the 200 gas/byte code deposit cost, but also the quadratic memory expansion cost.

The cost of querying one such contract (the "loop cost" of the benchmark) is calculated as 2687 gas/contract. The deposit however costs at least 52M gas due to the 200 gas/byte deposit cost of the 0x40000 bytes. So the cost to deposit such contract vs. querying it is about 20_000 times higher 👀

🔗 Related Issues

✅ Checklist

  • All: Set appropriate labels for the changes.
  • All: Considered squashing commits to improve commit history.
  • All: Added an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • Tests: All converted JSON/YML tests from ethereum/tests have been added to converted-ethereum-tests.txt.
  • Tests: A PR with removal of converted JSON/YML tests from ethereum/tests have been opened.
  • Tests: Included the type and version of evm t8n tool used to locally execute test cases: e.g., ref with commit hash or geth 1.13.1-stable-3f40e65.
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.

@jochem-brouwer jochem-brouwer changed the title Bench/worst bytecode raised max codes size feat(tests): Update benchmark test to query newly introduced max_code_size() from fork config May 25, 2025
@jochem-brouwer
Copy link
Member Author

@jsign I'm coming to this test from the EIP-7907 discussions (not zk related). I'm wondering how this benchmark should be ran? It thus includes the pre-setup of deploying these large contracts. It is clear that the final block contains the actual data we care about, but I'm not sure if test executors have a direct way to measure this. Should I just hack in a performance timer to measure the time it takes to execute the last block?

@jsign
Copy link
Collaborator

jsign commented May 26, 2025

@jsign I'm coming to this test from the EIP-7907 discussions (not zk related). I'm wondering how this benchmark should be ran? It thus includes the pre-setup of deploying these large contracts. It is clear that the final block contains the actual data we care about, but I'm not sure if test executors have a direct way to measure this. Should I just hack in a performance timer to measure the time it takes to execute the last block?

AFAIK, today test executors never have a special consideration for benchmark-like tests, so I think if you want to measure wall-clock time I think you prob have to do a hack yes.

Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, it just needs a rebase to fix CI and conflicts.

Thanks for implementing this :)

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented May 29, 2025

I'll get this one ready for review either tonight or tomorrow 😄 👍 (But also feel free to pick this up and get it over the finish line, this comment to edit the predeploy gas limit back to the original high limit: #1649 (comment) should be taken into account)

@jochem-brouwer jochem-brouwer force-pushed the bench/worst_bytecode_raised_max_codes_size branch from 1574a08 to d5b19d2 Compare May 30, 2025 22:35
@jochem-brouwer
Copy link
Member Author

Have also edited the other tests which now depend on the max code size to read it from the fork.

Could, once this is merged, all other PRs which have the MAX_CODE_SIZE now also read this from the fork?

@jsign also see my comment on the "changed" gas limit here: #1649 (comment). I think it is more clear to make this a formula, but I am also fine by hardcoding a very high gas limit.

@marioevz Typecheck correctly complains that my max_code_size() returns int | None and that thus arithmetics like > do not work. I'm not sure, because Frontier for instance has no code size limit and also forks before Shanghai do not have an initcode limit. To hardcode a super high constant is a possibility, but it feels incorrect. Thoughts? Feel free to push the update here also. (I'll mark it ready to review here since this seems to be the last thing to fix here)

@jochem-brouwer jochem-brouwer marked this pull request as ready for review May 30, 2025 22:39
@jsign
Copy link
Collaborator

jsign commented May 31, 2025

@jochem-brouwer, replied the thread we had! (but overall lgtm).

Do you mind remembering before merging (or when fully ready) to ping me so I fill these changed tests and I double check cycles as a last check to not see any difference?

Could, once this is merged, all other PRs which have the MAX_CODE_SIZE now also read this from the fork?

Agree, I can do that after this is merged.

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Jun 3, 2025

I want to get this PR merged, but I need to solve this int | None problem: marking Frontier/Chainstart to have a code size limit is incorrect, but marking it as None will make the typecheck complain (because it's not a number). Should we handle this in tests to default to some reasonable value? For instance make all code sizes default to the current code size of EIP-170 (24 KiB)

@marioevz
Copy link
Member

marioevz commented Jun 3, 2025

I want to get this PR merged, but I need to solve this int | None problem: marking Frontier/Chainstart to have a code size limit is incorrect, but marking it as None will make the typecheck complain (because it's not a number). Should we handle this in tests to default to some reasonable value? For instance make all code sizes default to the current code size of EIP-170 (24 KiB)

Makes sense to me, default from frontier should be 24 KiB, and every test that uses this parameter should take this into account, although I doubt it will even be a problem.

@jochem-brouwer jochem-brouwer force-pushed the bench/worst_bytecode_raised_max_codes_size branch from d5e6317 to a14cb3d Compare June 3, 2025 01:45
@jochem-brouwer
Copy link
Member Author

Before this gets merged, could @jsign gives this an explicit ✔️ after:

Do you mind remembering before merging (or when fully ready) to ping me so I fill these changed tests and I double check cycles as a last check to not see any difference?

To see if I did not mess up any tests 😄 👍

Ready for review! 🚀

@jochem-brouwer jochem-brouwer requested review from jsign and marioevz June 3, 2025 01:47
@jochem-brouwer
Copy link
Member Author

I removed the EIP 7907 config from Osaka also, for experimenting this is fine but I don't think we want to activate it by default (?)

@jsign
Copy link
Collaborator

jsign commented Jun 3, 2025

Before this gets merged, could @jsign gives this an explicit ✔️ after:

Do you mind remembering before merging (or when fully ready) to ping me so I fill these changed tests and I double check cycles as a last check to not see any difference?

To see if I did not mess up any tests 😄 👍

Ready for review! 🚀

I'll do a full run of current master vs this branch to compare them and report back!

@jsign
Copy link
Collaborator

jsign commented Jun 3, 2025

@jochem-brouwer, something feels odd since seems like filling tests in this branch takes way longer than in main (still waiting to finish).

What I'm using to fill: uv run --from=Cancun --until=Cancun -m "zkevm and blockchain_test" --block-gas-limit 36000000 -n auto (use -n 8 if you don't have a lot of RAM)

I think that shouldn't be the case, no?

@jochem-brouwer
Copy link
Member Author

Definitely not! I do not have time to look at it today, but will pick this up tomorrow.

What filler are you using? Standard EELS? How many tests does this pick up?

The gas limit is set as 36M which is reasonable this might therefore also have found a performance degradation in the filler?

However my hunch says it has to do with the tests I raised the gas limit above the setting you provided. Maybe I messed up the calculation and it generates way too many contracts and because of the raised gas limit this is thus possible? Likely has to do with a test with elevated gas limits.

@jsign
Copy link
Collaborator

jsign commented Jun 3, 2025

Definitely not! I do not have time to look at it today, but will pick this up tomorrow.

What filler are you using? Standard EELS? How many tests does this pick up?

I'm using geth as a filler, but in theory shouldn't matter since I also used it for the main filling.

However my hunch says it has to do with the tests I raised the gas limit above the setting you provided. Maybe I messed up the calculation and it generates way too many contracts and because of the raised gas limit this is thus possible? Likely has to do with a test with elevated gas limits.

Yep, I suspect the same. Probably related with the only test that now calculates differently the number of contracts and maybe something odd there!

Copy link
Collaborator

@jsign jsign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compared cycles and all looks good 👍 (assuming latest Mario code change is applied)

@jochem-brouwer jochem-brouwer force-pushed the bench/worst_bytecode_raised_max_codes_size branch from 608a439 to 0ed1dc1 Compare June 4, 2025 05:26
@jochem-brouwer
Copy link
Member Author

Once CI passes I will merge this one, the rebase conflicts with the max code sizes are getting annoying. After merge thus read max code size from fork instead of using a constant (will be enforced (likely) on existing files but should also be done on new test files)

@jochem-brouwer jochem-brouwer merged commit ee9b84d into ethereum:main Jun 4, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants