GDScript: Improve if statement performance when using and/or operators#120660
aurpine wants to merge 4 commits into
gen->start_expr_cond_buffer(false);
gen->start_expr_cond_buffer(true);
Why is there a false followed by a true call here?
One call is for the if branch and one is for the else branch, which lets us patch both sets of jumps. The true jumps go to the if block; the false jumps go to the else block, or to the end of the statement if no else is supplied.
Any statement calls both start buffers. The and and or operators do not call both, because some jumps are left for this call site to patch (essentially a shortcut).
virtual void write_start(GDScript *p_script, const StringName &p_function_name, bool p_static, Variant p_rpc_config, const GDScriptDataType &p_return_type) override;
virtual GDScriptFunction *write_end() override;

virtual void start_expr_cond_buffer(bool success) override;
We might want to consider using a two-value enum here instead of bool success. This makes calls easier to read out of context. Consider:
start_expr_cond_buffer(BranchType.TAKEN)
vs
start_expr_cond_buffer(true)
This also avoids needing to document the meaning of bool success in multiple places.
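A rough C++ sketch of what this suggestion could look like, offered as an illustration rather than code from the PR. The enum name BranchType and the value TAKEN come from the comment above; NOT_TAKEN, the CondBufferRecorder stand-in, and the mapping of false to NOT_TAKEN are assumptions made here. In C++ the value would be spelled BranchType::TAKEN.

```cpp
#include <cassert>
#include <vector>

// Two-value enum replacing `bool success`. NOT_TAKEN is a hypothetical name.
enum class BranchType { TAKEN, NOT_TAKEN };

// Hypothetical stand-in for the code generator: it only records which
// buffers were started, so the two call styles can be compared.
struct CondBufferRecorder {
    std::vector<BranchType> started;

    void start_expr_cond_buffer(BranchType p_branch) {
        started.push_back(p_branch);
    }
};

// Mirrors the call site that starts both buffers (false first, then true).
inline CondBufferRecorder start_both_buffers() {
    CondBufferRecorder gen;
    gen.start_expr_cond_buffer(BranchType::NOT_TAKEN); // was: start_expr_cond_buffer(false)
    gen.start_expr_cond_buffer(BranchType::TAKEN);     // was: start_expr_cond_buffer(true)
    assert(gen.started.size() == 2);
    return gen;
}
```

With a scoped enum, passing a bare true or false no longer compiles, so each call site has to state which branch it means.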
That makes sense. We could also add a third state for both, though it would only be used in that one case.
In the future we may need to flip these when implementing not, although that should not be a problem with enums.
Overview
Improves GDScript if statement performance when and and or operators are used at the top level of the condition (including nested operators). Performance testing shows up to a 2-3x speed-up in the most basic case. Note that this does not account for the time spent calculating the conditions themselves, so real-world performance gains are likely lower.
The change optimizes the compiler's bytecode generation to remove one boolean assignment and one conditional jump per operator used.
Problem
The compiler treats expressions as a black box when used in a conditional statement. Every logical operator assigns its result to a temporary variable which is then used by branching (or other operators!). Consider the simple statement
This compiles into the following bytecode.
If we were to decompile this, it would currently look as if it came from the following
Instead, we should try to make it look like it came from
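As a hypothetical illustration of the two shapes described in this section (written in C++ for a condition equivalent to a and b, and not taken from the PR's own example), the first function materializes the result in a temporary and tests it again, while the second jumps straight to the destination block. The function names are invented for this sketch.

```cpp
#include <cassert>

// Shape 1: the operator's result is stored in a temporary, which the
// branch then has to test again.
inline int branch_via_temp(bool a, bool b) {
    bool tmp = false; // boolean assignment
    if (a) {
        if (b) {
            tmp = true;
        }
    }
    if (tmp) {        // extra conditional jump on the temporary
        return 1;     // stands in for the "if" block
    }
    return 0;         // stands in for the "else" block
}

// Shape 2: each operand check leads directly to its destination block.
inline int branch_direct(bool a, bool b) {
    if (a) {
        if (b) {
            return 1; // "if" block
        }
    }
    return 0;         // "else" block
}

// Both shapes must agree for every input combination.
inline void check_shapes_agree() {
    for (int m = 0; m < 4; ++m) {
        bool a = (m & 1) != 0;
        bool b = (m & 2) != 0;
        assert(branch_via_temp(a, b) == branch_direct(a, b));
    }
}
```

The behaviour is identical; the difference is only in how many assignments and conditional jumps sit between evaluating the operands and entering the block.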
Solution
Instead of jumping to set a variable, we will directly jump to either the if branch or the else branch (if applicable).
With the fix applied, the generated bytecode for the previous example is
The bytecode is 11 slots shorter for every operator used. This saves a net of one conditional jump and one assignment (possibly two with the cleanup).
Implementation
We maintain two stacks of jumps for success and failure. These may be partially patched when the operator needs to do additional checks. Otherwise, some are left to the caller to patch directly to the if/else branch.
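The idea of keeping pending success and failure jumps and patching them later resembles textbook backpatching. The following is a minimal C++ sketch of that general technique, not the PR's implementation: the opcodes, the Expr tree, and every name in it are hypothetical, and unlike the PR it emits an unconditional jump after each operand instead of relying on fall-through.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical mini instruction set, unrelated to Godot's real opcodes.
enum class Op { JUMP_IF, JUMP, RET };
struct Instr {
    Op op;
    int arg;            // JUMP_IF: input index, RET: value to return
    std::size_t target; // jump destination, patched later
};

struct Expr {
    enum Kind { VAR, AND, OR } kind;
    int var; // only used when kind == VAR
    std::shared_ptr<Expr> lhs, rhs;
};
using ExprPtr = std::shared_ptr<Expr>;
inline ExprPtr var(int i) { return std::make_shared<Expr>(Expr{Expr::VAR, i, nullptr, nullptr}); }
inline ExprPtr and_(ExprPtr l, ExprPtr r) { return std::make_shared<Expr>(Expr{Expr::AND, 0, l, r}); }
inline ExprPtr or_(ExprPtr l, ExprPtr r) { return std::make_shared<Expr>(Expr{Expr::OR, 0, l, r}); }

// Pending jumps whose destination is not known yet.
struct JumpLists {
    std::vector<std::size_t> on_true;
    std::vector<std::size_t> on_false;
};

struct Compiler {
    std::vector<Instr> code;

    void patch(const std::vector<std::size_t> &jumps, std::size_t target) {
        for (std::size_t j : jumps) code[j].target = target;
    }

    JumpLists cond(const ExprPtr &e) {
        if (e->kind == Expr::VAR) {
            JumpLists out;
            out.on_true.push_back(code.size());
            code.push_back({Op::JUMP_IF, e->var, 0});
            out.on_false.push_back(code.size());
            code.push_back({Op::JUMP, 0, 0});
            return out;
        }
        const bool is_and = e->kind == Expr::AND;
        JumpLists l = cond(e->lhs);
        // and: lhs success continues into rhs; or: lhs failure does.
        patch(is_and ? l.on_true : l.on_false, code.size());
        JumpLists out = cond(e->rhs);
        // The other lhs jumps stay pending for the caller to patch.
        std::vector<std::size_t> &dst = is_and ? out.on_false : out.on_true;
        const std::vector<std::size_t> &src = is_and ? l.on_false : l.on_true;
        dst.insert(dst.end(), src.begin(), src.end());
        return out;
    }

    // if (e) return 1; else return 0;
    void if_else(const ExprPtr &e) {
        JumpLists j = cond(e);
        patch(j.on_true, code.size());  // success jumps land on the if block
        code.push_back({Op::RET, 1, 0});
        patch(j.on_false, code.size()); // failure jumps land on the else block
        code.push_back({Op::RET, 0, 0});
    }
};

// Tiny interpreter for the emitted code.
inline int run(const std::vector<Instr> &code, const std::vector<bool> &in) {
    std::size_t pc = 0;
    while (true) {
        assert(pc < code.size());
        const Instr &i = code[pc];
        switch (i.op) {
            case Op::JUMP_IF: pc = in[i.arg] ? i.target : pc + 1; break;
            case Op::JUMP: pc = i.target; break;
            case Op::RET: return i.arg;
        }
    }
}

inline int eval_if(const ExprPtr &e, const std::vector<bool> &in) {
    Compiler c;
    c.if_else(e);
    return run(c.code, in);
}
```

No temporary is ever written: every operand check ends in a jump whose target is either the next operand or one of the two blocks, and the statement-level caller resolves whatever the operators leave pending.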
gdscript_byte_codegen contains the helpers to write the bytecode. These are pretty low level jumps.
gdscript_compiler is where the new _parse_expr_cond function uses the new helpers to create and set the jumps.
Explanation on the helpers:
start_expr_cond_buffer
flush_expr_cond_buffer … start_expr_cond_buffer to the current position.
write_expr_cond_jump_if
write_expr_cond_jump
write_expr_cond_jump_end … write_else).
write_expr_cond_end … write_endif)
Benchmarks
To keep the benchmark simple, only two cases are run, each with both boolean values set to false. The runtime scales with the number of necessary jump checks.
Scenario A (one check)
Scenario B (two checks)
Testing code
All tests are run on Windows 11 with release templates. The PR is built with
scons platform=windows target=template_release lto=full production=yes use_mingw=yes.
The scenarios are run in a loop with N=100_000_000. Each is run 10 times and averaged, with one run discarded as warm-up. Because if statements are so fast, the loop overhead is significant. Subtracting the loop overhead gives a more accurate picture of the performance improvement.
Future Work
We can add the changes to the following statements:
while
match
The changes can be extended to the following expression types:
x if cond else y
not
==, !=, >, >=, <, <=