Match or exceed v1 performance#580

Open
jhollinger wants to merge 20 commits into release-2.0 from jh/release-2.0-faster

Conversation

Contributor

@jhollinger jhollinger commented Apr 16, 2026

After V1's recent perf improvements, V2's perf was no longer looking so good. These changes make V2 competitive again:

  • Stopped using the extension API for conditionals and default values. It was a cool approach, but it put a ceiling on how far they could be optimized.
  • Fewer method calls and allocations during serialization.
  • Removed or simplified some extraneous features not present in V1, which also allowed removing a lot of tests.

Perf varies with the number of fields, objects, and collections. It can also vary depending on whether the top-level serialization is a collection or a single object (collections are faster b/c some overhead is shared across loops). Use of extensions also carries some overhead (none were used in the measurements below).

# Collection results (20 fields, 5 objects, 2 collections)
1,000 widgets 100x: V2 change: -54.44%
500 widgets 100x: V2 change: -57.05%
250 widgets 100x: V2 change: -56.86%
100 widgets 100x: V2 change: -57.49%
25 widgets 100x: V2 change: -58.59%
5 widgets 100x:  V2 change: -56.17%
1 widgets 100x:  V2 change: -49.90%

# Object results (20 fields, 5 objects, 2 collections)
1,000 widgets 100x: V2 change: -38.13%
500 widgets 100x: V2 change: -39.29%
250 widgets 100x: V2 change: -38.12%
100 widgets 100x: V2 change: -39.60%
25 widgets 100x: V2 change: -46.42%
5 widgets 100x:  V2 change: -51.04%
1 widgets 100x:  V2 change: -49.99%
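
For anyone who wants to reproduce numbers in this shape, here is a minimal sketch. It assumes the percentages are the relative change in elapsed time (negative meaning less time); the benchmark script itself isn't shown in this PR. `percent_change` and `time_runs` are hypothetical helpers, not part of the library.

```ruby
require "benchmark"

# Hypothetical helper: relative change in elapsed time, as a percentage.
# A negative result means `after` took less time than `before`.
def percent_change(before, after)
  ((after - before) / before.to_f) * 100.0
end

# Hypothetical helper: wall-clock seconds for `iterations` runs of a block.
def time_runs(iterations, &block)
  Benchmark.realtime { iterations.times(&block) }
end

# Stand-in workloads; a real comparison would render the same widgets
# with the two implementations being compared.
baseline  = time_runs(100) { (1..1_000).map { |i| { id: i, name: "w#{i}" } } }
candidate = time_runs(100) { (1..1_000).map { |i| { id: i } } }

puts format("change: %.2f%%", percent_change(baseline, candidate))
```

Wall-clock timings are noisy, so the printed figure will differ between runs and machines.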

One thing we lost here is the ability to have extensions that are initialized per-render. I think it's possible to add again, but it will be more complicated, so IMHO we should wait until someone needs it.

@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch 4 times, most recently from cf26f1a to 43f7db0 Compare April 16, 2026 19:21
@jhollinger jhollinger closed this Apr 18, 2026
@jhollinger jhollinger reopened this Apr 18, 2026
@jhollinger jhollinger changed the title Match v1 performance Match or exceed v1 performance Apr 18, 2026
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch 10 times, most recently from ecc17e3 to 9be161e Compare April 21, 2026 15:05
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch 6 times, most recently from f3556f1 to 6d2c7c1 Compare April 22, 2026 16:51
… internal details

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch 2 times, most recently from 98b34e4 to 5841499 Compare April 22, 2026 20:18
@jhollinger jhollinger marked this pull request as ready for review April 22, 2026 20:27
@jhollinger jhollinger requested a review from ritikesh as a code owner April 22, 2026 20:27
@jhollinger jhollinger requested a review from a team as a code owner April 22, 2026 20:27
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch 2 times, most recently from 30abb0f to 22c9256 Compare April 28, 2026 14:37
jhollinger and others added 4 commits April 28, 2026 10:37
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Add a context-free serialization path for the common case where no
extension hooks, conditionals, default values, formatters, or Proc
extractors are configured.

Previously, every call to `serialize` allocated a `Context::Field` and
a `Context::Parent` struct unconditionally — even when nothing in the
serialization loop ever read them. Together these accounted for ~22% of
all object allocations and caused V2 to trigger 2× more GC runs than V1
under the same workload.
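
A rough sketch of how a per-call struct allocation shows up in allocation counts, using Ruby's `GC.stat(:total_allocated_objects)`. `count_allocations` and `AllocCtx` are hypothetical stand-ins for illustration, not the profiler or the `Context::Field` struct used for the numbers in this message.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Stand-in for a per-call context struct.
AllocCtx = Struct.new(:object, :field_name)

# One struct per iteration, even though nothing ever reads it.
with_ctx = count_allocations do
  1_000.times { |i| AllocCtx.new(i, :name) }
end

# Same loop without the struct.
without_ctx = count_allocations do
  1_000.times { |i| i }
end

puts "with ctx: #{with_ctx}, without ctx: #{without_ctx}"
```

Exact counts vary by Ruby version, but the first loop should report at least one extra object per iteration.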

Changes:
- Add `Extractors::Property.extract_simple` to extract field values
  directly from an object/hash without a Context::Field
- Precompute `@needs_field_ctx` at blueprint load time in
  `find_used_hooks!` (requires `finalize_fields!` to run first, hence
  the reorder in `initialize`)
- Branch to `serialize_fast` when `@needs_field_ctx` is false, skipping
  the Context::Field allocation entirely
- Make `Context::Parent` lazy in both paths — created at most once per
  `serialize` call, only when an association field is actually encountered

Result on 500 widgets × 50 iterations (30 fields, 10 object associations,
5 collection associations):
  - Allocations:      −23% (3.3M → 2.56M objects)
  - GC runs:          −26% (85 → 63)
  - Context::Field:   eliminated from hot path (75k → 0 samples)
  - Context::Parent:  eliminated from hot path (75k → ~10 samples)

V2 now allocates fewer objects than V1 for the common case.
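
The branching described above could look roughly like this. It is a toy sketch, not the actual implementation: `SketchSerializer`, `SketchFieldCtx`, and the option names are assumptions for illustration, and only the "decide once at load time, then skip the context on the fast path" idea is taken from this message. The lazy `Context::Parent` part is not shown.

```ruby
# Toy stand-in for a per-field context; not the library's Context::Field.
SketchFieldCtx = Struct.new(:object, :field_name)

class SketchSerializer
  # fields: Hash of field name => options, e.g. { if: ->(ctx) { ... } }
  def initialize(fields)
    @fields = fields
    # Computed once up front instead of on every serialize call.
    @needs_field_ctx = fields.values.any? do |opts|
      opts.key?(:if) || opts.key?(:default)
    end
  end

  def serialize(object)
    @needs_field_ctx ? serialize_with_ctx(object) : serialize_fast(object)
  end

  private

  # Reads a value straight from a Hash or an object, no context needed.
  def extract_simple(object, name)
    object.is_a?(Hash) ? object[name] : object.public_send(name)
  end

  # Fast path: no context struct is allocated at all.
  def serialize_fast(object)
    @fields.each_key.with_object({}) do |name, out|
      out[name] = extract_simple(object, name)
    end
  end

  # Slow path: one context per call, reused across fields.
  def serialize_with_ctx(object)
    ctx = SketchFieldCtx.new(object, nil)
    @fields.each_with_object({}) do |(name, opts), out|
      ctx.field_name = name
      next if opts[:if] && !opts[:if].call(ctx)

      value = extract_simple(object, name)
      out[name] = value.nil? ? opts[:default] : value
    end
  end
end

SketchWidget = Struct.new(:id, :name)

plain = SketchSerializer.new({ id: {}, name: {} })
plain_result = plain.serialize(SketchWidget.new(1, "a")) # takes the fast path

conditional = SketchSerializer.new(
  { id: {}, name: { if: ->(ctx) { ctx.object.id > 1 }, default: "n/a" } }
)
skipped   = conditional.serialize(SketchWidget.new(1, nil))
defaulted = conditional.serialize(SketchWidget.new(2, nil))
```

The point of precomputing the flag is that the common case pays for a single boolean check per call rather than a struct allocation.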

Made-with: Cursor
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Reduce V2 serializer allocations by ~23% via fast path
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 22c9256 to 6057122 Compare April 28, 2026 14:37
…le* speed but should be easier to test & maintain

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 68bcf6d to 39cb22a Compare April 28, 2026 15:41
…wn blueprint options

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 0bdcc75 to 719637c Compare April 28, 2026 15:58
…rnal but prefix private fields with an "_"

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 9d5c4be to 6b70e88 Compare April 28, 2026 19:51
…lds`

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 8e86a45 to 4da5c95 Compare April 29, 2026 20:08
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Will make Procs passed to the if/unless/default/etc. options, as well as field blocks, behave more like V1's. They can still access the Blueprint instance through `ctx`. Format blocks still use instance_exec, b/c otherwise they couldn't access the Blueprint. That's new functionality, so there are no compatibility concerns.
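
To illustrate the difference in plain Ruby: a called Proc keeps the `self` from where it was defined, while `instance_exec` rebinds `self` to the receiver. `SketchBlueprint` and `SketchCtx` below are hypothetical stand-ins, not the library's classes.

```ruby
class SketchBlueprint
  def helper
    "from blueprint"
  end
end

# Stand-in for the ctx object that is passed to Procs.
SketchCtx = Struct.new(:blueprint, :object)

blueprint = SketchBlueprint.new
ctx = SketchCtx.new(blueprint, { name: "widget" })

# Plain call: self is unchanged, so the Blueprint is only reachable
# through the ctx argument.
via_call = ->(c) { c.blueprint.helper }
called = via_call.call(ctx)

# instance_exec: self becomes the Blueprint instance, so its methods
# can be called directly inside the block.
via_exec = proc { |c| "#{helper}: #{c.object[:name]}" }
executed = blueprint.instance_exec(ctx, &via_exec)

puts called
puts executed
```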

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 0232c77 to ca1a0e9 Compare April 30, 2026 00:54
Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
…il after that hook runs

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
…sn't make sense to allow classes or procs

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 499e3ad to 01c168d Compare May 1, 2026 13:10
jhollinger added 2 commits May 1, 2026 09:29
… place now

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
…e time/memory)

Signed-off-by: Jordan Hollinger <jordan.hollinger@procore.com>
@jhollinger jhollinger force-pushed the jh/release-2.0-faster branch from 01c168d to aad5212 Compare May 1, 2026 13:37

2 participants