Skip to content

DI: Bypass subclass overrides of Exception#backtrace_locations via UnboundMethod#5521

Open
p-datadog wants to merge 18 commits intomasterfrom
di-c-ext-exception-backtrace
Open

DI: Bypass subclass overrides of Exception#backtrace_locations via UnboundMethod#5521
p-datadog wants to merge 18 commits intomasterfrom
di-c-ext-exception-backtrace

Conversation

@p-datadog
Copy link
Copy Markdown
Member

@p-datadog p-datadog commented Mar 27, 2026

What does this PR do?

Adds DI::EXCEPTION_BACKTRACE_LOCATIONS, an UnboundMethod captured from Exception.instance_method(:backtrace_locations) at load time. Used via bind(exception).call to get the backtrace without dispatching through the exception's method table, bypassing subclass overrides.

Switches from Exception#backtrace (Array) to Exception#backtrace_locations (ArrayThread::Backtrace::Location). format_backtrace now reads .path, .lineno, .label directly from Location objects instead of regex-parsing formatted strings.

Follows up on this review comment on #5111.

Motivation:

DI instrumentation must never invoke customer code. serialize_throwable was calling exception.backtrace through Ruby method dispatch, which would hit subclass overrides.

The UnboundMethod approach bypasses subclass method tables entirely. It does not protect against monkeypatching Exception#backtrace_locations itself before dd-trace-rb loads, but the practical threat is subclass overrides, which it handles correctly.

Change log entry

None.

Additional Notes:

Follow-up to #5111.

How to test the change?

  • spec/datadog/di/ext/exception_backtrace_spec.rb — tests UnboundMethod bypass of subclass overrides
  • spec/datadog/di/probe_notification_builder_spec.rb — integration test verifying stacktrace formatting with Location objects

@p-datadog p-datadog added the AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos label Mar 27, 2026
@github-actions github-actions bot added the profiling Involves Datadog profiling label Mar 27, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

Typing analysis

Note: Ignored files are excluded from the next sections.

steep:ignore comments

This PR introduces 1 steep:ignore comment, and clears 1 steep:ignore comment.

steep:ignore comments (+1-1)Introduced:
lib/datadog/di/probe_notification_builder.rb:137
Cleared:
lib/datadog/di/probe_notification_builder.rb:141

Untyped methods

This PR introduces 2 untyped methods and 7 partially typed methods, and clears 2 untyped methods and 7 partially typed methods. It increases the percentage of typed methods from 61.69% to 61.71% (+0.02%).

Untyped methods (+2-2)Introduced:
sig/datadog/di/probe_notification_builder.rbs:51
└── def tags: () -> untyped
sig/datadog/di/probe_notification_builder.rbs:53
└── def serialized_tags: () -> untyped
Cleared:
sig/datadog/di/probe_notification_builder.rbs:50
└── def tags: () -> untyped
sig/datadog/di/probe_notification_builder.rbs:52
└── def serialized_tags: () -> untyped
Partially typed methods (+7-7)Introduced:
sig/datadog/di/probe_notification_builder.rbs:32
└── def build_snapshot_base: (Context context, ?evaluation_errors: Array[untyped]?, ?captures: untyped?, ?message: String?) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:34
└── def build_condition_evaluation_failed: (Context context, untyped expr, untyped exception) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:36
└── def build_status: (Probe probe, message: untyped, status: untyped, ?exception: Exception?) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:38
└── def format_caller_locations: (Array[untyped] callers) -> Array[Hash[Symbol,untyped]]
sig/datadog/di/probe_notification_builder.rbs:40
└── def evaluate_template: (untyped template, Context context) -> [String, Array[String]]
sig/datadog/di/probe_notification_builder.rbs:42
└── def tag_process_tags!: (Hash[Symbol | String, untyped] payload, Core::Configuration::Settings settings) -> void
sig/datadog/di/probe_notification_builder.rbs:46
└── def get_local_variables: (TracePoint trace_point) -> Hash[Symbol,untyped]
Cleared:
sig/datadog/di/probe_notification_builder.rbs:31
└── def build_snapshot_base: (Context context, ?evaluation_errors: Array[untyped]?, ?captures: untyped?, ?message: String?) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:33
└── def build_condition_evaluation_failed: (Context context, untyped expr, untyped exception) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:35
└── def build_status: (Probe probe, message: untyped, status: untyped, ?exception: Exception?) -> Hash[Symbol,untyped]
sig/datadog/di/probe_notification_builder.rbs:37
└── def format_caller_locations: (Array[untyped] callers) -> Array[Hash[Symbol,untyped]]
sig/datadog/di/probe_notification_builder.rbs:39
└── def evaluate_template: (untyped template, Context context) -> [String, Array[String]]
sig/datadog/di/probe_notification_builder.rbs:41
└── def tag_process_tags!: (Hash[Symbol | String, untyped] payload, Core::Configuration::Settings settings) -> void
sig/datadog/di/probe_notification_builder.rbs:45
└── def get_local_variables: (TracePoint trace_point) -> Hash[Symbol,untyped]

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept on the line before the definition to remove it from the stats.

@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Mar 27, 2026

Benchmarks

Benchmark execution time: 2026-03-30 13:08:52

Comparing candidate commit 59fe0b5 in PR branch di-c-ext-exception-backtrace with baseline commit 9220fe8 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 46 metrics, 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

Same pattern as DI.exception_message: reads the internal `bt` ivar
directly via rb_ivar_get, bypassing any Ruby-level override of
Exception#backtrace. This ensures DI instrumentation never invokes
customer code when serializing exception data.

- Added exception_backtrace to ext/libdatadog_api/di.c
- Updated serialize_throwable to use DI.exception_backtrace
- Added RBS signature
- Added unit tests for the C extension method
- Added integration test for backtrace override bypass

Co-Authored-By: Claude <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the di-c-ext-exception-backtrace branch from 0aa8294 to d979c2b Compare March 27, 2026 18:10
The raw `bt` ivar on exceptions stores a Thread::Backtrace object,
not an Array<String>. Ruby's Exception#backtrace converts it lazily
via rb_backtrace_to_str_ary. On newer Ruby versions, `bt` may even
be nil with the actual data in `bt_locations` (lazy evaluation).

The original implementation returned the raw ivar value, which caused:
- Thread::Backtrace returned instead of Array (broke format_backtrace)
- nil returned for raised exceptions on newer Ruby (lazy evaluation)

Fix by replicating Ruby's conversion logic:
1. If bt is Array (set via set_backtrace), return as-is
2. If bt is Thread::Backtrace, convert via rb_backtrace_to_str_ary
3. If bt is nil, check bt_locations and convert if present

rb_backtrace_p and rb_backtrace_to_str_ary are Ruby internal C
functions (vm_backtrace.c), not customer code — safe to call from
DI instrumentation context.

Co-Authored-By: Claude <noreply@anthropic.com>
@p-datadog p-datadog force-pushed the di-c-ext-exception-backtrace branch from 74a5efb to 59efad8 Compare March 27, 2026 18:35
p-ddsign and others added 7 commits March 27, 2026 14:36
Style/RedundantBegin: the begin/rescue inside do/end blocks is
redundant — the block itself can contain rescue directly.

Co-Authored-By: Claude <noreply@anthropic.com>
Add test for the Array code path in the C extension, exercised when
Exception#set_backtrace has been called. This covers the RB_TYPE_P(bt,
T_ARRAY) early return that wasn't previously tested.

Also fix formatting: split keyword args onto separate lines for
consistency in probe_notification_builder_spec.rb.

Co-Authored-By: Claude <noreply@anthropic.com>
…tions

rb_backtrace_p and rb_backtrace_to_str_ary are not exported symbols
in Ruby's shared library, causing "undefined symbol: rb_backtrace_p"
at runtime on all Ruby versions.

Replace with UnboundMethod approach: capture
Exception.instance_method(:backtrace) once at init time, then use
bind+call to invoke the original C implementation on any exception.
This bypasses customer overrides (the UnboundMethod is captured from
Exception itself) while using only public Ruby API.

Uses bind + call (not bind_call) for Ruby 2.6 compatibility.
The UnboundMethod is registered with rb_gc_register_mark_object to
prevent GC collection.

Co-Authored-By: Claude <noreply@anthropic.com>
rb_backtrace_p and rb_backtrace_to_str_ary are internal Ruby functions
(vm_backtrace.c) that may not be exported as dynamic symbols. The
previous commit declared prototypes manually, which compiled but
failed at runtime with "undefined symbol: rb_backtrace_p" on all
Ruby versions.

Fix: use have_func('rb_backtrace_p') in extconf.rb to detect symbol
availability at compile time. When available, read the bt ivar
directly and convert via rb_backtrace_to_str_ary — no Ruby method
dispatch at all. When unavailable, fall back to calling
Exception#backtrace via an UnboundMethod captured from Exception at
init time, which invokes the original exc_backtrace (error.c)
regardless of subclass overrides.

The bt ivar after raise holds a Thread::Backtrace object. Ruby's
exc_backtrace converts it to Array<String> via rb_backtrace_to_str_ary.
If set via Exception#set_backtrace, bt already holds an Array<String>.

Co-Authored-By: Claude <noreply@anthropic.com>
…ations

Remove all C code for exception backtrace (rb_backtrace_p, have_func
guard, UnboundMethod fallback in di.c). The conversion functions
(rb_backtrace_to_str_ary, rb_backtrace_to_location_ary) are not
exported from libruby.so due to missing RUBY_SYMBOL_EXPORT markers
in internal/vm.h. Reimplementing via private VM headers is correct
but too much work for the gain.

Instead, capture Exception.instance_method(:backtrace_locations) as
an UnboundMethod at load time. bind(exception).call bypasses subclass
overrides — the practical threat model. Does not protect against
monkeypatching Exception itself before dd-trace-rb loads.

Switch from backtrace (Array<String>) to backtrace_locations
(Array<Thread::Backtrace::Location>). DI was regex-parsing the
formatted strings back into path/lineno/label — a pointless
round-trip. Location objects provide these directly.

backtrace_locations available since Ruby 2.6, DI requires 2.6+.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
No wrapper method needed. EXCEPTION_BACKTRACE_LOCATIONS.bind(exc).call
is called directly in probe_notification_builder.rb.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@p-datadog p-datadog changed the title DI: Add exception_backtrace C extension to avoid customer code dispatch DI: Bypass subclass overrides of Exception#backtrace_locations via UnboundMethod Mar 27, 2026
@p-datadog p-datadog marked this pull request as ready for review March 27, 2026 23:26
@p-datadog p-datadog requested review from a team as code owners March 27, 2026 23:26
…_PATTERN

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@datadog-datadog-prod-us1-2
Copy link
Copy Markdown

datadog-datadog-prod-us1-2 bot commented Mar 27, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 95.36% (-0.00%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: e574247 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

p-datadog pushed a commit that referenced this pull request Mar 28, 2026
* origin/di-c-ext-exception-backtrace:
  Fix Steep: update RBS for format_backtrace and remove BACKTRACE_FRAME_PATTERN
  Inline exception_backtrace: use constant directly at call site
  Fix RBS signature: exception_backtrace returns Location not String
  Replace C exception_backtrace with Ruby UnboundMethod + backtrace_locations
  Fix undefined symbol: use have_func to gate rb_backtrace_p
  Fix undefined symbol: use UnboundMethod instead of internal Ruby functions
  Add set_backtrace test and fix formatting in specs
  Fix StandardRB: remove redundant begin blocks
  Fix exception_backtrace to convert Thread::Backtrace to Array<String>
  Add DI.exception_backtrace C extension to avoid customer code dispatch
p-datadog pushed a commit that referenced this pull request Mar 28, 2026
* origin/di-c-ext-exception-backtrace:
  Fix Steep: update RBS for format_backtrace and remove BACKTRACE_FRAME_PATTERN
  Inline exception_backtrace: use constant directly at call site
  Fix RBS signature: exception_backtrace returns Location not String
  Replace C exception_backtrace with Ruby UnboundMethod + backtrace_locations
  Fix undefined symbol: use have_func to gate rb_backtrace_p
  Fix undefined symbol: use UnboundMethod instead of internal Ruby functions
  Add set_backtrace test and fix formatting in specs
  Fix StandardRB: remove redundant begin blocks
  Fix exception_backtrace to convert Thread::Backtrace to Array<String>
  Add DI.exception_backtrace C extension to avoid customer code dispatch
@p-datadog
Copy link
Copy Markdown
Member Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create a Codex account and connect to github.

@p-datadog
Copy link
Copy Markdown
Member Author

@copilot review

@p-datadog p-datadog requested a review from Copilot March 31, 2026 03:32
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR bypasses subclass overrides of Exception#backtrace_locations by capturing an UnboundMethod from the core Exception class at load time. The change switches from parsing backtrace strings via regex to working directly with structured Thread::Backtrace::Location objects.

Changes:

  • Adds DI::EXCEPTION_BACKTRACE_LOCATIONS UnboundMethod constant to bypass subclass overrides
  • Updates format_backtrace to process Location objects directly instead of regex-parsing strings
  • Removes the now-unused BACKTRACE_FRAME_PATTERN regex constant
  • Updates type signatures and adds comprehensive test coverage

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
lib/datadog/di.rb Adds EXCEPTION_BACKTRACE_LOCATIONS UnboundMethod constant
lib/datadog/di/probe_notification_builder.rb Updates backtrace handling to use Location objects and UnboundMethod
sig/datadog/di.rbs Declares EXCEPTION_BACKTRACE_LOCATIONS type
sig/datadog/di/probe_notification_builder.rbs Updates format_backtrace parameter type
spec/datadog/di/ext/exception_backtrace_spec.rb Tests UnboundMethod bypasses subclass overrides
spec/datadog/di/probe_notification_builder_spec.rb Integration test for stacktrace with overridden backtrace method

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@p-datadog
Copy link
Copy Markdown
Member Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 59fe0b52c0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

p-ddsign and others added 7 commits March 31, 2026 09:20
backtrace_locations returns nil when Exception#set_backtrace was called
with Array<String> — the VM cannot reconstruct Location objects from
formatted strings. This happens in exception wrapping patterns.

Add EXCEPTION_BACKTRACE UnboundMethod (same bypass trick) and reinstate
the string-parsing format_backtrace_strings as a fallback path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Exception.instance_method(:backtrace).bind(exception).call returns nil
when the subclass overrides #backtrace — unlike backtrace_locations,
the C implementation of #backtrace does not bypass Ruby-level overrides
via UnboundMethod.

This is acceptable because EXCEPTION_BACKTRACE is only used as a
fallback for the set_backtrace-with-strings case, where no subclass
override is involved. Update the test to document the actual behavior
and add a comment on the constant explaining the limitation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MRI's setup_exception (eval.c) calls rb_get_backtrace during raise,
which checks rb_method_basic_definition_p for #backtrace overrides.
When an override exists, it calls the override, gets a non-nil result,
and skips storing the real VM backtrace in @bt and @bt_locations.
The C function exc_backtrace then reads @bt (still nil from exc_init)
and returns nil.

setup_exception does NOT check for #backtrace_locations overrides —
only #backtrace. So the EXCEPTION_BACKTRACE_LOCATIONS UnboundMethod
works because the real backtrace is always stored in @bt_locations
(unless #backtrace is overridden).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…havior

Documents why UnboundMethod bypasses backtrace_locations overrides but
not backtrace overrides, traced through MRI's setup_exception (eval.c),
rb_get_backtrace (error.c), exc_backtrace, and exc_backtrace_locations.

Includes code examples for every combination of overrides and
set_backtrace, a summary table, and implications for DI's fallback path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- EXCEPTION_BACKTRACE_LOCATIONS: note that overriding #backtrace (not
  just #backtrace_locations) causes @bt_locations to also be nil
- EXCEPTION_BACKTRACE: enumerate all cases — set_backtrace works even
  with override, only gap is override + raise + no set_backtrace
- serialize_throwable: note the unrecoverable case

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos profiling Involves Datadog profiling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants