chore(llmobs): dac strip io from OpenAI #13791

jsimpher · 2025-06-26T18:07:24Z

Remove potentially sensitive i/o data from apm spans. This way, prompt and completion data will only appear on the llm obs spans, which are/will be subject to data access controls.

Mostly, this just removes io tag sets. A few things (mostly metrics) have llmobs tags dependent on span tags, so there is a bit more refactoring there.

Let me know if I removed anything that should really stay, or if I missed something that should be restricted.

This one does a lot that the others don't. I've left things like audio transcript and image/file retrieval that we don't duplicate.

Checklist

PR author has checked that all the criteria below are met
The PR description includes an overview of the change
The PR description articulates the motivation for the change
The change includes tests OR the PR description describes a testing strategy
The PR description notes risks associated with the change, if any
Newly-added code is easy to change
The change follows the library release note guidelines
The change includes or references documentation updates if necessary
Backport labels are set (if applicable)

Reviewer Checklist

Reviewer has checked that all the criteria below are met
Title is accurate
All changes are related to the pull request's stated goal
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Newly-added code is easy to change
Release note makes sense to a user of the library
If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

github-actions · 2025-06-26T18:07:52Z

CODEOWNERS have been resolved as:

releasenotes/notes/remove-io-data-from-apm-span-openai-integration-81f3ae914a5d2faf.yaml  @DataDog/apm-python
ddtrace/contrib/internal/openai/_endpoint_hooks.py                      @DataDog/ml-observability
ddtrace/contrib/internal/openai/utils.py                                @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai.py                                  @DataDog/ml-observability
tests/contrib/openai/test_openai_v1.py                                  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_acompletion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_chat_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_embedding.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_function_calling.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_image_input.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_completion.json   @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_completion_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_create_moderation.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding.json    @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_array_of_token_arrays.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_string_array.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_token_array.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_create.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_delete.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_download.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_list.json    @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_retrieve.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_image_b64_json_response.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_image_create.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_misuse.json       @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_delete.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_list.json   @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_retrieve.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response.json     @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_error.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_tools.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_tools_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_span_finish_on_stream_error.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_completion_stream_est_tokens.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_chat_completion_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_completion_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_response_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_async.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-None].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-v0].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-v1].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-None].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-v0].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-v1].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_sync.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_streaming.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_sync.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_with_tool_error.json  @DataDog/ml-observability

github-actions · 2025-06-26T18:29:51Z

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 277 ± 3 ms.

The average import time from base is: 279 ± 3 ms.

The import time difference between this PR and base is: -1.8 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 1.971 ms (0.71%)

ddtrace.bootstrap.sitecustomize 1.297 ms (0.47%)

ddtrace.bootstrap.preload 1.297 ms (0.47%)

ddtrace.internal.remoteconfig.client 0.647 ms (0.23%)

ddtrace 0.673 ms (0.24%)

ddtrace.internal._unpatched 0.030 ms (0.01%)

json 0.030 ms (0.01%)

json.decoder 0.030 ms (0.01%)

re 0.030 ms (0.01%)

enum 0.030 ms (0.01%)

types 0.030 ms (0.01%)

pr-commenter · 2025-06-26T19:01:49Z

Benchmarks

Benchmark execution time: 2025-06-26 21:47:55

Comparing candidate commit 06e2b01 in PR branch jsimpher/dac-strip-io-from-openai with baseline commit 82ca0cf in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 561 metrics, 3 unstable metrics.

datadog-datadog-prod-us1 · 2025-06-26T19:36:48Z

ddtrace/llmobs/_integrations/openai.py

@@ -164,7 +178,7 @@ def _llmobs_set_meta_tags_from_embedding(span: Span, kwargs: Dict[str, Any], res
        span._set_ctx_item(OUTPUT_VALUE, "[{} embedding(s) returned]".format(len(resp.data)))

    @staticmethod
-    def _extract_llmobs_metrics_tags(span: Span, resp: Any, span_kind: str) -> Dict[str, Any]:
+    def _extract_llmobs_metrics_tags(span: Span, resp: Any, span_kind: str) -> Optional[Dict[str, Any]]:


🟠 Code Quality Violation

do not use Any, use a concrete type (...read more)

Use the Any type very carefully. Most of the time, the Any type is used because we do not know exactly what type is being used. If you want to specify that a value can be of any type, use object instead of Any.

Learn More

Python documentation: the Any type

ncybul