Skip to content

feat: JSONify variant types when flattening records#3416

Draft
edgarrmondragon wants to merge 3 commits into
mainfrom
jsonsify-flattened-unknown-types
Draft

feat: JSONify variant types when flattening records#3416
edgarrmondragon wants to merge 3 commits into
mainfrom
jsonsify-flattened-unknown-types

Conversation

@edgarrmondragon
Copy link
Copy Markdown
Collaborator

@edgarrmondragon edgarrmondragon commented Dec 16, 2025

Summary by Sourcery

Coerce variant and typeless schema properties to JSON-serialized strings during flattening and ensure record flattening respects these markers.

New Features:

  • Mark combined (oneOf/anyOf/allOf) and typeless schema properties with an x-json-serialize flag and coerce them to nullable string type when flattening schemas.

Enhancements:

  • Update record flattening logic to JSON-encode values based on the x-json-serialize schema marker and to correctly use flattened property keys when deciding serialization.

Tests:

  • Expand flattening tests to cover combinator schemas, edge cases, and JSON serialization behavior for both simple and complex values, including end-to-end scenarios.

Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Dec 16, 2025

Reviewer's Guide

Adjusts schema flattening to coerce typeless and combined-schema fields (oneOf/anyOf/allOf) to JSON-serialized strings, and updates record flattening logic and tests to rely on an x-json-serialize marker for consistent JSON encoding of flattened values.

Flow diagram for schema flattening with x-json-serialize markers

flowchart TD
    A[Start flatten_schema] --> B[Call _flatten_schema with schema_node]
    B --> C[Iterate over field_schema entries]
    C --> D{field_schema has type key}
    D --> E[Normalize type to list]
    E --> F[Append new_key and field_schema or coerced type to items]
    F --> G[Next field]

    D --> H[No type key]
    H --> I[Handle oneOf anyOf allOf or typeless]
    I --> J[Append new_key type null or string x-json-serialize True to items]
    J --> G

    G --> K{More fields?}
    K --> C
    K --> L[Sort items and check duplicates]
    L --> M[Build flattened_schema.properties from items]
    M --> N[Return flattened_schema]
    N --> O[End]

    subgraph Legend
      L1[Fields with explicit type] --> L2[Flatten normally]
      L3[oneOf anyOf allOf or typeless fields] --> L4[Marked x-json-serialize True]
    end
Loading

File-Level Changes

Change Details Files
Coerce typeless and combined-schema properties to string with x-json-serialize for JSON encoding during schema flattening.
  • Update flatten_schema/_flatten_schema docstrings to document coercion of combined schemas and typeless properties to JSON-serialized strings.
  • Remove special-casing for oneOf/anyOf/allOf based on first element type and instead treat any such or typeless properties as nullable strings with x-json-serialize set to true.
  • Ensure flattened schemas now carry x-json-serialize on affected properties so downstream record flattening can detect fields needing JSON encoding.
singer_sdk/helpers/_flattening.py
Base JSON serialization decisions in record flattening on flattened_schema.properties and new x-json-serialize flag, and fix key used for lookup.
  • Change _flatten_record to pass the fully-qualified flattened key (new_key) into _should_jsondump_value instead of the original key.
  • Redefine _should_jsondump_value to always JSON-encode dict/list values and otherwise consult flattened_schema['properties'][key]['x-json-serialize'] when present.
  • Remove previous type-set-based JSON-dump heuristic that looked for object/array types directly on the flattened_schema root.
singer_sdk/helpers/_flattening.py
Update and extend tests to assert JSON serialization behavior for typeless and combined-schema fields in both schema and record flattening.
  • Adjust test_flatten_record_with_typeless_property_values expectations so both complex and simple typeless values are JSON-serialized (including simple strings).
  • Modify test_flatten_combined_schemas to expect nullable string types with x-json-serialize on name, address, phones, and id properties.
  • Add multiple new tests covering combined schemas: coercion of combinators to string type with x-json-serialize, record flattening behavior for combinator fields with various primitive values, end-to-end nested combinator scenarios, and edge cases such as empty combinators, missing types, and null values.
  • Assert that null values for x-json-serialize fields are JSON-encoded as the string 'null'.
tests/core/test_flattening.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.01%. Comparing base (5006cb5) to head (759c47f).

Files with missing lines Patch % Lines
singer_sdk/helpers/_flattening.py 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3416      +/-   ##
==========================================
- Coverage   94.07%   94.01%   -0.07%     
==========================================
  Files          69       69              
  Lines        5777     5769       -8     
  Branches      716      713       -3     
==========================================
- Hits         5435     5424      -11     
- Misses        239      241       +2     
- Partials      103      104       +1     
Flag Coverage Δ
core 81.83% <60.00%> (-0.08%) ⬇️
end-to-end 76.52% <20.00%> (+0.10%) ⬆️
optional-components 43.56% <20.00%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Dec 16, 2025

CodSpeed Performance Report

Merging #3416 will not alter performance

Comparing jsonsify-flattened-unknown-types (759c47f) with main (ce5e41d)1

Summary

✅ 8 untouched

Footnotes

  1. No successful run was found on main (5006cb5) during the generation of this report, so ce5e41d was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@edgarrmondragon edgarrmondragon self-assigned this Dec 16, 2025
Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant