v3.??? Data vs Serialized Examples in Example Objects #4647

Closed
wants to merge 8 commits into from

Conversation

handrews (Member) commented Jun 2, 2025

This is most of what we would do for the first and most essential of three Example Object related proposals I went over in the last TDC call. I'm posting it because:

  1. I cannot remember the last time I learned this much writing a PR. Using these new example fields helped me find errors in the examples and think through the spec and user implications more clearly. I think this shows the tremendous value these changes would bring compared to the confusing and inconsistently supported status quo.
  2. This change fits the 3.2 theme of better data modeling and serialization. Without these changes, the other changes we are adding are harder to understand and use correctly. With these changes, most complex modeling and serialization scenarios can now be shown clearly (the rest require the other two parts of this overall proposal, but we could take just this part and still add tremendous value).

Please look through this and particularly focus on how this allowed me to change our in-spec examples to make many things more clear, and show how OAD authors can similarly make their intent clear.

Note that the changes to the XML examples don't (I think) assume the nodeType change in PR #4592, but I might have lost track of that, so let me know if you see something strange there. The point is that this helps tremendously in showing expected XML behavior.

  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request

@handrews added the labels "param serialization" (Issues related to parameter and/or header serialization), "media and encoding" (Issues regarding media type support and how to encode data, outside of query/path params), and "example obj/keywords" (Issues with the Example Object or example(s) keywords) on Jun 2, 2025
handrews (Member Author) commented Jun 4, 2025

@mikekistler I fixed that example. It had been done with integers, but those aren't interesting serialization cases, so I switched to strings.

I also added some text about it being easier/less ambiguous to parse the serialized example and validate the data in many cases than to try to go the other way around.

handrews (Member Author) commented Jun 4, 2025

@mikekistler I have clarified the serialization rules in the latest added commit.

mikekistler (Contributor) left a comment

This will help clarify what examples are and how they should be used.

My comments are intended to clarify some points and also to make this change a more evolutionary transition.

examples:
  oneMinute:
    dataValue: 60
    serializedValue: 'X-Rate-Limit-Reset: 60'

Contributor

I love that you have added all these examples!

Member Author

Honestly it was one of the best things about writing the PR and is what convinced me it's worth making a strong case for it.


##### Fixed Fields

| Field Name | Type | Description |
| ---- | :----: | ---- |
| <a name="example-summary"></a>summary | `string` | Short description for the example. |
| <a name="example-description"></a>description | `string` | Long description for the example. [CommonMark syntax](https://spec.commonmark.org/) MAY be used for rich text representation. |
| <a name="example-value"></a>value | Any | Embedded literal example. The `value` field and `externalValue` field are mutually exclusive. To represent examples of media types that cannot naturally represented in JSON or YAML, use a string value to contain the example, escaping where necessary. |
| <a name="example-external-value"></a>externalValue | `string` | A URI that identifies the literal example. This provides the capability to reference examples that cannot easily be included in JSON or YAML documents. The `value` field and `externalValue` field are mutually exclusive. See the rules for resolving [Relative References](#relative-references-in-api-description-uris). |
| <a name="example-data-value"></a>dataValue | Any | An example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). If this field is present, `externalDataValue`, `value`, and `externalValue` MUST be absent. |

Contributor

Since JSON Schemas validate "instances", should we use "instance" instead of "data structure"?

handrews (Member Author), Jun 5, 2025

I couldn't decide which language was better. It's easy to tweak at any point, so I'll think about it as I do the other updates. [EDIT: I decided "instance" was too JSON Schema jargon-y, plus we go outside of the JSON Schema instance model with things like raw binary data.]

src/oas.md Outdated

##### Example Object Examples
Historically, the Example Object's `value` and `externalValue` field and the non-Schema Object singular `example` fields were intended to show examples of the serialized form, while allowing JSON or YAML examples to be included inline rather than as serialized strings.

Contributor

I can't speak to the "original intent" for these fields -- I wasn't involved at the time. But I wonder if this was really consciously decided as the intent, and then somehow omitted from the spec, or if instead it was not really understood that there is a difference between the data value and serialized value and that is why the distinction was not included in the spec.

Member Author

@mikekistler I understand that tooling vendors have implemented it differently, but these quotes from 3.0.3 seem pretty clear to me, and we've debated this before and always agreed that the encoded/serialized form was the intent. AFAIK, this effort by you right now is the first time anyone deeply involved has taken the opposite position in terms of what the wording means:

From the Parameter Object:

When example or examples are provided in conjunction with the schema object, the example MUST follow the prescribed serialization strategy for the parameter.

From the Media Type Object example field:

The example object SHOULD be in the correct format as specified by the media type.

From the Media Type Object examples field:

Examples of the media type. Each example object SHOULD match the media type and specified schema if present.

From the Example Object value field:

To represent examples of media types that cannot naturally represented in JSON or YAML, use a string value to contain the example, escaping where necessary.

From the Example Object externalValue field:

This provides the capability to reference examples that cannot easily be included in JSON or YAML documents.

From the Example Object after the fixed fields table:

In all cases, the example value is expected to be compatible with the type schema of its associated value. Tooling implementations MAY choose to validate compatibility automatically, and reject the example value(s) if incompatible.

I honestly don't see how all of this above, taken together, can be interpreted in any other way.

Comment on lines +2310 to +2314
{
  "author": "A. Writer",
  "title": "An Older Book",
  "rating": 4.5
}

Contributor

Should this serialized value be a string, with condensed whitespace (as was done above)?

Member Author

@mikekistler it is a string, that's what the | at the end of the serializedValue: | does (YAML block literal).

As for minimizing, I did that for the parameter because it has to be shoved into the query string. This is presumed to be for a body, so it might be sent without that minimizing. But TBH I was just trying out all sorts of different things, I'm not super-attached to this.
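
A minimal sketch of the block-literal point, reusing the book example quoted above. The example name `book` and the indentation are assumptions, since the diff context shown here lost its layout. The `|` indicator makes everything indented beneath it a single YAML string, so `serializedValue` holds JSON text while `dataValue` holds the structure itself:

```yaml
examples:
  book:
    # The data as a native YAML/JSON structure, checked against the schema.
    dataValue:
      author: A. Writer
      title: An Older Book
      rating: 4.5
    # The same data as a string: "|" starts a block literal, so the
    # line breaks and indentation inside the JSON text are preserved.
    serializedValue: |
      {
        "author": "A. Writer",
        "title": "An Older Book",
        "rating": 4.5
      }
```

For a value that has to fit in a query string, the serialized form would more likely be written on one line with the whitespace removed, as discussed above.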

handrews (Member Author) left a comment

I'll work on an update hopefully tonight, or at least before the meeting tomorrow morning.

handrews (Member Author) commented Jun 5, 2025

@mikekistler I have updated quite a few things, and left other things the same, with the upshot being that this PR is only about adding new fields to the Example Object and showing their use:

  • I have un-deprecated all of the fields and done my best to retain the prior language around them. Some language moved from just after the bit about x- extensions to right at the top of the "Working With Examples" section, because otherwise it was just too awkward.
  • This means keeping the "override" language. We can talk about changing that separately, as it is completely independent of whether we add new fields.
  • This also means not providing new guidance on how to interpret the old fields, as that needs to be done whether we add new fields or not.
  • I decided against using "instance" as that is too jargon-y, and more a JSON Schema thing, and arguably we work with data beyond the JSON Schema instance model anyway (e.g. raw binary). If we decide to add the fields we can debate the language, but let's decide whether the fields are worth adding first, please.
  • I did fix the XML examples and various other small things.
  • I'm probably forgetting something. I'll check again in the morning.

lornajane (Contributor) commented

There's a lot of support for this change and especially for the examples that are added. We need to discuss naming before merging, but we think this can go into 3.2.

Co-authored-by: Dan Hudlow
handrews (Member Author) commented Jun 5, 2025

Comment on lines +1090 to +1093
examples:
  number:
    dataValue: 12345678
    serializedValue: "token: 12345678"

Member

Can we add a second example, demonstrating the use of an array coming from deserializing the simple style?

Suggested change

Current:
examples:
  number:
    dataValue: 12345678
    serializedValue: "token: 12345678"

Proposed:
examples:
  one number:
    dataValue: 12345678
    serializedValue: "token: 12345678"
  two numbers:
    dataValue:
      - 12345678
      - 99999999
    serializedValue: "token: 12345678,99999999"

Member Author

@karenetheridge as noted in the past TDC call, I'd like to handle newly added examples separately, otherwise we'll never finish.

Member Author

@karenetheridge upon further thought, I'm going to split this PR up to separate the new fields from all of the extra example updates. I'll take this request into consideration then. I think that might be better than either allowing endless new example requests here or trying to keep them out of this process entirely.

# JSON Text Sequences require an unprintable character
# that cannot be escaped in a YAML string, and therefore
# must be placed in an external document shown below
externalValue: examples/log.json-seq
externalSerializedValue: examples/log.json-seq

karenetheridge (Member), Jun 8, 2025

It would be really cool if we had the html renderer detect this keyword and add a hyperlink to a real file named "./examples/log.json-seq" in the github-pages repo.

Co-authored-by: Karen Etheridge

##### Fixed Fields

| Field Name | Type | Description |
| ---- | :----: | ---- |
| <a name="example-summary"></a>summary | `string` | Short description for the example. |
| <a name="example-description"></a>description | `string` | Long description for the example. [CommonMark syntax](https://spec.commonmark.org/) MAY be used for rich text representation. |
| <a name="example-data-value"></a>dataValue | Any | An example of the data structure that MUST be valid according to the relevant [Schema Object](#schema-object). If this field is present, `externalDataValue`, `value`, and `externalValue` MUST be absent. |
| <a name="example-external-data-value"></a>externalDataValue | `string` | A URI that identifies the data example in a separate document, allowing for values not easily expressed in JSON or YAML. This is usually only needed when working with binary data. The value MUST be valid according to the relevant Schema Object. If this field is present, then `dataValue`, `value`, and `externalValue` MUST be absent. See also the rules for resolving [Relative URI References](#relative-references-in-api-description-uris). |

Aside from some clarifications that could be postponed to another PR, my big concern is that I can't make heads or tails out of what a non-serialized external value looks like. In my mind, external values are inherently serialized: if you can provide externalDataValue: ./image.png and the image is treated opaquely then it implies that when you provide externalDataValue: ./data.json then the JSON document will be treated opaquely, and this would semantically match dataValue: "{ ... }" not dataValue: { ... }.

If you wanted the latter then the phrase "allowing for values not easily expressed in JSON or YAML" doesn't seem right, plus if all you want is to move a JSON data structure to an external URI, you can just use $ref at the usage point for the example value? Yes, that means the whole Example Object is externalized but... that seems fine?

It is my belief that we should merely clarify that externalValue is treated opaquely (i.e., as a serialized value) and skip adding externalDataValue and externalSerializedValue.

Member

externalDataValue acts just like the $ref keyword -- the serialization level of the file is the same as the referencing document, but it's in a separate file solely because of its size (presumably), or to perhaps allow it to be modified at a different cadence than the referencing document itself (although that makes less sense here when the example needs to align with the schema).
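
To make the reading in the comment above concrete, here is a hedged sketch of what it would imply. The example names and the `examples/book.yaml` path are hypothetical, and whether `externalDataValue` should behave this way is exactly what this thread is debating:

```yaml
examples:
  inlineBook:
    # The data structure written directly in the description.
    dataValue:
      author: A. Writer
      title: An Older Book
      rating: 4.5
  externalBook:
    # Under this reading, the referenced file would contain the same
    # structure that appears inline above, only stored in a separate
    # document (hypothetical path).
    externalDataValue: examples/book.yaml
```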

I would imagine it would be the least-used of the four new keywords, but I see no reason to omit it.

hudlow, Jun 9, 2025

@karenetheridge so you disagree with the text as written in this table and with the example on line 2328? If it works like this, why not just allow dataValue to be a reference?

Member Author

Because then you can't have an example that uses a property named $ref. This has been discussed extensively in past issues; search for an issue about allowing examples to be combined.
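
A small sketch of the collision being described, assuming a hypothetical example named `linkTarget` whose data legitimately contains a property called `$ref`. If `dataValue` itself were allowed to be a reference, tooling could not tell this literal data apart from a reference to resolve:

```yaml
examples:
  linkTarget:
    dataValue:
      # Literal example data: "$ref" is just a property name here,
      # not an instruction to load another document.
      $ref: '#/components/schemas/Book'
      note: This object is the example itself
```

Keeping the external location in its own field avoids that ambiguity.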

}
```

###### Binary Examples

I think it might also be nice to show a data-URL-scheme example. My assumption is that what URL schemes an interpreter of an OpenAPI description supports are completely out of scope, but there's no reason one couldn't support the data scheme?

Something like:

externalSerializedValue: "_xhBQAAADhlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAAqACAAQAAAABAAAAAqADAAQAAAABAAAAAgAAAADO0J6QAAAAEElEQVQIHWP8zwACTGCSAQANHQEDqtPptQAAAABJRU5ErkJggg%3D%3D"

Member Author

@hudlow can you please file an issue requesting that the OAS support this for external example fields (regardless of which external fields we end up having)? I think it is an interesting idea worth discussing but orthogonal to this PR.


Done: #4674

Co-authored-by: Dan Hudlow
handrews (Member Author) commented Jun 9, 2025

I am closing this in favor of the following PRs, because this is too big and the conversation has a lot of complex threads that have already been resolved.

@handrews handrews closed this Jun 9, 2025
5 participants