
3.1 improve schema test coverage #4781


Closed
ralfhandl wants to merge 22 commits

Conversation

ralfhandl
Contributor

@ralfhandl ralfhandl commented Jul 15, 2025

This PR is a cross-port of its 3.2 counterpart, bringing schema test coverage to 100% with the new tool https://github.com/hyperjump-io/json-schema-coverage.

It includes the changes of another PR, which can't be merged due to insufficient schema test coverage.

  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request

dependabot bot and others added 20 commits July 9, 2025 07:45
Bumps [@hyperjump/json-schema](https://github.com/hyperjump-io/json-schema) from 1.16.0 to 1.16.1.
- [Commits](hyperjump-io/json-schema@v1.16.0...v1.16.1)

---
updated-dependencies:
- dependency-name: "@hyperjump/json-schema"
  dependency-version: 1.16.1
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
…p/json-schema-1.16.1

Bump @hyperjump/json-schema from 1.16.0 to 1.16.1
More invalid security scheme objects
@ralfhandl ralfhandl marked this pull request as ready for review July 16, 2025 11:57
@ralfhandl ralfhandl requested review from a team as code owners July 16, 2025 11:57
@ralfhandl ralfhandl added approved pr port PRs that just port an approved PR to another version Schema labels Jul 16, 2025
@ralfhandl ralfhandl changed the title 3.1 improve schema coverage 3.1 improve schema test coverage Jul 16, 2025
@ralfhandl ralfhandl marked this pull request as draft July 19, 2025 20:37
@ralfhandl ralfhandl marked this pull request as ready for review July 23, 2025 10:22
@ralfhandl ralfhandl requested review from handrews and a team July 23, 2025 10:23
Member

@handrews handrews left a comment


I'm sorry, but the more I look at this the more I am skeptical that this is the direction we want, and I am extremely skeptical of trying to cram it in right at the end of a release cycle.

I think expanding negative cases needs a lot more discussion about costs and benefits, and it needs to be separated from 3.1.2 and 3.2. What we have now is far better than what we had before, and I am comfortable releasing 3.2 and 3.1.2 with the existing approach. I am just not comfortable with the cost-benefit tradeoff of this approach in this effort right now.

Comment on lines +86 to +87
allowEmptyValue: yes # must be a boolean
allowReserved: no # must be a boolean
Member


Are these supposed to be testing that they are not booleans? Because yes and no are booleans in YAML:

Python 3.10.10 (main, Feb 16 2023, 23:23:55) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from yaml import safe_load as sl
>>> sl("""
... foo: yes
... bar: no
... """)
{'foo': True, 'bar': False}

Is the coverage tool reporting this as covering the negative case for these fields?

Comment on lines +99 to +100
examples: true # must be an object
explode: 42 # must be a boolean
Member


I'm picking an arbitrary line to make this comment: I'm just still not sure what we're trying to accomplish here. This does not, in fact, test that these fields must be an object and a boolean, respectively. It tests that examples cannot be a boolean, and explode cannot be a number/integer.

What are we gaining here? We already test that an object is allowed here. If we want to test that nothing else is allowed, we'd need to do a lot more work. If we are just testing that some random selection of things is not allowed... I question whether the benefit is worth the maintenance cost.

I have added test cases to the negative tests in the past, but I tested things like complex dependencies among fields. AFAICT, there is some measurement of coverage here and we're just throwing cases at the wall based on whatever that metric is and not based on what is actually cost-effective for our needs.
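To make the cost concrete, here is a hypothetical sketch of what exhaustive negative-type coverage for a single boolean field would require: one failing document per wrong JSON type (five, or six if integer is counted separately from number). Each entry below stands in for a separate fail/ test file; none of these are actual test cases from this PR.

```yaml
# Hypothetical negative-type cases for explode; each would be its own file.
- explode: 42              # number
- explode: "true"          # string
- explode: [true]          # array
- explode: {value: true}   # object
- explode: null            # null
```

Multiplied across every typed field in the schema, that is the maintenance surface under discussion.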

@ralfhandl
Contributor Author

ralfhandl commented Aug 7, 2025

I am comfortable releasing 3.2 and 3.1.2 with the existing approach

The existing approach has two drawbacks:

  1. It is custom-built for us and based on an internal API of the schema validation tool we use, so there is no guarantee it will keep working with future versions of that tool.

    • It already broke once in the past due to an incompatible change of the internal API, and we did not notice for weeks.
    • There is now an official schema test coverage tool that we definitely should use instead of the brittle custom approach.
  2. We do not know whether a "fail" test case fails for the intended reason or for some other reason.

    • The v3.2-dev branch currently has two test cases, fail/encoding-enc-*-exclusion.yaml, that fail because their structure is wrong: the deeply nested error of using mutually exclusive keywords side by side, which is supposed to cause the failure, is never reached (see the sketch below).
    • I only noticed because the negative case of "do not use both keywords next to each other" was reported as not covered by the new coverage tool; our custom approach missed this completely.
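To illustrate drawback 2, here is a contrived sketch, not the actual fail/encoding-enc-*-exclusion.yaml content, using style and contentType as a stand-in pair of mutually exclusive Encoding Object fields:

```yaml
# Document 1 fails for the intended reason: the mutually exclusive fields
# sit side by side, so the nested exclusion rule rejects the document.
encoding:
  color:
    style: form
    contentType: text/plain
---
# Document 2 also fails, but for the wrong reason: the outer structure is
# invalid, an outer keyword rejects it, and the nested exclusion rule is
# never evaluated. A plain pass/fail harness cannot tell these apart.
encoding: not-an-object
```

Branch-level coverage catches exactly this: the failing branch of the exclusion rule shows up as unvisited even though the test file does fail.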

@ralfhandl
Contributor Author

Which way forward should we take:

  1. Stick with the brittle custom tool and only fix the two "fail" test cases to make them fail for the right reason.

  2. Switch to the new, supported tool and set a coverage limit well below 100% that is reached with the current set of tests.

  3. Switch to the new tool and bring coverage back up to 100% by adding more "fail" test cases, so that every keyword used in our schema files is exercised in a failing case.

This 3.1 PR and its 3.2 "original" take the third option.

We can use them as a starting point for refactoring the test cases in any future direction while staying at 100%, which allows refactoring in baby steps without losing confidence in our tests.

@mikekistler
Contributor

mikekistler commented Aug 7, 2025

My preference is to take option 2 now and work towards option 3 over time, without making option 3 a requirement for releasing v3.2.

@handrews
Member

handrews commented Aug 7, 2025

@ralfhandl

There is now an official schema test coverage tool that we definitely should use instead of the brittle custom approach.

No one is disputing this. The only question is what tests to add now.

I only noticed because the "negative" case of "do not use both keywords next to each other" was reported as not covered by the new coverage tool, which our custom approach completely missed.

And that is a useful thing. But I am not convinced that what this PR is doing in terms of test additions is as useful, nor am I convinced that the coverage tool is working as you seem to expect. For example, this is not covering the condition it says it is covering, and testing one-sixth to one-fifth of the incorrect type conditions (depending on how you treat integer) is of questionable cost/benefit value to me.

I strongly dislike placating whatever a coverage tool says. Choosing test cases is about cost/benefit tradeoffs, and a coverage tool does not always understand those. In the case of negative type checks, "100%" is not actually 100%; it's at best 20%. And I do not want to make policies about "100% coverage" when our 100% number is a lie. Honestly, I have concerns about some of our positive case coverage metrics as well, but that is a more complex discussion and I was fine to let it be.

As @mikekistler says:

My preference is to take option 2 now and work towards option 3 over time, without making option 3 a requirement for releasing v3.2.

I don't have the energy to figure out what is going on with the cost-benefit issues here or why the coverage claims don't seem to line up with what I'm seeing. And I do not want to delay 3.2 to figure this out. We can ship with the old system (it worked well enough) and then change it, or we can change the system but not start a mandatory totally new test approach. I'm fine with either of those two, but not with requiring a new "100%" in a way that does not seem, to me, to align with actual thought-out test priorities.

@jdesrosiers
Contributor

I'll point out that you have some flexibility in how you enforce coverage requirements. There are three categories that can all be configured separately: "statements" (keywords), "functions" (subschemas), and "branches" (the true/false result of keywords).

The current solution only checks that all keywords ("statements") have been visited. So, if you set only "statements" to 100%, you should get the same enforcement you have now, but you also have the benefit of a maintained tool and nice coverage reports. I believe you could make that change immediately with no disruption or additional tests.

I would suggest also setting "functions" to 100% as that ensures that things like unevaluatedProperties: false are covered, which is missed in the current solution. I think that could be added without controversy.

If "branch" coverage is controversial, you don't have to enforce it at all. It will still be reported and can be used as feedback to point to the gap and you can make a decision whether or not it's worth writing a test for, but it won't fail automated checks.

@ralfhandl ralfhandl closed this Aug 7, 2025
@ralfhandl ralfhandl deleted the 3.1-schema-coverage branch August 7, 2025 22:03
@ralfhandl
Contributor Author

I like @jdesrosiers's proposal.

Unfortunately, we are not yet at 100% statement coverage with the current test cases; see https://github.com/OAI/OpenAPI-Specification/actions/runs/16374451434/job/46270758901?pr=4787.

Who would volunteer to add test cases to reach 100% statement coverage?

@handrews
Member

handrews commented Aug 7, 2025

Thanks @jdesrosiers, that is helpful! I think it's more that I want to understand the branch coverage more deeply rather than being dead-set against it. So "controversial for now, but very much worth revisiting" is probably where I am with it.

@ralfhandl we should continue this discussion somewhere (in a Discussion, perhaps?) and make a collective decision on what we're measuring and what that measurement needs to be. We are doing so much better with the schema than ever before, and I prefer to focus on that, and whether it is sufficient in its own right, rather than on what numbers we aren't hitting. The list of things I think we should be checking before putting out 3.2 (or a patch release) is very long, and we aren't going to do most of them. But we're going to ship anyway, because we're at a point where it is better to ship than not. This is the same sort of call, I think.

@ralfhandl ralfhandl mentioned this pull request Aug 8, 2025