@mjherzog
Member

This PR rewrites docs/tests.md to cover key definitions, including:

  • test groups
  • test types
  • error handling
  • test output messages

Comment on lines +116 to +117
*We will need to clearly define the difference between an info message and
a warning message.*
Contributor

Do any PURL implementations support info or warning messages? Do any consumers check those messages? Are there meaningful messages to generate? Implementations would need to either store these messages on the parsed PURL or return a wrapper that includes these messages.

Member Author

@mjherzog, Nov 15, 2025

This discussion section is not based on what current PURL implementations do. The proposed set of messages is from #614, which is on hold pending the conclusion of this discussion.
I originally used italics to call out open discussion sections, but that was not clear enough, so I switched to using a blockquote to mark the beginning and end of a discussion section.

validation severity messages:*
- *info: "Informational validation message"*
- *warning: "Warning validation message"*
- *error: "Error validation message"*
Contributor

I don't think it is necessary or beneficial for PURL to specify error message strings to be produced by implementations. It seems potentially reasonable to have an explanation of why the test fails, but specifying the errors produced in this way prohibits or discourages implementations from handling errors in better or more customary ways. Implementations in languages that throw exceptions or return typed results/outcomes should return typed errors, i.e. a syntactically invalid PURL and a PURL that fails package-type-specific validation should result in different types or enum values, so that the consuming software doesn't need to analyze a string to determine what's going on.

It's possible to specify that there is a conversion from however the implementation specifies errors into a string, but this seems more useful for the purposes of this test case than in actual usage.
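
As a rough illustration of the typed-error idea, here is a minimal Python sketch. The names `PurlErrorKind`, `PurlError`, and `check_scheme` are hypothetical and do not come from the PURL spec or any existing implementation; the scheme check is a toy stand-in for real parsing.

```python
from enum import Enum


class PurlErrorKind(Enum):
    """Hypothetical error categories; not defined by the PURL spec."""

    SYNTAX = "syntax"
    TYPE_RULE = "type-rule"


class PurlError(ValueError):
    """Typed error that callers can branch on without reading the message."""

    def __init__(self, kind: PurlErrorKind, detail: str) -> None:
        super().__init__(detail)
        self.kind = kind
        self.detail = detail

    def __str__(self) -> str:
        # The string form exists mainly for logs and test reports.
        return f"{self.kind.value}: {self.detail}"


def check_scheme(purl: str) -> None:
    """Toy check: raise a typed syntax error when the 'pkg:' prefix is missing."""
    if not purl.startswith("pkg:"):
        raise PurlError(PurlErrorKind.SYNTAX, "scheme component is missing")
```

A consumer can then test `err.kind is PurlErrorKind.SYNTAX` directly, while a test harness that wants a message can still call `str(err)`.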

Member Author

The idea of adding these messages was introduced with #614, which is pending the conclusion of this discussion.
The messages are not intended to replace the expected_failure_reason, but to provide additional information about the detected problem.
However, the current approach for expected_failure_reason is pretty generic. For example, the first set of test cases in tests/spec/specification-test.json follows a pattern with a specific description but a generic failure reason, such as:

  • description: "a scheme is always required", and expected_failure_reason: "Should fail to parse a PURL from invalid purl input".
    Wouldn't it be more helpful to have a more specific failure reason, like: "Should fail to parse a PURL with scheme component missing from the PURL input"?

Comment on lines +108 to +109
better message could be:"The name component for a PyPI package is not case
sensitive".*
Contributor

There are special rules. PURL actually has the wrong rules (#262), but the most important part of the provided example test is checking that the underscore is converted into a hyphen.

Member Author

@mjherzog, Nov 15, 2025

Updated to: "The name component for a PyPI package is not case sensitive and requires changing a consecutive dash, underscore or dot character to a single dash".
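
For reference, the rule as worded in the updated message can be sketched in a few lines of Python. `normalize_pypi_name` is a hypothetical helper name, and the sketch follows the wording above rather than any particular PURL implementation, so it may not match what PURL currently specifies (see #262).

```python
import re


def normalize_pypi_name(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' or '.' into one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()
```

Under this sketch an input such as `my_package` becomes `my-package`, which is the underscore-to-hyphen conversion the example test checks.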

input string. See also `/docs/how-parse.md`.
- **roundtrip**: A test to parse an input PURL string and then rebuild it as a
canonical PURL output string.
- **validation**: A test to validate a PURL input string and report severity
Contributor

What is the difference between a validation test and a parsing test? They both parse a PURL and they both check that the expected values are produced.

Member Author

@mjherzog, Nov 15, 2025

The output from a parsing test is the set of decoded PURL components.
The output from a validation test (as currently proposed) is a PURL string.

There are four PURL test types:
- **build**: A test to build a canonical PURL output string from an input of
decoded PURL components. See also `/docs/how-build.md`.
- **parse**: A test to parse decoded components from a canonical PURL
Contributor

Parse tests historically do not need their inputs to be canonical.

Member Author

Agreed; I am not sure why they should be.
Updated description of parse test type to: "A test to parse decoded components from a PURL input string."
from: "A test to parse decoded components from a canonical PURL input string."

There are four PURL test types:
- **build**: A test to build a canonical PURL output string from an input of
decoded PURL components. See also `/docs/how-build.md`.
- **parse**: A test to parse decoded components from a canonical PURL
Contributor

The tests expect something that AFAIK is not specified anywhere. Implementations are expected to canonicalize during parsing and during building. If an implementation does not do this automatically (at least one implementation intentionally doesn't) then the test runner needs to do that or it will fail the tests. It's kind of obvious once you start looking at the test failures, but it would probably be good to specify that the implementation is expected to do that.

Member Author

@mjherzog, Nov 15, 2025

Added:
"Any PURL implementation tool is expected to canonicalize a PURL string or PURL components during a parsing or building operation." at the end of the intro section.
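
To make the expectation concrete, here is a hedged sketch of how a test runner might wrap an implementation that does not canonicalize on its own. Everything here is an assumption for illustration: `raw_parse`, `canonicalize`, and `parse_for_tests` are hypothetical names, the parser only understands `pkg:type/name`, and the only canonicalization shown is lowercasing the type.

```python
from typing import Callable, Dict

Components = Dict[str, str]


def raw_parse(purl: str) -> Components:
    """Hypothetical non-canonicalizing parser for 'pkg:type/name' strings only."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg" or "/" not in rest:
        raise ValueError("not a PURL this toy parser understands")
    purl_type, _, name = rest.partition("/")
    return {"type": purl_type, "name": name}


def canonicalize(components: Components) -> Components:
    """Toy canonicalization: lowercase the type only (real rules are broader)."""
    return {**components, "type": components["type"].lower()}


def parse_for_tests(
    purl: str, parse: Callable[[str], Components] = raw_parse
) -> Components:
    """Adapter a test runner could use when the implementation does not canonicalize."""
    return canonicalize(parse(purl))
```

With this adapter, `raw_parse("pkg:NPM/foo")` keeps the type as `NPM`, while `parse_for_tests("pkg:NPM/foo")` yields the lowercased type that a parse test would compare against.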

@mjherzog
Member Author

@matt-phylum Thank you for your comments - see my replies above. I am still learning how to use GH PRs and Issues to manage a discussion document (instead of a gdoc with comments).
