Skip to content

Conversation

@frenchy64
Copy link
Collaborator

@frenchy64 frenchy64 commented Mar 28, 2025

Close #1166
Close #1173

This tightens up -parser for :and in several ways.

The essential insight is that there are two kinds of parsers, I'm calling transforming (e.g., :orn, -collection-schema) and simple (e.g., :any, -simple-schema). Simple parsers return identical input on success. Everything else is transforming.

In all cases I've seen so far, it's possible to accurately predict whether a parser is simple based on its schema. With this information, we can now improve :and's parser by:

  1. banning more than one transforming parser per :and
  2. running the transforming parser last
  3. running the transforming unparser first

This automatically handles [:and S [:fn ..]] and makes it more robust, as :fn is now passed the input value instead of the parsed value and the conjuncts can be in any order.

Extras:

Adds a new schema :andn for when you really want multiple transforming parsers in a conjunction. It reparses the input for each conjunct and returns in a Tags. Unparser only unparses the leftmost child, which enables users to transform the unparsed results by removing the other results.

We can now more aggressively optimize simple (un)parsers upfront to not build a result when it will be identical to the input.

Includes a fix for #1173 by bumping up the :max-tries for generating distinct vectors.

@frenchy64 frenchy64 marked this pull request as ready for review March 28, 2025 23:23
@frenchy64 frenchy64 changed the title WIP: Robust :and parser, add :andn Robust :and parser, add :andn Mar 28, 2025
@frenchy64 frenchy64 requested review from ikitommi and opqdonut March 28, 2025 23:23
Copy link
Member

@opqdonut opqdonut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions / comments. Not confident enough yet to approve.

(if (-ref-schema? this)
(-parser-info (-deref this))
(when (-> this -parent -type-properties ::simple-parser)
{:simple-parser true}))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the need for both ParserInfo and -type-properties ::simple-parser?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could get away with just ParserInfo. It seemed neater at the time to have it at the type-level for trivial types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all the schemas are supposed to have -parser-info, we feel like the method should be in Schema. Also, we feel like -type-properties ::simple-parser makes the feature harder to understand, so we'd prefer you drop that.

The default impl for ParserInfo is neat, but makes following the logic harder for future maintainers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we feel like -type-properties ::simple-parser makes the feature harder to understand, so we'd prefer you drop that.

Sure, I will remove -type-properties ::simple-parser support. FWIW I was following a similar flexibility to malli.generator/-create, where generators can be provided at the IntoSchema level and overriden by Schema.

Since all the schemas are supposed to have -parser-info, we feel like the method should be in Schema.

I think this would be a mistake. It would introduce a large chance of dependency hell, where you can't upgrade malli without waiting for all 3rd party schemas to also be updated.

For example, (m/parser [:vector ::3rd-party]) would throw an exception because ::3rd-party doesn't implement -parser-info. Having ParserInfo as a separate protocol solves this particular cause of dependency hell, since you can define a default (you can't with Schema, since ::3rd-party likely already implements it directly).

The default (nil) says a parser is transforming, which at worst will undo some parser optimizations introduced in this PR (i.e., preserving the same perf as before). For the extra pedantic, you could conceivably assert (comp some? m/-parser-info) for every schema in your registry to find these schemas.

One particular kind of dependency hell is still possible though: if a library provides a schema that now throws :malli.core/and-schema-multiple-transforming-parsers. Maybe a default global handler could be provided for this case to give control back to the user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't thought about the upgrade path – that's a very good point. Probably best to go with the separate ParserInfo protocol.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll keep ParserInfo as a separate protocol and remove -type-properties ::simple-parser.

@opqdonut
Copy link
Member

opqdonut commented Apr 1, 2025

Letting @ikitommi have a look as well.

@frenchy64
Copy link
Collaborator Author

Thanks for looking @opqdonut.

Copy link
Member

@opqdonut opqdonut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through this in a session with some people from Metosin. We really like the change, and would like to get it in! Thanks for your effort.

Please move -parser-info to Schema and consider the suggestions we had for :parse.

(if (-ref-schema? this)
(-parser-info (-deref this))
(when (-> this -parent -type-properties ::simple-parser)
{:simple-parser true}))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all the schemas are supposed to have -parser-info, we feel like the method should be in Schema. Also, we feel like -type-properties ::simple-parser makes the feature harder to understand, so we'd prefer you drop that.

The default impl for ParserInfo is neat, but makes following the logic harder for future maintainers.

@frenchy64
Copy link
Collaborator Author

Please move -parser-info to Schema

To recap this discussion, -parser-info is in a separate protocol for compatibility reasons, in particular to avoid dependency hell on 3rd party schemas that aren't yet aware of -parser-info.

Copy link
Member

@opqdonut opqdonut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All our comments seem to be addressed, thanks! @mattiuusitalo will do some more testing, and then will merge this and push an alpha release out for comments. This change might break some weird workflows that we don't know about, so good to be a bit more careful.

@mattiuusitalo mattiuusitalo self-requested a review October 10, 2025 06:32
Copy link
Contributor

@mattiuusitalo mattiuusitalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I also did preliminary testing in some projects that depend on this.

@mattiuusitalo mattiuusitalo merged commit ccbaf44 into metosin:master Oct 10, 2025
14 checks passed
@github-project-automation github-project-automation bot moved this from 🐰 Important to ✅ Done in Metosin Open Source Backlog Oct 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done

3 participants