The Traversal story

Let me write some thought on the design of optics, and especially
traversals, so that I don't forget again. All this should make its way
into a document 

### Profunctor optics

When designing the optics library, I settled on profunctor optics
rather than a Van Laarhoven encoding for two reasons:
- Profunctor optics compose better (mostly because prisms always
  require some profunctoriness, which the lens library has to work around)
- It makes very clear how to make a `newtype` for `Optic_` that
  everything else is a restriction on.

As I'm writing this, I realise that we could have done `newtype Optic_
p f s t a b = Optical (p a (f b)) -> p s (f t))` and defined, say,
lens as `type Lens s t a b = forall f. Functor f => Optic (FUN 'One) s
t a b`), which counter the second point, but as I'm writing this
counter, another benefit of profunctors is coming back:

- Profunctors abstract over `FUN 'One` vs `(->) = FUN 'Many`.

The latter is used in lens for instance so that `over` can be linear
while `get` and `set` aren't. If we forced `FUN 'One`, then `set` and
`get` would still be definable, but would require additional
allocations, which would be a bit unfortunate.

So, anyway, profunctor it is. Not that it makes the problem of
`Traversal` any harder really, so whatever.

### Traversals

The first difficulty with traversals is that they can be defined in
two ways. Let me define a bunch of classes on profunctors:

```haskell
class Strong (,) () p where
  second :: p a b -> p (c, a) (c, b)
class Strong Either Void p where
  second :: p a b -> p (Either c a) (Either c b)
class Monoidal (,) () p where
  (***) :: p a b -> p c d -> p (a, c) (b, d)
  unit :: p () ()
class (Strong (,) () p, Strong Either Void p) => Wandering p where
  -- | Equivalently but less efficient in general:
  --
  -- > wander' :: Data.Traversable t => p a b -> p (t a) (t b)
  --
  -- It would be less efficient because `wander'` is trivially
  -- implemented as `wander traverse`, implementing `wander` in terms
  -- of `wander'` actually requires allocating some intermediate traversable
  -- structure.
  wander :: forall s t a b. (forall f. Control.Applicative f => (a %1 -> f b) -> s %1 -> f t) -> p a b -> p s t
```

Both `Strong` super-classes are implied by `wander'` since both `(c,)`
and `Either c` are traversable.

#### Dilemma

The first difficulty that we face is that there are two different
definitions for profunctor traversable in the literature.
- `Traversal s t a b = Optic (\p. (Strong (,) () p, Strong Either Void p,
  Monoidal (,) () p))` as in the [2017 profunctor optics paper](https://programming-journal.org/2017/1/7/)
- `Traversal s t a b = Optic (Wandering p)` as in the [Purescript
  profunctor lens library](https://pursuit.purescript.org/packages/purescript-profunctor-lenses)

These two definitions are _equivalent_ (and are equivalent to a bunch
of other definitions not involving profunctors, like the Van Laarhoven
encoding). However, I don't believe that `(Strong (,) () p, Strong
Either Void p, Monoidal (,) () p)` and `Wandering p` are equivalent!
In fact I don't believe either is included in the other (but I don't
have a proof), I'm at least pretty sure that `Wandering p` doesn't
imply `Monoidal (,) () p`.

So it's a true choice. We could even require `(Monoidal (,) () p,
Wandering p)`, or make `Monoidal (,) () p` a super class of `Wandering
p`. There are a lot of possible choices.

The intuitive reason why all of these things work is that the only
profunctor which really matters for traversals is

```haskell
newtype Kleisli f a b = Kleisli (a %1 -> f b)
```

for `f` a control applicative functor. And it has all of the above
instances. So, whatever instances we choose, we can define
`traverseOf` easily and efficiently.

#### Van Laarhoven encodings are great at defining traversals

Where things are more difficult is for defining traversals. Let's give
ourselves an example type.

```haskell
data T
  = MkT1 A A
  | MkT2 A A A
```

Definining a Van-Laarhoven-style traversal is as easy as:

```haskell
traverse f (MkT1 x y) = MkT1 <$> f x <*> f y
traverse f (MkT2 x y z) = MkT2 <$> f x <*> f y <*> f z
```

It's very terse. Very readable. Very systematic. Very exactly what we
want.

If you try defining the same traversal with just `(Strong (,) () p,
Strong Either Void p, Monoidal (,) () p)` you are not going to have
fun. First you will notice that you kind of want `Monoidal Either
Void` as well (which, to be fair, we could throw in no problem).

With `Wandering` we have, instead,

```haskell
traverse t = wander go
  where
    go f (MkT1 x y) = MkT1 <$> f x <*> f y
    go f (MkT2 x y z) = MkT2 <$> f x <*> f y <*> f z
```

It's almost as easy as in the Van Laarhoven case. Not quite perfect,
but quite ok. (note: if we had `wander'` and not `wander` this would
be rather non-fun again)

Which is, I assume, is the reason why Purescript uses this definition.

#### Unrestricted traversals

I lied above. There is a second profunctor of interest:

```haskell
newtype KleisliU f a b = KleisliU (a -> f b)
```

This one lets us define unrestricted traversals:

```haskell
traverseOfU :: Traversal s t a b -> (a -> f b) -> s -> f t
```

We do want to have `traverseOfU` otherwise we have to define two
traversals for every data structure: one unrestricted and one not. And
they are really the same. Which would be quite a shame.

`KleisliU f` is `\p. (Strong (,) () p, Strong Either Void p, Monoidal
(,) () p)`, but I don't believe that it is `Wandering` or, if it is,
it doesn't admit a trivial implementation.

So if we have traversals defined in terms of `(Strong (,) () p, Strong
Either Void p, Monoidal (,) () p)` they can be equally well used in a
linear and unrestricted context. In fact, we may even have,
eventually, a linearity-polymorphic `Kleisli`, and get a polymorphic
traversal out as a mere cast. This is very much not the case with
`Wandering`. This is the reasoning behind #79.

But if don't have `Wandering` how do you build a traversal? Making one
manually by composing profunctor arrows is pretty much out of the
question: it is very cumbersome and allocate a ton as you are forced
to decompose your type into a sequence of `Either` and binary `(,)`
since it's the only language that the profunctor speaks.

There is a generic way to make one from a Van Laarhoven traversal. It
uses a special applicative to reify the traversal.

The simplest presentation of this type is

```haskell
data Batch a b t = Batch [a] ([b] -> t)
  -- invariant: in `Batch a k`, `k` only accepts lists of the same
  -- size as `as`
```

It's an interesting exercise to prove that it is indeed an
applicative. Important functions are

```haskell
pure :: t %1 -> Batch a b t
pure x = Batch [] (\[] -> x)

(<*>) :: Batch a b (x -> t) %1 -> Batch a b x %1 -> Batch a b t
(<*>) = -- exercise

batch :: a %1 -> Batch a b b
batch a = Batch a (\[b] -> b)

fuse :: Batch b b t %1 -> t
fuse (Batch bs k) = k bs
```

With this material, we have:

```haskell
reifyTraversal :: (forall f. Applicative f => (a #-> f b) -> s %1 -> f t) %1 -> s %1 -> Batch a b t
reifyTraversal traverse = traverse @(Batch a b) batch
```

It is then sufficient to know how to traverse `Batch` with a
profunctor, which can be done once and for all.

That being said, we still have to allocate a lot to do a
traversal. And this extra allocation can't be removed by
specialisation, since traversals are usually recursive.

It does seem at this point in my thought process that we have two
definitions: one is good for writing new traversals, the other one is
good for using traversals. That's bad! And I don't quite know what the
best solution is yet.

### Appendix: A possibly more efficient `Batch`

If you have done the exercise above (I'm watching you!), you will have
realised that there are a lot of lists concatenation in this
story. List concatenations are bad.

But, before I propose a way to avoid them let's first make things
worse. Because another potential issue with `Batch` is the presence of
a partial function throughout. It's not really a problem for a type
used only internally, but, at any rate, it can be eliminated. I'll
give the definition that was actually used in #79

```haskell
data Batch a b t
  = Done t
  | More a (Batch a b (b %1 -> t))
```

It's structured as a list of `a` terminated by a function `b %1 -> …
%1 -> t` with the appropriate arity.

There is no partiality here. But everything is horribly slow: even the
`Functor` instance requires a recursion.

By the way, it's some kind of free applicative structure. We can find
the same type in this [sweet blog post about sorting
`Traversable`s](https://elvishjerricco.github.io/2017/03/23/applicative-sorting.html).

At the very least, if we want to reify the traversal through some
applicative functor, none of the functions of interest should be
recursive. So we kind of want to have concatenation built in. Here is
my attempt

```haskell
data Batch a b t
  = Pure t
  | One a (b -> t)
  | Concat (Batch a b x) (Batch a b y) (x %1 -> y %1 -> t)
  
instance Functor (Batch a b) where
  fmap f (Pure t) = Pure $ f t
  fmap f (One a k) = One a $ f . k
  fmap f (Concat l r k) = Concat l r $ \x y -> f (k x y)

instance Applicative (Batch a b) where
  pure = Pure
  bf <*> bx = Concat bf bx ($)
  
batch :: a %1 -> Batch a b b
batch a = One a id

-- Ok this one is recursive, there is little to be done here. Still,
-- it's less bad.
fuse :: Batch b b t %1 -> t
fuse (Pure x) = x
fuse (One a k) = k a
fuse (Concat l r k) = k (fuse l) (fuse r)
```

Honestly though, it's not enough to convince me. I want to read
traversals as some kind of iterator in Rust, or generator in Python,
etc… But here, we still need to reify the recursive structure. So,
basically, a structure which would otherwise be only on the stack is
now duplicated in the heap. It prevents many optimisation, including
inlining with non-recursive traversals, so I'm really not satisfied.

### Bonus thoughts on Traversable-by-data

While I'm at it, let's speak about traversal but with
`Data.Applicative` instead. See also #190.

Basically, in the scope of the Optics library, these are very special
citizen: they don't really compose with anything. A lens composed with
a traversal-by-data is a regular traversal (by control). Same with a
prism. So to be traversable by data you basically need to be a
composition of traversable-by-data.

We will have, just as is the case for the normal traversal, two
definition.

```haskell
class TheOtherWandering p where
  wander :: forall s t a b. (forall f. Data.Applicative f => (a %1 -> f b) -> s %1 -> f t) -> p a b -> p s t
  
type TraversableByData1 = Optic TheOtherWandering
type TraversableByData2 = Optic (\p. (Monoidal (,) () p, Monoidal Either Void p))
```

Notice the absence of `Strong`: traversals-by-data don't compose with
much. I don't think we can avoid `Monoidal Either Void` here, so that
would be an argument to add `Monoidal Either Void` to the constraints
in the regular traversal case if we were to go this way: for
subtyping.

I don't think we need these for the first release, really: they are,
indeed, pretty special. But we may want them eventually
nonetheless. So these were my thoughts on them.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

The Traversal story #220

Profunctor optics

Traversals

Dilemma

Van Laarhoven encodings are great at defining traversals

Unrestricted traversals

Appendix: A possibly more efficient `Batch`

Bonus thoughts on Traversable-by-data

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

The Traversal story #220

Description

Profunctor optics

Traversals

Dilemma

Van Laarhoven encodings are great at defining traversals

Unrestricted traversals

Appendix: A possibly more efficient Batch

Bonus thoughts on Traversable-by-data

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Appendix: A possibly more efficient `Batch`