Skip to content

Dupable wars, episode IV: a new plan #343

@aspiwack

Description

@aspiwack

Background

I've been insisting, in the past few years, that the Dupable type class ought to:

  • Support instance being defined in a Data.Applicative style
  • Support generating n-tuples for any n.

To this effect, we've been using dupV :: a %1 -> V n a as our main type class method.

We realised long ago that the Data.Applicative instance for dupV is not super efficient, as it allocates intermediate arrays for each use of (<*>). Which I have proposed in the past should be solved by using pull arrays instead (#208) and deducing dupV from the pull-array based duplication function.

Another proposal which has been floated is to specialised to V 2, which would be sufficient (#151 (comment)). I find my explanation, in this comment, that V 2 is insufficient to be lacking: I can defined dupV in term of dup2 (equivalently dupV2). I didn't explain why it is not a great idea.

Movable -> Dupable

A typical instance of Dupable will deep-copy the data type. When I defined dupV @n in terms of dup2, I need to do n-1 of these copies. This is often ok, but if I have a Movable instance, I can do a single copy:

dupV x = case move x of
  Ur x' -> pure x')

It would be quite a pity to forbid ourselves from using this when we can.

A streaming hot approach

I was thinking about the choice between non-mutable and mutable array for the implementation of V (#312, #341). And was starting to be a bit dismayed by the seemingly untamable zoology of twisted little data structures all alike. How can we keep the complexity low? Do we need a special, scoped, dup function for the mutable case (#341 (comment)).

Then I realised that I had made a pretty incorrect assumption there: that the only applicable multi-ary Data.Applicative was V. This is very not true: there are also (pull-polarity) streams. This would look something like this:

data Stream a where
  Stream { state :: x, next :: x %1 -> (a, x), final :: x %1 -> a}

instance Data.Applicative Stream where
  pure a = Stream
    { state = Ur a
    , next = (\(Ur x) -> (x, Ur x))
    , final = unur }
  (fs <*> xs) = Stream
    { state = (state fs, state xs)
    , next = (\(sf, sx) -> case (next sf, next sx) of 
                           ((f, sf'), (x, sx')) -> (f x, (sf', sx')))
    , final = (\(sf, sx) -> final sf (final sx))}

(Stream is a variant of AffineStream (Of a) Identity a from the streaming sublibrary with a single occurrence of a (to be able to define instances), Of is linear in a, and which (crucially) cannot decide to end on its own)

We could define dupS :: a %1 -> Stream a as our main method for the Dupable type class. Note how pure gives us the move-based implementation.

We could then derive functions like dupV (or the mutable equivalent) as a derived concept. More generally, we can define

elim :: KnownNat n => Fun n a b -> Stream a %1 -> b

(though the question of doing this efficiently is probably as tricky as in the case of V.elim, but it does give us elimination into V, and into tuples for instance).

This gives us a much more comprehensible Dupable type class too! And it becomes the responsibility of the various data types that we provide to provide a good interface to Dupable. Much more manageable.

A moving touch

There is still something unfortunate though (which also holds the current dupV-based abstraction): if I use <*> on pure streams, then I lose the only-one-deep-copy property. Let's look at the instance for lists:

instance Dupable a => Dupable [a] where
  dupS [] = pure []
  dupS (a:as) = (:) <&> dupS a <*> dupS as

If a is movable, then [a] is movable too. However, if each of the a-s is indeed deep-copied only once, the spine of the list is copied n times 🙁 .

I think that the right solution is to do some dynamic dispatch:

data BetterStream a where
  Moved :: a -> BetterStream a
  Sequential :: Stream a %1 -> BetterStream a

instance Applicative BetterStream a where
  pure = Moved
  
  (Moved f) <*> (Moved x) = Moved (f x)
  fs <*> xs = (toStream fs) <*> (toStream xs)
  
toStream :: BetterStream a %1 -> Stream a
toStream (Moved x) = pure x
toStream (Sequential xs) = xs

We can similary defined elim for this type.

Conclusion (kind of)

I don't feel super great to introduce two extra stream types on top of the existing two. I suppose Stream may want to stay internal while BetterStream is exposed (with a less silly name). A type of stream defined solely to be the output of dupS still does feel a tad unsatisfactory. Still, it's probably the best thing to do. I do wonder though if we don't want BetterStream to expose its constructors, so that V (and its mutable variant) can take advantage of it for a slightly faster build in the Moved case. Or maybe we have a clever elim' function with a type like

elim' :: (a -> b) -> (Fun n a b) -> BetterStream a %1 -> b

This would avoid displaying one additional stream types. I don't quite know yet.

Anyway, this does look like the right way forward, doesn't it?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions