Skip to content

Conversation

mmottl
Copy link

@mmottl mmottl commented Oct 4, 2016

This branch implements changes to the union-find implementation to make it more efficient by removing indirections, thus eliminating extra heap blocks and pointer dereferencing. This is achieved by combining GADTs and inline records. This trick can likely be used in many other contexts.

There are no API changes. The changes have been tested, but should also be tested with the internal Jane Street test suite.

There is currently a small bug in the beta 2 of OCaml 4.04 that prevents me from implementing the type definitions in a more straightforward way. The workaround using a type variable is only mildly more complicated. Note that OCaml 4.04, which can unbox single tag constructors, is required to see efficiency gains.

The algorithm now also does not use tail recursion in the compression loop anymore to avoid expensive list allocations. The worst case recursion depth cannot realistically blow the stack anyway due to the O(log N) bound.

@ghost ghost assigned trefis Oct 6, 2016
Copy link
Contributor

@trefis trefis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current state the test module doesn't compile. Would you mind fixing your PR so we can start testing it internally?

Thanks!

| Inner t' as node -> compress t' ~inner_node:node ~inner:t ~descendants:[]
let compress_inner (Inner inner as repr) = compress ~repr inner.parent

let root t = snd (representative t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed root without changing all its use sites (eg. all the ones in test module are left untouched).


let is_compressed t =
let is_compressed (Inner t) =
invariant t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't typecheck, you can't just get t and pass it to an arbitrary function, the compiler will reject this as the type might escape.
In any case invariant expects a tree anyway so I'm guessing you want:

  let is_compressed (Inner t as t') =
    invariant t';
    ...

let is_compressed (Inner t) =
invariant t;
match t.node with
match t.parent with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following pattern matching doesn't typecheck.
You get the following error on the Inner case:

File "union_find.ml", line 123, characters 6-13:
Error: This pattern matches values of type ('a, [ `inner ], 'b) tree
       but a pattern was expected which matches values of type
         ('a, [ `root ], 'c) tree

@mmottl
Copy link
Author

mmottl commented Oct 6, 2016

It was a little messy testing the test suite without having support for that, but the patch I just pushed should probably help. I hope the public Core_kernel will soon support building and running test suites.

@yminsky
Copy link

yminsky commented Oct 10, 2020

Ping. This also seems like a good improvement that we should upstream. CC @bcc32 @rnml @ceastlund

@github-iron github-iron added the forwarded-to-js-devs This report has been forwarded to Jane Street's internal review system. label Nov 11, 2020
@mdelvecchio-jsc
Copy link

I implemented this on our most recent version of [Core.Union_find], and ran the current version vs the proposed version against some benchmarks. In particular, our benchmarks include calling each of the functions in the interface on single nodes, as well as on multiple nodes in unions of 50 or 100 nodes to get some variation in path compression complexity.

In general, it seems the new implementation performs better on small cases and cases which don't require traversing parents (~20%-40% speed improvement), while performing worse in cases which do require traversing parents (~50%-80% speed penalty). The new implementation does reduce allocations by 2 words per node (7 -> 5), but I don't think the improvements are worth the penalties.

Do you have your own benchmarks which suggest this implementation will have better performance in cases we may have missed?

@mmottl
Copy link
Author

mmottl commented Apr 14, 2023

It has been a long time since I opened this PR, and the compiler and runtime have gone through significant changes. As usual, your most frequent use cases will determine which implementation will work better for you. I'm a little surprised why traversing parents would be slower with this representation. I don't see any obvious reason why it should, but haven't looked into this in years. It may be worthwhile investigating why this case appears slower. Maybe a small modification of the code could improve performance for that case. Sometimes even simple things like changing the order of variants in the type or of match cases might make a difference. The representation would otherwise seem quite optimal to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
forwarded-to-js-devs This report has been forwarded to Jane Street's internal review system.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants