Implement a novel consistent hashing algorithm with replication #80
Open

aneubeck wants to merge 21 commits into main from aneubeck/sampling
Commits
0d83453 core implementation (aneubeck)
89f8ad4 Update README.md (aneubeck)
b03f0b7 Apply suggestion from @Copilot (aneubeck)
0dbf1c9 finish proof (aneubeck)
953d9bd Merge branch 'aneubeck/sampling' of https://github.com/github/rust-ge… (aneubeck)
a5eb91e Update README.md (aneubeck)
220624d Update README.md (aneubeck)
fc69a9e Update README.md (aneubeck)
90259e9 Update README.md (aneubeck)
8480ea3 Replace key with hasher traits (aneubeck)
0baaafc Update lib.rs (aneubeck)
0dcb137 Update README.md (aneubeck)
d4b8410 Update crates/consistent-hashing/README.md (aneubeck)
0935ea0 Update crates/consistent-hashing/README.md (aneubeck)
99c69f3 Update crates/consistent-hashing/README.md (aneubeck)
496f539 Update crates/consistent-hashing/README.md (aneubeck)
23f3080 add benchmark (aneubeck)
5d52237 remove second vector (aneubeck)
f6e29f7 Update README.md (aneubeck)
d20f9b6 Update performance.rs (aneubeck)
1dde97c make linter happy (aneubeck)

Cargo.toml
@@ -0,0 +1,17 @@
[package]
name = "consistent-hashing"
version = "0.1.0"
edition = "2021"
description = "Constant time consistent hashing algorithms."
repository = "https://github.com/github/rust-gems"
license = "MIT"
keywords = ["probabilistic", "algorithm", "consistent hashing", "jump hashing", "rendezvous hashing"]
categories = ["algorithms", "data-structures", "mathematics", "science"]

[lib]
crate-type = ["lib", "staticlib"]
bench = false

[dependencies]

[dev-dependencies]

crates/consistent-hashing/README.md
@@ -0,0 +1,60 @@
# Consistent Hashing

Consistent hashing maps keys to a changing set of nodes (shards, servers) so that when nodes join or leave, only a small fraction of keys move. It is used in distributed caches, databases, object stores, and load balancers to achieve scalability and high availability with minimal data reshuffling.
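
To see why naive modulo placement does not have this property, here is a small, hypothetical sketch (not part of this crate) that counts how many keys change owner when one node is added under `owner = key % n`; a consistent hash aims to move only about `1/(n+1)` of the keys instead:

```rust
fn main() {
    let (n, keys) = (10u64, 10_000u64);
    // Naive placement: owner = key % n. Adding one node remaps almost every key...
    let moved = (0..keys).filter(|key| key % n != key % (n + 1)).count();
    // ...while an ideal consistent hash would move only keys/(n+1), i.e. about 909 keys here.
    println!("{moved} of {keys} keys changed owner"); // prints: 9090 of 10000 keys changed owner
}
```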

Common algorithms
- [Consistent hashing](https://en.wikipedia.org/wiki/Consistent_hashing) (hash ring with virtual nodes)
- [Rendezvous hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing)
- [Jump consistent hash](https://en.wikipedia.org/wiki/Jump_consistent_hash)
- [Maglev hashing](https://research.google/pubs/pub44824)
- [AnchorHash: A Scalable Consistent Hash](https://arxiv.org/abs/1812.09674)
- [DXHash](https://arxiv.org/abs/2107.07930)
- [JumpBackHash](https://arxiv.org/abs/2403.18682)

## Complexity summary

In the table below, `N` is the number of nodes and `R` is the number of replicas.

| Algorithm | Lookup per key | Node add/remove | Memory | Replication support |
|-------------------------|----------------------|----------------------------------------|---------------------------|--------------------------------------------------|
| Hash ring (with vnodes) | O(log N) binary search over N points; O(1) with specialized structures | O(log N) to insert/remove points | O(N) points | Yes: take next R distinct successors; O(log N + R) |
| Rendezvous | O(N) score per node; top-1 | O(1) (no state to rebalance) | O(N) node list | Yes: pick top R scores; O(N log R) |
| Jump consistent hash | O(log N) | O(1) | O(1) | Not native |
| AnchorHash | O(1) expected | O(1) expected/amortized | O(N) | Not native |
| DXHash | O(1) expected | O(1) expected | O(N) | Not native |
| JumpBackHash | O(1) | O(1) expected | O(1) | Not native |

Replication of keys
- Hash ring: replicate by walking clockwise to the next R distinct nodes. Virtual nodes help spread replicas evenly and avoid hotspots.
- Rendezvous hashing: replicate by selecting the top R nodes by score for the key (see the sketch after this list). This naturally yields R distinct owners and supports weights.
- Jump consistent hash: the base function returns one bucket. Replication can be achieved by hashing (key, replica_index) and collecting R distinct buckets; this is simple but lacks the single-pass global ranking HRW provides.
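
As an illustration of the rendezvous (HRW) variant above, a minimal sketch could look like the following (a hypothetical helper, not part of this crate):

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

/// Return the R nodes that own `key` under rendezvous (HRW) hashing:
/// score every node for the key and keep the R highest-scoring nodes.
fn rendezvous_top_r(key: u64, nodes: &[u64], r: usize) -> Vec<u64> {
    let mut scored: Vec<(u64, u64)> = nodes
        .iter()
        .map(|&node| {
            let mut h = DefaultHasher::new();
            (key, node).hash(&mut h);
            (h.finish(), node) // (score, node)
        })
        .collect();
    // Highest score first; the top R entries are the replica owners.
    scored.sort_unstable_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().take(r).map(|(_, node)| node).collect()
}
```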

Why replication matters
- Tolerates node failures and maintenance without data unavailability.
- Distributes read/write load across multiple owners, reducing hotspots.
- Enables fast recovery and higher tail-latency resilience.

## N-Choose-R replication

We define consistent `n-choose-k` replication (where `k` plays the role of `R` above) as follows:

1. for a given number `n` of nodes, choose `k` distinct nodes `S`.
2. for a given `key`, the chosen set of nodes must be uniformly chosen from all possible sets of size `k`.
3. when `n` increases by one, exactly one node in the chosen set is replaced with probability `k/(n+1)`; otherwise the set is unchanged.

For example, going from `n = 9` to `n = 10` nodes with `k = 3`, the chosen set changes for a given key (one member is swapped out for the new node) with probability `3/10`; for all other keys the set stays exactly the same.

For simplicity, nodes are represented by integers `0..n`.
Given `k` independent consistent hash functions `h_i(n)` for a given key, the following algorithm has the desired properties:

```rust
fn consistent_choose_k<Key: Copy>(key: Key, k: usize, n: usize) -> Vec<usize> {
    (0..k)
        .rev()
        .scan(n, |n, i| {
            *n = consistent_choose_next(key, i, *n);
            Some(*n)
        })
        .collect()
}

fn consistent_choose_next<Key: Copy>(key: Key, i: usize, n: usize) -> usize {
    (0..=i)
        .map(|j| consistent_hash(key, j, n - j) + j)
        .max()
        .expect("the range 0..=i is never empty")
}

fn consistent_hash<Key>(key: Key, i: usize, n: usize) -> usize {
    // Compute the i-th independent consistent hash for `key` and `n` nodes,
    // i.e. `h_i(n)`, a value in `0..n`.
    unimplemented!()
}
```
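
To make the recipe concrete, here is a self-contained, hypothetical instantiation (not the implementation in this crate) that plugs in Lamping and Veach's jump consistent hash as the per-replica consistent hash function and derives the `k` "independent" functions by mixing the replica index into the key, which only approximates true independence:

```rust
// Jump consistent hash (Lamping & Veach): maps `key` to a bucket in 0..n.
fn jump_consistent_hash(mut key: u64, n: u64) -> u64 {
    let (mut b, mut j) = (-1i64, 0i64);
    while j < n as i64 {
        b = j;
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * ((1u64 << 31) as f64 / ((key >> 33) + 1) as f64)) as i64;
    }
    b as u64
}

// Hypothetical way to derive the i-th "independent" consistent hash function:
// mix the replica index into the key before hashing.
fn consistent_hash(key: u64, i: usize, n: usize) -> usize {
    jump_consistent_hash(key ^ (i as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15), n as u64) as usize
}

fn consistent_choose_next(key: u64, i: usize, n: usize) -> usize {
    (0..=i).map(|j| consistent_hash(key, j, n - j) + j).max().unwrap()
}

fn consistent_choose_k(key: u64, k: usize, n: usize) -> Vec<usize> {
    (0..k)
        .rev()
        .scan(n, |n, i| {
            *n = consistent_choose_next(key, i, *n);
            Some(*n)
        })
        .collect()
}

fn main() {
    // With 10 nodes and k = 3, every key is mapped to 3 distinct nodes,
    // and the assignment changes only minimally as `n` grows.
    for key in 0..5u64 {
        println!(
            "key {key}: {:?} -> {:?}",
            consistent_choose_k(key, 3, 10),
            consistent_choose_k(key, 3, 11)
        );
    }
}
```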

lib.rs
@@ -0,0 +1,292 @@
use std::hash::{DefaultHasher, Hash, Hasher};

/// One building block for the consistent hashing algorithm is a consistent
/// hash iterator which enumerates all the hashes for a given key for a specific bucket.
/// A bucket covers the range `(1<<bit)..(2<<bit)`.
#[derive(Default)]
struct BucketIterator {
    hasher: DefaultHasher,
    n: usize,
    is_first: bool,
    bit: u64,
}

impl BucketIterator {
    fn new(key: u64, n: usize, bit: u64) -> Self {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        bit.hash(&mut hasher);
        Self {
            hasher,
            n,
            is_first: true,
            bit,
        }
    }
}

impl Iterator for BucketIterator {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        if self.bit == 0 {
            return None;
        }
        if self.is_first {
            let res = self.hasher.finish() % self.bit + self.bit;
            if res < self.n as u64 {
                self.n = res as usize;
                return Some(self.n);
            }
            self.is_first = false;
        }
        loop {
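            // Hash a fixed constant into the hasher to advance its internal state,
            // yielding the next pseudo-random value for this bucket on `finish()`.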
            478392.hash(&mut self.hasher);
            let res = self.hasher.finish() % (self.bit * 2);
            if res & self.bit == 0 {
                return None;
            }
            if res < self.n as u64 {
                self.n = res as usize;
                return Some(self.n);
            }
        }
    }
}

/// An iterator which enumerates all the consistent hashes for a given key
/// from largest to smallest in the range `0..n`.
pub struct ConsistentHashRevIterator {
    bits: u64,
    key: u64,
    n: usize,
    inner: BucketIterator,
}

impl ConsistentHashRevIterator {
    pub fn new(key: u64, n: usize) -> Self {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
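        // Each set bit of `bits` selects one power-of-two bucket that will be probed
        // for this key, from the highest bucket down, when enumerating hashes below `n`.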
        let bits = hasher.finish() % n.next_power_of_two() as u64;
        let inner = BucketIterator::default();
        Self {
            bits,
            key,
            n,
            inner,
        }
    }
}

impl Iterator for ConsistentHashRevIterator {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        if self.n == 0 {
            return None;
        }
        if let Some(res) = self.inner.next() {
            return Some(res);
        }
        while self.bits > 0 {
            let bit = 1 << self.bits.ilog2();
            self.bits ^= bit;
            self.inner = BucketIterator::new(self.key, self.n, bit);
            if let Some(res) = self.inner.next() {
                return Some(res);
            }
        }
        self.n = 0;
        Some(self.n)
    }
}

/// Same as `ConsistentHashRevIterator`, but iterates from smallest to largest
/// for the range `n..`.
pub struct ConsistentHashIterator {
    bits: u64,
    key: u64,
    n: usize,
    stack: Vec<usize>,
}

impl ConsistentHashIterator {
    pub fn new(key: u64, n: usize) -> Self {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let mut bits = hasher.finish() as u64;
        bits &= !((n + 2).next_power_of_two() as u64 / 2 - 1);
        let stack = if n == 0 { vec![0] } else { vec![] };
        Self {
            bits,
            key,
            n,
            stack,
        }
    }
}

impl Iterator for ConsistentHashIterator {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(res) = self.stack.pop() {
            return Some(res);
        }
        while self.bits > 0 {
            let bit = self.bits & !(self.bits - 1);
            self.bits &= self.bits - 1;
            let inner = BucketIterator::new(self.key, bit as usize * 2, bit);
            self.stack = inner.take_while(|x| *x >= self.n).collect();
            if let Some(res) = self.stack.pop() {
                return Some(res);
            }
        }
        None
    }
}

/// Wrapper around `ConsistentHashIterator` and `ConsistentHashRevIterator` to compute
/// the next or previous consistent hash for a given key and a given number of nodes `n`.
pub struct ConsistentHasher {
    key: u64,
}

impl ConsistentHasher {
    pub fn new(key: u64) -> Self {
        Self { key }
    }

    pub fn prev(&self, n: usize) -> usize {
        let mut sampler = ConsistentHashRevIterator::new(self.key, n);
        sampler.next().expect("n must be > 0!")
    }

    pub fn next(&self, n: usize) -> usize {
        let mut sampler = ConsistentHashIterator::new(self.key, n);
        sampler.next().expect("Exceeded iterator bounds :(")
    }
}

/// Implementation of a consistent choose k hashing algorithm.
/// It returns k distinct consistent hashes in the range `0..n`.
/// The hashes are consistent when `n` changes and when `k` changes!
/// I.e. on average exactly `1/(n+1)` (resp. `1/(k+1)`) many hashes will change
/// when `n` (resp. `k`) increases by one. Additionally, the returned `k` tuple
/// is guaranteed to be uniformly chosen from all possible `n-choose-k` tuples.
pub struct ConsistentChooseKHasher {
    key: u64,
    k: usize,
}

impl ConsistentChooseKHasher {
    pub fn new(key: u64, k: usize) -> Self {
        Self { key, k }
    }

    // TODO: Implement this as an iterator!
    pub fn prev(&self, mut n: usize) -> Vec<usize> {
        let mut samples = Vec::with_capacity(self.k);
        let mut samplers: Vec<_> = (0..self.k)
            .map(|i| ConsistentHashRevIterator::new(self.key + 43987492 * i as u64, n - i).peekable())
            .collect();
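        // Pick values from the largest replica index down: each round takes the maximum
        // over the first i+1 samplers (each offset by its index) and then shrinks `n` to
        // that value, so the next round picks a strictly smaller node. This produces k
        // distinct values, returned in ascending order after the final sort.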
        for i in (0..self.k).rev() {
            let mut max = 0;
            for k in 0..=i {
                while samplers[k].peek() >= Some(&(n - k)) && n - k > 0 {
                    samplers[k].next();
                }
                max = max.max(samplers[k].peek().unwrap() + k);
            }
            samples.push(max);
            n = max;
        }
        samples.sort();
        samples
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_uniform_1() {
        for k in 0..100 {
            let sampler = ConsistentHasher::new(k);
            for n in 0..1000 {
                assert!(sampler.prev(n + 1) <= sampler.prev(n + 2));
                let next = sampler.next(n);
                assert_eq!(next, sampler.prev(next + 1));
            }
            let mut iter_rev: Vec<_> = ConsistentHashIterator::new(k, 0)
                .take_while(|x| *x < 1000)
                .collect();
            iter_rev.reverse();
            let iter: Vec<_> = ConsistentHashRevIterator::new(k, 1000).collect();
            assert_eq!(iter, iter_rev);
        }
        let mut stats = vec![0; 13];
        for i in 0..100000 {
            let sampler = ConsistentHasher::new(i);
            let x = sampler.prev(stats.len());
            stats[x] += 1;
        }
        println!("{stats:?}");
    }

    #[test]
    fn test_uniform_k() {
        const K: usize = 3;
        for k in 0..100 {
            let sampler = ConsistentChooseKHasher::new(k, K);
            for n in K..1000 {
                let samples = sampler.prev(n + 1);
                assert!(samples.len() == K);
                for i in 0..K - 1 {
                    assert!(samples[i] < samples[i + 1]);
                }
                let next = sampler.prev(n + 2);
                for i in 0..K {
                    assert!(samples[i] <= next[i]);
                }
                let mut merged = samples.clone();
                merged.extend(next.clone());
                merged.sort();
                merged.dedup();
                assert!(
                    merged.len() == K || merged.len() == K + 1,
                    "Unexpected {samples:?} vs. {next:?}"
                );
            }
        }
        let mut stats = vec![0; 8];
        for i in 0..32 {
            let sampler = ConsistentChooseKHasher::new(i + 32783, 2);
            let samples = sampler.prev(stats.len());
            for s in samples {
                stats[s] += 1;
            }
        }
        println!("{stats:?}");
        // Test consistency when increasing k!
        for k in 1..10 {
            for n in k + 1..20 {
                for key in 0..1000 {
                    let sampler1 = ConsistentChooseKHasher::new(key, k);
                    let sampler2 = ConsistentChooseKHasher::new(key, k + 1);
                    let set1 = sampler1.prev(n);
                    let set2 = sampler2.prev(n);
                    assert_eq!(set1.len(), k);
                    assert_eq!(set2.len(), k + 1);
                    let mut merged = set1.clone();
                    merged.extend(set2);
                    merged.sort();
                    merged.dedup();
                    assert_eq!(merged.len(), k + 1);
                }
            }
        }
    }
}
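
For reference, here is a small usage sketch of the API introduced above, mirroring the tests; this snippet is illustrative, not part of the PR, and assumes the types above are in scope:

```rust
fn main() {
    // Single owner per key: ConsistentHasher::prev(n) returns a node in 0..n.
    let hasher = ConsistentHasher::new(42);
    let owner_10 = hasher.prev(10);
    let owner_11 = hasher.prev(11);
    // Consistency: when a node is added, the owner either stays the same
    // or moves to the newly added node.
    assert!(owner_11 == owner_10 || owner_11 == 10);

    // k owners per key: ConsistentChooseKHasher::prev(n) returns k distinct nodes
    // in ascending order.
    let choose3 = ConsistentChooseKHasher::new(42, 3);
    let owners = choose3.prev(10);
    assert_eq!(owners.len(), 3);
    println!("owner for key 42 of 10 nodes: {owner_10}, replica set: {owners:?}");
}
```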