Open
Labels: topic: bloom_filter (issues related to bloom_filter); type: bug (something isn't working)
Description
I am seeing unexpected behavior when using cuco::bloom_filter with pattern_bits = 4. Compared to pattern_bits = 8 at a constant 'load factor' (i.e., the fraction of bits set in the filter), the false positive rate (FPR) degrades far more than expected. The issue may be related to the bit pattern selection.
The following code demonstrates the issue:
#include <cuco/bloom_filter.cuh>

#include <iostream>
#include <thrust/count.h>
#include <thrust/device_vector.h>
#include <thrust/sequence.h>

// 'Blocked' filter policy with 8B blocks
using policy_t = cuco::default_filter_policy<cuco::xxhash_64<uint32_t>, uint64_t, 1>;
using bf_t =
  cuco::bloom_filter<uint32_t, cuco::extent<std::size_t>, cuda::thread_scope_device, policy_t>;

constexpr size_t bits_per_block   = 64;
constexpr uint32_t pattern_bits_A = 4;
constexpr uint32_t pattern_bits_B = 8;
constexpr size_t bits_per_key_A   = 2 * pattern_bits_A;
constexpr size_t bits_per_key_B   = 2 * pattern_bits_B;

int main()
{
  // Initialize non-overlapping build and probe key sets
  thrust::device_vector<uint32_t> build_keys(1U << 20U);
  thrust::device_vector<uint32_t> probe_keys(1U << 25U);
  thrust::device_vector<bool> flags_A(1U << 25U, false);
  thrust::device_vector<bool> flags_B(1U << 25U, false);
  thrust::sequence(build_keys.begin(), build_keys.end(), 0, 2);
  thrust::sequence(probe_keys.begin(), probe_keys.end(), 1, 2);

  // Specify pattern bits for the policy
  policy_t policy_A(pattern_bits_A);
  bf_t filter_A(cuda::ceil_div(bits_per_key_A * build_keys.size(), bits_per_block), {}, policy_A);
  filter_A.add(build_keys.begin(), build_keys.end());
  filter_A.contains(probe_keys.begin(), probe_keys.end(), flags_A.begin());
  size_t fps_A   = thrust::count(flags_A.begin(), flags_A.end(), true);
  double_t fpr_A = 100.0 * fps_A / flags_A.size();
  std::cout << "FPR A: " << fpr_A << "\n";

  policy_t policy_B(pattern_bits_B);
  bf_t filter_B(cuda::ceil_div(bits_per_key_B * build_keys.size(), bits_per_block), {}, policy_B);
  filter_B.add(build_keys.begin(), build_keys.end());
  filter_B.contains(probe_keys.begin(), probe_keys.end(), flags_B.begin());
  size_t fps_B   = thrust::count(flags_B.begin(), flags_B.end(), true);
  double_t fpr_B = 100.0 * fps_B / flags_B.size();
  std::cout << "FPR B: " << fpr_B << "\n";

  return 0;
}

Observed Behavior:
FPR A: 16.9311
FPR B: 0.611573
Expected Behavior:
The FPR should increase more smoothly with decreasing pattern_bits / filter size. This configuration of 8B blocks with 4 bits being set per key is common (arrow/acero) and is not expected to produce such a high FPR with a 'load factor' of 0.5.
Environment:
- Cuco version: 0.0.1
- CUDA version: 12.2
- Compiler: gcc 11.4.0
- GPU: L4
- OS: Ubuntu
I would appreciate any insight into what might be causing this, or a pointer if I'm missing something. Thanks!