Assertion panic in lock.rs #351

@eugenels

Description

We're experiencing an intermittent panic caused by an assertion failure inside dashmap:

thread '<unnamed>' panicked at /home/vsts/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dashmap-5.5.3/src/lock.rs:149:9:
assertion `left == right` failed
  left: 139610391675444
 right: 18446744073709551612

Usage Context
We're using DashMap in a relatively straightforward way. The core usage pattern looks like this:

fn dispatch_event(self: &Arc<Self>, traced: WalEventTraced) {
    let dir_name = traced.event.dir_name().clone();
    // Get or lazily create the per-directory sender; the first caller for a
    // given key also spawns the processing task that drains the channel.
    let sender = self
        .wal_event_senders
        .entry(dir_name.clone())
        .or_insert_with(|| {
            let (send, recv) = mpsc::unbounded_channel::<WalEventTraced>();
            self.handle
                .spawn(process_events_task(self.clone(), dir_name, recv));
            send
        });

    if let Err(err) = sender.send(traced) {
        error!(
            "could not send WAL event to processing task, channel closed [event={:?}]",
            err.0.event
        );
    }
}

At the same time, we have other threads that may call self.wal_event_senders.remove(key) on the same map.

Stack Trace Snippet
Here is the relevant portion of the stack trace:

 18:     0x7efedc2adea4 - dashmap::lock::RawRwLock::unlock_exclusive_slow::h5154677c46b217b2
  19:     0x7efedc337127 - <dashmap::lock::RawRwLock as lock_api::rwlock::RawRwLock>::unlock_exclusive::h269693121521c35b
                               at /home/vsts/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dashmap-5.5.3/src/lock.rs:50:18
  20:     0x7efedc337127 - <lock_api::rwlock::RwLockWriteGuard<R,T> as core::ops::drop::Drop>::drop::hf6a213ea61c660a5
                               at /home/vsts/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/lock_api-0.4.12/src/rwlock.rs:1714:29
  21:     0x7efedc337127 - core::ptr::drop_in_place<lock_api::rwlock::RwLockWriteGuard<dashmap::lock::RawRwLock,hashbrown::map::HashMap<qdb_ent::wal::table_dir_name::TableDirName,dashmap::util::SharedValue<tokio::sync::mpsc::unbounded::UnboundedSender<qdb_ent::wal::uploader::WalEventTraced>>,std::hash::random::RandomState>>>::hdf16027c12b9d4b5
                               at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/ptr/mod.rs:799:1
  22:     0x7efedc337127 - core::ptr::drop_in_place<dashmap::mapref::one::RefMut<qdb_ent::wal::table_dir_name::TableDirName,tokio::sync::mpsc::unbounded::UnboundedSender<qdb_ent::wal::uploader::WalEventTraced>>>::h22bd76f32123c89a
                               at /rustc/29483883eed69d5fb4db01964cdf2af4d86e9cb2/library/core/src/ptr/mod.rs:799:1

Question

We assume that calling .remove(key) concurrently with .entry(key).or_insert_with(...) is safe and supported by DashMap's design. Could some specific interleaving or timing of this pattern corrupt the internal lock state or its reader/writer count, leading to the assertion panic we're seeing?

Notes

  • We are using dashmap v5.5.3.
  • The panic appears rarely and is not consistently reproducible.
  • No unsafe code is used in our project.
  • Please let us know if there are known issues with this version, or whether upgrading is likely to help.

Thanks for any insight you can provide!
