Async FilesystemStore #3931
Conversation
👋 Thanks for assigning @TheBlueMatt as a reviewer!
```rust
let this = Arc::clone(&self.inner);

Box::pin(async move {
	tokio::task::spawn_blocking(move || {
```
Mhh, so I'm not sure if spawning blocking tasks for every IO call is the way to go (see for example https://docs.rs/tokio/latest/tokio/fs/index.html#tuning-your-file-io: "To get good performance with file IO on Tokio, it is recommended to batch your operations into as few spawn_blocking calls as possible."). Maybe there are other designs that we should at least consider before moving forward with this approach. For example, we could create a dedicated pool of longer-lived worker task(s) that process a queue?
If we use spawn_blocking, can we give the user control over exactly which runtime this will be spawned on? Also, rather than just doing wrapping, should we be using tokio::fs?
Mhh, so I'm not sure if spawning blocking tasks for every IO call is the way to go (see for example https://docs.rs/tokio/latest/tokio/fs/index.html#tuning-your-file-io: "To get good performance with file IO on Tokio, it is recommended to batch your operations into as few spawn_blocking calls as possible.").
If we should batch operations, I think the current approach is better than using tokio::fs, because it already batches the various operations inside kvstoresync::write.
Further batching probably needs to happen at a higher level in LDK, and might be a bigger change. Not sure if that is worth it just for FilesystemStore, especially when that store is not the preferred store for real-world usage?
For example, we could create a dedicated pool of longer-lived worker task(s) that process a queue?
Isn't Tokio doing that already when a task is spawned?
If we use spawn_blocking, can we give the user control over which runtime this exactly will be spawned on? Also, rather than just doing wrapping, should we be using tokio::fs?
With tokio::fs, the current runtime is used. I'd think that is then also sufficient if we spawn ourselves, without a need to specify which runtime exactly?
More generally, I think the main purpose of this PR is to show how an async kvstore could be implemented, and to have something for testing potentially. Additionally, if there are users that really want to use this type of store in production, they could. But I don't think it is something to spend too much time on. A remote database is probably the more important target to design for.
With tokio::fs, the current runtime is used. I'd think that is then also sufficient if we spawn ourselves, without a need to specify which runtime exactly?
Hmm, I'm not entirely sure, especially for users that have multiple runtime contexts floating around, it might be important to make sure the store uses a particular one (cc @domZippilli ?). I'll also have to think through this for LDK Node when we make the switch to async KVStore there, but happy to leave as-is for now.
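For illustration only: one way to give the user that control is to capture a tokio::runtime::Handle at construction time and spawn the blocking work on it explicitly. This is just a sketch of the idea raised above; the HandleBoundStore type and run_blocking helper are made-up names and not part of this PR or of LDK.

```rust
use std::sync::Arc;
use tokio::runtime::Handle;

// Hypothetical store wrapper that is pinned to one runtime: blocking work is
// always spawned via the captured Handle, regardless of which runtime context
// the caller happens to be in.
struct HandleBoundStore<S> {
	inner: Arc<S>,
	handle: Handle,
}

impl<S: Send + Sync + 'static> HandleBoundStore<S> {
	fn new(inner: Arc<S>, handle: Handle) -> Self {
		Self { inner, handle }
	}

	async fn run_blocking<R: Send + 'static>(
		&self, job: impl FnOnce(Arc<S>) -> R + Send + 'static,
	) -> R {
		let inner = Arc::clone(&self.inner);
		// Handle::spawn_blocking targets the captured runtime's blocking pool
		// instead of whichever runtime happens to be current at the call site.
		self.handle.spawn_blocking(move || job(inner)).await.expect("blocking task panicked")
	}
}
```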
lightning/src/util/persist.rs (Outdated)

```diff
 }

 /// Provides additional interface methods that are required for [`KVStore`]-to-[`KVStore`]
 /// data migration.
-pub trait MigratableKVStore: KVStore {
+pub trait MigratableKVStore: KVStoreSync {
```
How will we solve this for a KVStore?
I think this comment belongs in #3905?
We might not need to solve it now, as long as we still require a sync implementation alongside an async one? If we support async-only kvstores, then we can create an async version of this trait?
Removed garbage collector, because we need to keep the last written version.
Codecov Report

❌ Patch coverage is …

```
@@            Coverage Diff             @@
##             main    #3931      +/-   ##
==========================================
- Coverage   88.77%   88.77%   -0.01%
==========================================
  Files         175      175
  Lines      127760   128086     +326
  Branches   127760   128086     +326
==========================================
+ Hits       113425   113714     +289
- Misses      11780    11807      +27
- Partials     2555     2565      +10
```
Updated code to not use an async wrapper, but conditionally expose the async … I didn't yet update the …
Rebased to see if fuzz error disappears.
```rust
let inner_lock_ref: Arc<RwLock<AsyncState>> = self.get_inner_lock_ref(dest_file_path);

let new_version = {
	let mut async_state = inner_lock_ref.write().unwrap();
```
Bleh, this means that if there's a write happening for a key and another write starts for the same key, the task spawning the second async write will end up blocking until the first write completes. This should be easy to remedy by moving the lock onto just the latest_written_version field and making the latest_version field an atomic.
I don't know about also adding an atomic; I see new edge cases coming towards me then, especially because we now also use the latest version to determine if there are writes in flight.
Isn't it acceptable to block in case the same file is written again, and is that even likely to happen? The big win is parallel writes to different files, and we got that.
Hmm, I don't see using an atomic for this as that complicated. Specifically, I don't think we're actually relying on the lock here at all.
I was thinking of the line

```rust
let more_writes_pending = async_state.latest_written_version < async_state.latest_version;
```

I think there are gaps again when using an atomic?
No, because we also check the Arc reference count. Basically, the steps are (a) take the top-level lock and with that lock get a reference, then (b) get a version number and do the write, and finally (c) take the top-level lock and with that lock check if there are other references not yet complete. In fact, for the purpose of cleaning the map, I don't think we need to be looking at the version at all. The version really only needs to order the writes.
In fact, it might be more readable to explicitly disentangle the version numbers from the map cleanup, separating the concepts.
I thought it was a nice simplification to remove the inflight_write
counter and instead derive that value from the latest_version
and latest_written_version
. But if I understand you correctly, you are suggesting that the Arc
is already the inflight_write
counter. Because of that earlier refactor where the inner lock ref is only obtained once and passed in the future, it indeed seems to work exactly like that. Cool. Added fixup.
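To make the agreed scheme concrete, here is a rough sketch of steps (a)–(c) as described above. The Store and AsyncState types, the do_write callback, and the exact cleanup check are illustrative assumptions, not the actual code in this PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

// Hypothetical per-key state; the real per-key state in the PR tracks more.
#[derive(Default)]
struct AsyncState {
	// Versions only order competing writes to the same key.
	latest_version: u64,
	latest_written_version: u64,
}

#[derive(Default)]
struct Store {
	// Top-level map from destination path to per-key state.
	locks: Mutex<HashMap<String, Arc<RwLock<AsyncState>>>>,
}

impl Store {
	fn write(&self, path: &str, do_write: impl FnOnce(u64)) {
		// (a) Take the top-level lock and, while holding it, get a reference
		// to the per-key state.
		let inner_lock_ref =
			Arc::clone(self.locks.lock().unwrap().entry(path.to_string()).or_default());

		// (b) Obtain a version number under the per-key lock, then do the write.
		let version = {
			let mut state = inner_lock_ref.write().unwrap();
			state.latest_version += 1;
			state.latest_version
		};
		do_write(version);
		{
			let mut state = inner_lock_ref.write().unwrap();
			if version > state.latest_written_version {
				state.latest_written_version = version;
			}
		}

		// (c) Re-take the top-level lock; the Arc strong count doubles as the
		// in-flight counter: one reference belongs to the map, one to us, and
		// anything beyond that is a concurrent writer, so the entry must stay.
		let mut locks = self.locks.lock().unwrap();
		if Arc::strong_count(&inner_lock_ref) <= 2 {
			locks.remove(path);
		}
	}
}
```

In this sketch the version numbers only order writes, while the reference count alone decides whether the map entry can be dropped, which is the disentanglement discussed above.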
Fuzzer found an issue, fixed in fixup commit "f: fix remove clean up".
Took a first look at the fuzzer parts. I wonder if we would get any notable performance benefit from running the FilesystemStore fuzzer on a ramdisk? Or would we even lose some coverage going this way as it's exactly the IO latency that increases the chances of running into race conditions etc?
fuzz/Cargo.toml (Outdated)

```toml
bech32 = "0.11.0"
bitcoin = { version = "0.32.2", features = ["secp-lowmemory"] }
tokio = { version = "1.35.*", default-features = false, features = ["rt-multi-thread"] }
```
nit: This is more common (note not 100% equivalent, but probably preferable):
```diff
-tokio = { version = "1.35.*", default-features = false, features = ["rt-multi-thread"] }
+tokio = { version = "1.35", default-features = false, features = ["rt-multi-thread"] }
```
Or is there any reason we don't want any API-compatible version 1.36 and above?
Yes, it doesn't work with rust 1.63
Yes, it doesn't work with rust 1.63
Huh, but why can we get away with 1.35 below in the actual lightning-persister dependency then? Also, while the * works, you'd usually rather see ~1.35 used.
For some reason, the compiler decided that 1.35 could safely be bumped to 1.47. Also happened in CI.

```
error: package `tokio v1.47.1` cannot be built because it requires rustc 1.70 or newer, while the currently active rustc version is 1.63.0
```

```
~/repo/rust-lightning/fuzz (async-fsstore ✗) cargo tree -i tokio
tokio v1.47.1
├── lightning-fuzz v0.0.1 (/Users/joost/repo/rust-lightning/fuzz)
└── lightning-persister v0.2.0+git (/Users/joost/repo/rust-lightning/lightning-persister)
    └── lightning-fuzz v0.0.1 (/Users/joost/repo/rust-lightning/fuzz)
```
Made it ~1.35, reads nicer indeed.
fuzz/src/bin/fs_store_target.rs (Outdated)

```rust
use lightning_fuzz::utils::test_logger::StringBuffer;

use std::sync::{atomic, Arc};
// {
```
nit: Remove commented-out code.
Ah yes. I was still wondering what that code was for. Some default fuzz string sanity check?
fuzz/src/fs_store.rs (Outdated)

```rust
let secondary_namespace = "secondary";
let key = "key";

// Remove the key in case something was left over from a previous run.
```
Hmm, rather than doing this, do we want to add a random suffix to temp_path above, so that we're sure to start with a clean directory every time? Also, do we want to clean up the filesystem store directory at the end of the run, similar to what we do in the lightning-persister tests?
Added random suffixes. It is also necessary because fuzzing runs in parallel. I used uuid for simplicity, but can also generate names differently if preferred.
Also added cleanup. I couldn't just copy the Drop implementation, because FilesystemStore isn't in the same crate, so I created a wrapper for it. Maybe there is a better way to do it.
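As an illustration of such a wrapper, assuming FilesystemStore::new takes the data directory as a PathBuf and using a made-up TempFilesystemStore name, the fuzz target could bundle the store with its per-run directory roughly like this:

```rust
use std::path::PathBuf;

use lightning_persister::fs_store::FilesystemStore;

// Hypothetical fuzz-only wrapper: Drop can't be implemented for a type from
// another crate, so the store and its per-run directory are bundled together
// and the directory is removed when the wrapper is dropped.
struct TempFilesystemStore {
	store: FilesystemStore,
	path: PathBuf,
}

impl TempFilesystemStore {
	fn new(path: PathBuf) -> Self {
		Self { store: FilesystemStore::new(path.clone()), path }
	}
}

impl Drop for TempFilesystemStore {
	fn drop(&mut self) {
		// Best-effort cleanup; ignore errors if the directory was never created.
		let _ = std::fs::remove_dir_all(&self.path);
	}
}
```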
```rust
let fut = futures.remove(fut_idx);

fut.await.unwrap();
},
```
It shouldn't change anything, but do we want to throw in some coverage for KVStore::list for good measure?
Added. Only, I don't think we can assert anything because things may be in flight. It does add some extra variation to the test to also list during async ops.
Also added read. Same story, nothing to assert, but we do cover read execution during writes.
Considered the RAM disk, but it is platform-specific. @TheBlueMatt suggested an alternative option, which is to allow injection of the actual disk write handler into FilesystemStore and supply an in-memory implementation for fuzzing. But perhaps we are stretching the scope of this PR too much then, so wanted to see if we can keep it to what it is currently?
Fuzz passes, but some deviating log lines show up: …
Using /dev/shm as a ramdisk, if present, fixed the timeouts.
Tested with a RAM disk on macOS using the tool https://github.com/conorarmstrong/macOS-ramdisk, to see if it isn't too fast now to catch problems. I think it is ok. On my machine the RAM disk is about 10x faster than disk. Also when removing the …
Async filesystem store with eventually consistent writes. It is just using tokio's spawn_blocking, because that is what tokio::fs would otherwise do as well. Using tokio::fs would make it complicated to reuse the sync code.

ldk-node try out: lightningdevkit/ldk-node@main...joostjager:ldk-node:async-fsstore
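As a rough sketch of that design, using a hypothetical SyncStore trait and write_async helper in place of the actual LDK traits, wrapping the existing sync write in spawn_blocking could look like this:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

// Hypothetical stand-in for the sync store; the real sync KVStore trait in LDK
// has namespacing arguments and more methods than this.
trait SyncStore: Send + Sync + 'static {
	fn write_sync(&self, key: String, value: Vec<u8>) -> std::io::Result<()>;
}

// The wrapping pattern described above: each async write hands the existing
// sync implementation off to tokio's blocking thread pool, so the sync code
// (including its internal sequence of operations) is reused as-is.
fn write_async<S: SyncStore>(
	store: &Arc<S>, key: String, value: Vec<u8>,
) -> Pin<Box<dyn Future<Output = std::io::Result<()>> + Send>> {
	let this = Arc::clone(store);
	Box::pin(async move {
		tokio::task::spawn_blocking(move || this.write_sync(key, value))
			.await
			.expect("blocking task panicked")
	})
}
```

Compared to rewriting on top of tokio::fs, this keeps the sync write path as the single implementation, which is the trade-off discussed in the review thread above.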