Hi @rob-p, hope you're doing well!
I have been trying to build an index for the first time in a while, and so I'm returning to an issue that has been raised a couple of times, for example here: #161.
I've put in a bit more debugging effort than previously, and have managed to get it to work for me. In my case, I had to use /tmp rather than /scratch, as it turns out /scratch for me is (I think) a super-fast NFS server and therefore has the same issues.
Here are a few bits of information that will hopefully be interesting to you:
- In my case, the main problem is not how long the process takes, but the fact that many tiny / empty files named cuttlefish-path-output-... are created, but not all deleted, by `simpleaf index`. This becomes a real problem, as at some point you hit 1M files and you get kicked off the cluster 😅
- I tried using `/scratch` as working and index directory and had the same problem, as it turns out our scratch is fast but non-local.
- Using `/tmp` worked! I checked the directory and there were the same cuttlefish-path-output-... files as before, but deletion was keeping up with creation.
- Another strange observation is that this problem didn't happen in an interactive job on the cluster, independent of working directory.
I'm wondering about easy ways to fix this. A thought is something like:
- Add a `local_dir` parameter to `simpleaf index`, and then once it is finished, copy the outputs over to the `output_dir`.
- Include some testing that the `local_dir` really is local.
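To make the first idea concrete, here is a rough Python sketch of the "build locally, then copy" pattern. It is not how simpleaf works internally: `build_in_local_dir`, the `build` callable and `local_root` are hypothetical names, and I'm assuming the build step can report which directory holds the finished outputs, so that leftover intermediate files are not copied along.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def build_in_local_dir(
    build: Callable[[Path], Path],
    output_dir: Path,
    local_root: Optional[Path] = None,
) -> Path:
    """Run `build` in a temporary directory, then copy its results to `output_dir`.

    `build` (hypothetical) receives the working directory and returns the
    directory that contains the finished outputs.
    """
    # With dir=None, tempfile uses the system default location (often /tmp).
    root = str(local_root) if local_root is not None else None
    work_dir = Path(tempfile.mkdtemp(prefix="index-build-", dir=root))
    try:
        finished = build(work_dir)
        output_dir.mkdir(parents=True, exist_ok=True)
        # Copy only the finished outputs, not the whole working directory.
        shutil.copytree(finished, output_dir, dirs_exist_ok=True)
    finally:
        # Always clean up the working directory, even if the build fails.
        shutil.rmtree(work_dir, ignore_errors=True)
    return output_dir
```

In practice `build` would wrap the actual indexing call; the point is only that all the small-file churn happens under `local_root`, and the shared filesystem sees a single copy at the end. `dirs_exist_ok` needs Python 3.8 or newer.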
I imagine the second part is a bit more tricky. Maybe a possible approach is not to check whether the dir is local, but to check that it is sufficiently fast? And/or include checks of whether the number of cuttlefish-path-output files is getting out of hand? It feels like something should be possible...
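Both checks could look roughly like the Python sketch below. The function names and the probe size are made up for illustration, and I'm assuming the cuttlefish-path-output-... files sit directly in the working directory rather than in subdirectories.

```python
import os
import time
import uuid
from pathlib import Path

# Prefix taken from the file names observed above.
PREFIX = "cuttlefish-path-output-"


def count_temp_files(directory: Path, prefix: str = PREFIX) -> int:
    """Count entries directly inside `directory` whose names start with `prefix`."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.name.startswith(prefix))


def create_delete_seconds(directory: Path, n_files: int = 200) -> float:
    """Time how long it takes to create and delete `n_files` empty files."""
    tag = uuid.uuid4().hex  # avoids clashing with existing files
    start = time.perf_counter()
    for i in range(n_files):
        path = directory / f".probe-{tag}-{i}"
        path.touch()
        path.unlink()
    return time.perf_counter() - start
```

The probe could run once before indexing starts, and the count could be polled during the run, aborting or warning once it passes some limit. Sensible thresholds for either would have to be found empirically; I don't have numbers to suggest.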
Hope this is helpful!
Will