Decide which techniques to use for transferring the data that jobs require between clients. BitTorrent seems promising, but I am not sure how fast its initialization is, or how costly peer discovery and communication overhead are. Also check how Hadoop handles this. How saturated should the mesh be? There should be some mechanism for labeling data by rarity, reproducibility, and cost; to some extent this can perhaps be derived from the requirements of the jobs that have run. How should data be addressed and catalogued? Also, NSQ uses Snappy or DEFLATE for message compression, but zstd may perform better; alternatively, the codec could be chosen based on how frequently the data is used.
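
To make the addressing and labeling questions more concrete, here is one possible sketch, not a decided design. It assumes content addressing (the SHA-256 of the bytes is the key) and a small in-memory catalogue that stores the labels mentioned above. The names `Catalog`, `CatalogEntry`, `replicas`, `hot_reads`, and the level choices are all hypothetical. Since zstd is not assumed to be available, stdlib `zlib` (DEFLATE) stands in to show the idea of picking compression effort from usage frequency.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Labels attached to one blob (all fields are illustrative assumptions)."""
    size: int                   # payload size in bytes
    replicas: int = 1           # known copies in the mesh; low means rare
    reproducible: bool = False  # True if a job could regenerate the data
    cost: float = 0.0           # estimated cost to regenerate or refetch
    reads: int = 0              # usage counter driving the compression choice


class Catalog:
    """In-memory, content-addressed catalogue (a sketch, not a real store)."""

    def __init__(self):
        self.entries = {}  # hex digest -> CatalogEntry
        self.blobs = {}    # hex digest -> raw bytes

    def put(self, data: bytes, reproducible: bool = False, cost: float = 0.0) -> str:
        # The address is derived from the content, so identical data
        # always maps to the same key regardless of who produced it.
        key = hashlib.sha256(data).hexdigest()
        if key in self.entries:
            # Assumption: a repeated put models another peer announcing a copy.
            self.entries[key].replicas += 1
        else:
            self.entries[key] = CatalogEntry(len(data), reproducible=reproducible, cost=cost)
            self.blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        self.entries[key].reads += 1
        return self.blobs[key]

    def compression_level(self, key: str, hot_reads: int = 10) -> int:
        # Frequently read data gets a fast, light level; cold data gets
        # the slowest, densest level. The threshold is arbitrary.
        return 1 if self.entries[key].reads >= hot_reads else 9

    def pack(self, key: str) -> bytes:
        # Compress for transfer using the level chosen from usage frequency.
        return zlib.compress(self.blobs[key], self.compression_level(key))
```

With content addressing, a receiving client can verify a transfer by hashing what it received and comparing against the key, which is similar in spirit to how BitTorrent checks pieces. Whether rarity should be tracked as a replica count like this, or inferred from job requirements, is still an open question.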