
Data cloud #2

@emphasis87

Description


Decide which techniques to use for transferring the data that jobs need between clients. BitTorrent seems promising, but I am not sure how fast initialization is, or how costly peer searching and the communication overhead are. Also check how Hadoop does it. How saturated should a mesh be? There should be some mechanism for labeling data with rarity, reproducibility, and cost; this could perhaps be derived, at least partly, from the requirements of jobs that have already run. How should data be addressed and catalogued? Also, NSQ uses snappy or deflate for message compression, but zstd may perform better, or the codec could be chosen based on how frequently the data is used.
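
One way to tie the addressing, rarity-labeling, and codec questions together would be content addressing plus per-entry usage counters. The sketch below is only illustrative, not a design decision: the `Entry` fields, the `hotThreshold`, the snappy-for-hot / zstd-for-cold split, and the `Rarity` priority are all assumptions to be validated against real traffic.

```go
package catalog

import (
	"crypto/sha256"
	"encoding/hex"
)

// Codec names a compression scheme for stored blobs.
type Codec string

const (
	CodecSnappy Codec = "snappy" // fast, weaker ratio: candidate for hot data
	CodecZstd   Codec = "zstd"   // slower, stronger ratio: candidate for cold data
)

// Entry is one catalogued piece of job data.
type Entry struct {
	Address     string // content address: hex SHA-256 of the raw bytes
	Size        int64
	Replicas    int   // how many peers currently hold it (rarity signal)
	AccessCount int64 // how often jobs have requested it
	Codec       Codec
}

// Address derives a content address, so identical data always maps to the
// same catalogue key regardless of which client produced it.
func Address(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// ChooseCodec picks a codec from usage frequency: frequently requested data
// favours cheap decompression, rarely touched data favours a better ratio.
// The threshold is an arbitrary placeholder.
func ChooseCodec(e Entry) Codec {
	const hotThreshold = 100
	if e.AccessCount >= hotThreshold {
		return CodecSnappy
	}
	return CodecZstd
}

// Rarity gives a simple replication priority: few replicas plus high demand
// means the data should be copied to more peers first.
func Rarity(e Entry) float64 {
	if e.Replicas == 0 {
		return 0 // nothing left to replicate from
	}
	return float64(e.AccessCount+1) / float64(e.Replicas)
}
```

A transfer layer (BitTorrent or otherwise) could then key pieces by `Address`, use `Rarity` to decide what to seed or re-replicate first, and recompress entries when their access pattern crosses the threshold.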
