Fine-tune download speed for xet + stream backpressure #1279

Open
@coyotte508

Currently we have a backpressure mechanism that buffers up to 1000 chunks. (I would love that limit to be a byte amount, but it's not easily doable due to incompatibilities between implementations of ReadableStream; it would need per-platform adaptations.)
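For reference, a chunk-count limit like this can be expressed with the standard `CountQueuingStrategy`. This is only a minimal sketch, not the actual code in the library: `chunkStream` and `nextChunk` are hypothetical names, and `nextChunk` is assumed to resolve with `null` once the file is exhausted.

```typescript
/**
 * Wrap a chunk producer in a ReadableStream whose internal queue holds at
 * most `maxQueuedChunks` chunks. `nextChunk` is a hypothetical callback that
 * resolves with the next chunk, or `null` when there is nothing left.
 */
function chunkStream(
  nextChunk: () => Promise<Uint8Array | null>,
  maxQueuedChunks = 1000,
): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>(
    {
      // pull() is only invoked while the queue is below the high-water mark,
      // so a slow consumer pauses the producer.
      async pull(controller) {
        const chunk = await nextChunk();
        if (chunk === null) {
          controller.close();
        } else {
          controller.enqueue(chunk);
        }
      },
    },
    // Counts chunks, not bytes: 1000 tiny chunks and 1000 large chunks
    // exert the same backpressure.
    new CountQueuingStrategy({ highWaterMark: maxQueuedChunks }),
  );
}
```

Note that `pull()` is called again only after the previous call has settled, so a stream built this way never has more than one request in flight, which is the bottleneck described below.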

But if the stream is consumed instantly (e.g. when writing to a file), we have a bottleneck because we don't make calls to the CAS service in parallel.

E.g. for someone with a download speed of 1GB/s, where each chunk CAS range is at most 64MB compressed, that's at least 16 HTTP requests/second. Assuming a 30ms ping, that's 16 × 30ms = 480ms of waiting for every second of transfer, so roughly a 33% download speed decrease just from doing the requests one after the other.

And it can be even worse with smaller ranges and higher bandwidths.
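To put numbers on this, here is a rough model of the loss from fetching ranges strictly one after the other. It is a sketch under simplifying assumptions: each request costs one fixed round trip before any bytes arrive, and the link runs at full speed otherwise (no TCP ramp-up, no server-side time). The function names are made up for illustration.

```typescript
const MiB = 1024 * 1024;
const GiB = 1024 * MiB;

/**
 * Fraction of throughput lost when each range is requested only after the
 * previous one has fully arrived (0 = no loss, 1 = nothing gets through).
 */
function sequentialLoss(
  bandwidthBytesPerSec: number,
  rangeBytes: number,
  latencySec: number,
): number {
  const transferSec = rangeBytes / bandwidthBytesPerSec;
  // Per range we spend latencySec waiting and transferSec downloading.
  return latencySec / (transferSec + latencySec);
}

/**
 * Number of requests that must be in flight so that, in this idealized
 * model, the waiting time of one request is covered by other transfers.
 */
function requestsToSaturate(
  bandwidthBytesPerSec: number,
  rangeBytes: number,
  latencySec: number,
): number {
  const transferSec = rangeBytes / bandwidthBytesPerSec;
  return Math.ceil((transferSec + latencySec) / transferSec);
}
```

With the numbers from this issue, `sequentialLoss(GiB, 64 * MiB, 0.03)` is about 0.32, and with 16MB ranges it rises to about 0.66. In the same model, `requestsToSaturate(GiB, 64 * MiB, 0.03)` is 2, so even a small amount of parallelism should recover most of the loss for large ranges.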

So we probably need to add some parallelism, and ideally be able to react to the download speed / latency.

For example, we could open multiple read streams in parallel and join them, adding a new read stream as an old one is consumed (see also https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/pipeTo)
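One possible shape for this, sketched with an async generator rather than `pipeTo` to keep it short: keep a fixed window of requests in flight, always hand out the oldest one first so the bytes stay in order, and start a new request each time one is consumed. `prefetchInOrder`, `RangeFetcher` and `maxInFlight` are hypothetical names, and the window size is fixed here rather than adapting to speed or latency.

```typescript
/** Hypothetical fetcher: resolves with the bytes of one CAS range. */
type RangeFetcher<R> = (range: R) => Promise<Uint8Array>;

/**
 * Yield range bodies in their original order while keeping up to
 * `maxInFlight` requests running ahead of the consumer.
 */
async function* prefetchInOrder<R>(
  ranges: Iterable<R>,
  fetchRange: RangeFetcher<R>,
  maxInFlight: number,
): AsyncGenerator<Uint8Array> {
  const iterator = ranges[Symbol.iterator]();
  const inFlight: Promise<Uint8Array>[] = [];

  // Start requests until the window is full or the ranges run out.
  const fill = (): void => {
    while (inFlight.length < maxInFlight) {
      const next = iterator.next();
      if (next.done) return;
      const request = fetchRange(next.value);
      // Avoid an unhandled rejection if this request fails before we reach
      // it; the error still surfaces when the promise is awaited below.
      request.catch(() => {});
      inFlight.push(request);
    }
  };

  fill();
  while (inFlight.length > 0) {
    // Await the oldest request so output order matches input order.
    const bytes = await inFlight.shift()!;
    // Replace the consumed request before handing the bytes over.
    fill();
    yield bytes;
  }
}
```

Because the generator only advances when the consumer asks for the next value, it still applies backpressure: at most `maxInFlight` ranges are requested or buffered ahead of the reader. With 64MB ranges that is a real memory cost, so the window should stay small.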

Labels: hub (@huggingface/hub related)
