thaw
Building open-source GPU state management for LLM inference. 17x faster cold starts via pipelined DMA, KV cache snapshots, and multi-GPU tensor parallelism.
- 1 follower
- United States of America
- https://thaw.sh
- company/thaw-ai
- nils@thaw.sh
Popular repositories
- vllm Public, forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
Repositories
- thaw Public
Git for live agent sessions: snapshot, branch, and diff a running vLLM/SGLang session as a durable file. Inspect and diff on a laptop without a GPU; restore skips prefill. Rust + CUDA, Apache-2.0. pip install thaw-vllm
- vllm Public, forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs