@thaw-ai

thaw

Building open-source GPU state management for LLM inference: 17x faster cold starts via pipelined DMA, KV cache snapshots, and multi-GPU tensor parallelism.

Popular repositories

  1. thaw (Public)

    git for live agent sessions: snapshot, branch, and diff a running vLLM/SGLang session as a durable file. inspect & diff on a laptop, no GPU; restore skips prefill. Rust + CUDA, Apache-2.0. pip inst…

    Python 6

  2. vllm (Public)

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python
