Pull requests: sgl-project/mini-sglang
[Fix] Fix OOM during weight loading with tensor parallelism
#93 opened Mar 4, 2026 by NikitosKh
Fix: torch.AcceleratorError: CUDA error: an illegal memory access was encountered
#89 opened Mar 1, 2026 by itechbear
perf: Optimize CUDA graph batch size selection and padding
#56 opened Dec 30, 2025 by louiswang524
feat: Implement batch tokenization for improved throughput
#55 opened Dec 30, 2025 by louiswang524
[Refactor] Restructure test suite to match source layout and isolate benchmarks
#53 opened Dec 29, 2025 by DhiraPT
[Feature] Add MLA configuration and KV cache storage kernel
#42 opened Dec 23, 2025 by DhiraPT
[Education] Offline benchmark performance of Qwen3-0.6B on MLX (CPU) and Modal (GPU)
#40 opened Dec 23, 2025 by lamng3
[Improvement] Enhance engine error handling and documentation add more logging and doc
#23 opened Dec 20, 2025 by louiswang524