Releases · intel/llm-scaler
llm-scaler-omni beta release 0.1.0-b3
Highlights
Resources
- Docker Image: intel/llm-scaler-omni:0.1.0-b3
What’s new
- omni:
  - More workflows supported (see the sketch after this list):
    - Hunyuan 3D 2.1
    - ControlNet on SD3.5, FLUX.1, etc.
    - Multi-XPU support for Wan 2.2 I2V 14B Rapid AIO
    - AnimateDiff Lightning
  - Add Windows installation support
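These workflows can also be driven headlessly. Below is a minimal Python sketch that posts an exported workflow JSON to a running ComfyUI instance via its standard /prompt endpoint; the port (ComfyUI's default 8188) and the file name are assumptions, not something this release pins down.

```python
import json
import urllib.request

# Minimal sketch: queue a workflow on a running ComfyUI instance.
# Port 8188 is ComfyUI's default; "workflow_api.json" stands in for
# any of the sample workflows (exported in API format).
with open("workflow_api.json") as f:
    workflow = json.load(f)

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # response includes the queued prompt_id
```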
llm-scaler-vllm beta release 0.10.2-b5
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:0.10.2-b5
What’s new
- vLLM:
  - Enable Qwen3-VL series models (see the sketch below)
  - Enable Qwen3-Omni series models
  - Add gpt-oss models support
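Since the image exposes vLLM's OpenAI-compatible server, a newly enabled model family like Qwen3-VL can be queried with the standard openai client. A minimal sketch, assuming a container serving on localhost:8000; the model name is a placeholder for whichever Qwen3-VL variant is actually loaded:

```python
from openai import OpenAI

# Assumption: a llm-scaler-vllm container is serving the
# OpenAI-compatible API on localhost:8000.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-8B-Instruct",  # placeholder served model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }],
)
print(response.choices[0].message.content)
```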
llm-scaler-omni beta release 0.1.0-b2
Highlights
Resources
- Docker Image: intel/llm-scaler-omni:0.1.0-b2
What’s new
- omni:
  - Fix issues:
    - Fix ComfyUI interpolate issue
    - Fix Xinference XPU index selection issue
  - Support more workflows:
    - ComfyUI:
      - Wan2.2-Animate-14B basic workflow
      - Qwen-Image-Edit 2509 workflow
      - VoxCPM workflow
    - Xinference:
      - Kokoro-82M-v1.1-zh
llm-scaler-vllm beta release 1.1-preview
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:1.1-preview (functionally equivalent to intel/llm-scaler-vllm:0.10.0-b2)
What’s new
- vLLM:
  - Bug fix for sym_int4 online quantization on multi-modal models
llm-scaler-omni beta release 0.1.0-b1
Highlights
Resources
- Docker Image: intel/llm-scaler-omni:0.1.0-b1
What’s new
- omni:
  - Integrated ComfyUI on XPU and provided sample workflows for:
    - Wan2.2 TI2V 5B
    - Wan2.2 T2V 14B (multi-XPU supported)
    - FLUX.1 dev
    - FLUX.1 Kontext dev
    - Stable Diffusion 3.5 Large
    - Qwen Image, Qwen Image Edit, etc.
  - Added support for xDiT, Yunchang, and Raylight usage on XPU.
  - Integrated Xinference with OpenAI-compatible APIs (see the sketch after this list) to provide:
    - TTS: Kokoro 82M
    - STT: Whisper Large v3
    - T2I: Stable Diffusion 3.5 Medium
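A minimal sketch of the OpenAI-compatible path, assuming the bundled Xinference server is listening on its default port 9997 and a Kokoro TTS model has already been launched; the model name and voice id below are placeholders:

```python
from openai import OpenAI

# Assumption: Xinference inside the omni image on its default
# port 9997, with a Kokoro model launched beforehand.
client = OpenAI(base_url="http://localhost:9997/v1", api_key="none")

speech = client.audio.speech.create(
    model="Kokoro-82M",                   # placeholder model name/UID
    input="Hello from llm-scaler-omni.",
    voice="default",                      # placeholder: voices depend on the model
)
speech.write_to_file("hello.wav")
```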
llm-scaler-vllm beta release 0.10.0-b4
llm-scaler-vllm beta release 0.10.0-b3
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:0.10.0-b3
What’s new
- vLLM:
  - Support Seed-OSS model
  - Add MinerU support
  - Enable MiniCPM-V-4_5
  - Fix internvl_3_5 and deepseek-v2-lite errors
llm-scaler-vllm beta release 0.10.0-b2
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:0.10.0-b2
What’s new
- vLLM:
  - Bug fix for sym_int4 online quantization on multi-modal models
llm-scaler-vllm beta release 0.10.0-b1
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:0.10.0-b1
What’s new
- vLLM:
  - Upgrade vLLM to version 0.10.0
  - Support async scheduling via the --async-scheduling option
  - Move embedding/reranker model support to the V1 engine
  - Support pipeline parallelism with the mp/ray backend (see the sketch after this list)
  - Enable InternVL3-8B model
  - Enable MiniCPM-V-4 model
  - Enable InternVL3_5-8B
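For the pipeline-parallelism item above, here is a rough sketch using upstream vLLM's offline API and its standard argument names (pipeline_parallel_size, distributed_executor_backend); the model is a placeholder and two visible devices are assumed:

```python
from vllm import LLM, SamplingParams

# Minimal sketch, assuming two devices are visible in the container.
# These kwargs are the offline-API counterparts of the serve-time
# CLI flags; the model name is a placeholder.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",
    pipeline_parallel_size=2,            # split layers across 2 devices
    distributed_executor_backend="ray",  # or "mp", per the notes above
)

outputs = llm.generate(["What is pipeline parallelism?"],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```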
llm-scaler-vllm beta release 0.9.0-b3
Highlights
Resources
- Docker Image: intel/llm-scaler-vllm:0.9.0-b3
What’s new
- vLLM:
  - Enable Whisper model (see the sketch after this list)
  - Enable GLM-4.5-Air
  - Optimize vLLM memory usage by updating the profile_run logic
  - Enable/optimize pipeline parallelism with the Ray backend
  - Enable GLM-4.1V-9B-Thinking for image input
  - Enable the dots.ocr model
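For the Whisper item above, a minimal sketch using vLLM's OpenAI-compatible transcription endpoint; the server address, the served model name, and the audio file are assumptions:

```python
from openai import OpenAI

# Assumption: a llm-scaler-vllm container serving Whisper on
# localhost:8000 via the OpenAI-compatible transcription API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

with open("sample.wav", "rb") as f:
    result = client.audio.transcriptions.create(
        model="openai/whisper-large-v3",  # placeholder served model name
        file=f,
    )
print(result.text)
```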