diff --git a/README.md b/README.md
index 8c3a0465..8717a19b 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@

- + Parallax

Trusted by Partners

 SGLang
@@ -8,16 +8,16 @@
 Qwen
 DeepSeek
 Kimi
- Minimax
- ZAI
+ MiniMax
+ ZAI

-[![license](https://img.shields.io/github/license/GradientHQ/parallax.svg)](https://github.com/GradientHQ/parallax/tree/main/LICENSE)
+[![license](https://img.shields.io/github/license/GradientHQ/parallax.svg)](https://github.com/GradientHQ/parallax/blob/main/LICENSE)
 [![issue resolution](https://img.shields.io/github/issues-closed-raw/GradientHQ/parallax)](https://github.com/GradientHQ/parallax/issues)
 [![open issues](https://img.shields.io/github/issues-raw/GradientHQ/parallax)](https://github.com/GradientHQ/parallax/issues)
-Parallax by Gradient - Host LLMs across devices sharing GPU to make your AI go brrr | Product Hunt
+Parallax by Gradient - Product Hunt Top Post
@@ -28,33 +28,41 @@
 | [**Discord**](https://discord.gg/parallaxai)
 | [**Arxiv**](https://arxiv.org/pdf/2509.26182v1)
 
-## News
-- [2026/2] 🦞 Parallax now supports OpenClaw integration! See [Docs](./docs/user_guide/work_with_openclaw.md)
-- [2025/10] 🔥 Parallax won #1 Product of The Day on Product Hunt!
-- [2025/10] 🔥 Parallax version 0.0.1 has been released!
+## Recent Updates
+
+- [2026/05] Expanded model support with Qwen3.6/Qwen3.5, GLM-5.1, MiniMax-M2.7, Step-3.5-Flash, and newer DeepSeek/Kimi/gpt-oss families.
+- [2026/05] Added Apple M5 family hardware detection for Mac workers.
+- [2026/02] Added OpenClaw integration. See [Working with OpenClaw](https://github.com/GradientHQ/parallax/blob/main/docs/user_guide/work_with_openclaw.md).
+- [2025/10] Parallax won #1 Product of The Day on Product Hunt.
+- [2025/10] Parallax version 0.0.1 was released.
 
 ## About
 
-A fully decentralized inference engine developed by [Gradient](https://gradient.network). Parallax lets you build your own AI cluster for model inference onto a set of distributed nodes despite their varying configuration and physical location. Its core features include:
-- **Host local LLM on personal devices**
-- **Cross-platform support**
-- **Pipeline parallel model sharding**
-- **Paged KV cache management & continuous batching for Mac**
+Parallax is a decentralized inference engine developed by [Gradient](https://gradient.network). It lets you build your own AI cluster for model inference across distributed nodes with different hardware and locations.
+
+Core features include:
+
+- **Host local LLMs on personal devices**
+- **Cross-platform support for Macs, Windows PCs, and GPU machines**
+- **Pipeline-parallel model sharding**
+- **Paged KV cache management and continuous batching for Mac**
 - **Dynamic request scheduling and routing for high performance**
 
-The backend architecture:
+Backend architecture:
 
-* P2P communication powered by [Lattica](https://github.com/GradientHQ/lattica)
-* GPU backend powered by [SGLang](https://github.com/sgl-project/sglang) and [vLLM](https://github.com/vllm-project/vllm)
-* MAC backend powered by [MLX LM](https://github.com/ml-explore/mlx-lm)
+- P2P communication powered by [Lattica](https://github.com/GradientHQ/lattica)
+- GPU backend powered by [SGLang](https://github.com/sgl-project/sglang) and [vLLM](https://github.com/vllm-project/vllm)
+- Mac backend powered by [MLX LM](https://github.com/ml-explore/mlx-lm)
 
 ## User Guide
 
-- [Installation](./docs/user_guide/install.md)
-- [Getting Started](./docs/user_guide/quick_start.md)
-- [Working with OpenClaw 🦞](./docs/user_guide/work_with_openclaw.md)
+- [Installation](https://github.com/GradientHQ/parallax/blob/main/docs/user_guide/install.md)
+- [Getting Started](https://github.com/GradientHQ/parallax/blob/main/docs/user_guide/quick_start.md)
+- [Working with OpenClaw](https://github.com/GradientHQ/parallax/blob/main/docs/user_guide/work_with_openclaw.md)
 
-## Quick Install
+## Quickstart
+
+Install from source on macOS or Linux:
 
 ```sh
 git clone https://github.com/GradientHQ/parallax.git
@@ -63,24 +71,92 @@ cd parallax
 source .venv/bin/activate
 ```
 
-The install script installs `uv` if needed, creates `.venv`, installs Parallax,
-and builds the `vllm-rs` frontend binary into `.venv/bin`. Use
-`./install.sh --extras gpu` on Linux/WSL GPU hosts or `./install.sh --extras mac`
-on Apple silicon macOS. For development dependencies, use `--extras gpu,dev` or
-`--extras mac,dev`.
+The install script creates `.venv`, installs Parallax, selects the default extras for your platform, and builds the `vllm-rs` frontend binary into `.venv/bin`. Use `./install.sh --extras gpu` on Linux/WSL GPU hosts or `./install.sh --extras mac` on Apple Silicon macOS.
 
-## Contributing
+Windows users can download [Parallax_Win_Setup.exe](https://github.com/GradientHQ/parallax_win_cli/releases/latest/download/Parallax_Win_Setup.exe), open Windows Terminal as administrator, and run:
+
+```powershell
+parallax install
+```
+
+Start the scheduler and setup UI:
+
+```sh
+parallax run
+```
+
+Open [http://localhost:3001](http://localhost:3001), choose a model and node count, then continue to the join screen.
+
+If you need a different model source or a Hugging Face mirror, set it on the scheduler up front:
+
+```sh
+# ModelScope
+parallax run --model-source modelscope --host 0.0.0.0
+
+# Hugging Face mirror
+parallax run --model-source huggingface --hf-endpoint https://hf-mirror.com --host 0.0.0.0
+```
+
+Join each worker node with the command shown in the UI. For a local-network cluster, this is usually:
+
+```sh
+parallax join
+```
 
-We warmly welcome contributions of all kinds! For guidelines on how to get involved, please refer to our [Contributing Guide](./docs/CONTRIBUTING.md).
+For nodes outside the same LAN, use the scheduler address shown in the UI or logs:
+
+```sh
+parallax join -s
+```
+
+The generated join command preserves any `--model-source` or `--hf-endpoint` settings chosen on the scheduler. If you launch workers manually, keep those flags aligned:
+
+```sh
+parallax join --model-source modelscope
+parallax join -s --model-source modelscope
+```
+
+Call the OpenAI-compatible API:
+
+```sh
+curl http://localhost:3001/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [{"role": "user", "content": "hello"}],
+    "max_tokens": 256,
+    "stream": true
+  }'
+```
+
+Parallax routes the request to the model selected in setup.
+
+## Dashboard
+
+Parallax includes a browser setup flow at `http://localhost:3001` for choosing a model, generating worker join commands, and testing the hosted model in chat.
+
+| Setup | Join workers | Chat |
+|:--:|:--:|:--:|
+| ![Model and node setup](./docs/images/node_config.png) | ![Node join command](./docs/images/node_join.png) | ![Chat interface](./docs/images/chat_interface.png) |
 
 ## Supported Models
 
-| | Provider | HuggingFace Collection | Blog | Description |
-|:-------------|:-------------|:----------------------------:|:----------------------------:|:----------------------------|
-|DeepSeek | Deepseek | [DeepSeek-V3.2](https://huggingface.co/deepseek-ai/DeepSeek-V3.2)<br>[DeepSeek-R1](https://huggingface.co/collections/deepseek-ai/deepseek-r1) | [Deep Seek AI Launches Revolutionary Language Model](https://deepseek.ai/blog/deepseek-v32) | Deep Seek AI is proud to announce the launch of our latest language model, setting new standards in natural language processing and understanding. This breakthrough represents a significant step forward in AI technology, offering unprecedented capabilities in text generation, comprehension, and analysis. |
-|MiniMax-M2 | MiniMax AI | [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2)<br>[MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1)<br>[MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7) | [MiniMax M2.1: Significantly Enhanced Multi-Language Programming](https://www.minimax.io/news/minimax-m21) | MiniMax-M2.7 and MiniMax-M2.1 are enhanced sparse MoE models (about 230B parameters, 10B active) built for advanced coding and agentic workflows. They offer state-of-the-art intelligence, delivering efficient, reliable tool use and strong multi-step reasoning. |
-|GLM | Z AI | [GLM-5](https://huggingface.co/zai-org/GLM-5)<br>[GLM-5.1](https://huggingface.co/zai-org/GLM-5.1) | [GLM-5.1 Overview](https://docs.z.ai/guides/llm/glm-5.1) | "GLM" is an advanced large language model series from Z AI, including GLM-5 and GLM-5.1. These models feature long-context support, strong coding and reasoning performance, enhanced tool-use and agent integration, and competitive results across leading open-source benchmarks. |
-|Kimi-K2 | Moonshot AI | [Kimi-K2](https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba60617d) | [Kimi K2: Open Agentic Intelligence](https://moonshotai.github.io/Kimi-K2/) | "Kimi-K2" is Moonshot AI's Kimi-K2 model family, including Kimi-K2-Base, Kimi-K2-Instruct and Kimi-K2-Thinking. Kimi K2 Thinking is a state-of-the-art open-source agentic model designed for deep, step-by-step reasoning and dynamic tool use. It features native INT4 quantization and a 256k context window for fast, memory-efficient inference. Uniquely stable in long-horizon tasks, Kimi K2 enables reliable autonomous workflows with consistent performance across hundreds of tool calls.
-|Qwen | Qwen | [Qwen3-Next](https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d)<br>[Qwen3](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f)<br>[Qwen2.5](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e)| [Qwen3-Next: Towards Ultimate Training & Inference Efficiency](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list) | The Qwen series is a family of large language models developed by Alibaba's Qwen team. It includes multiple generations such as Qwen2.5, Qwen3, and Qwen3-Next, which improve upon model architecture, efficiency, and capabilities. The models are available in various sizes and instruction-tuned versions, with support for cutting-edge features like long context and quantization. Suitable for a wide range of language tasks and open-source use cases. |
-|gpt-oss | OpenAI | [gpt-oss](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4)<br>[gpt-oss-safeguard](https://huggingface.co/collections/openai/gpt-oss-safeguard) | [Introducing gpt-oss-safeguard](https://openai.com/index/introducing-gpt-oss-safeguard/) | gpt-oss are OpenAI’s open-weight GPT models (20B & 120B). The gpt-oss-safeguard variants are reasoning-based safety classification models: developers provide their own policy at inference, and the model uses chain-of-thought to classify content and explain its reasoning. This allows flexible, policy-driven moderation in complex or evolving domains, with open weights under Apache 2.0. |
-|Meta Llama 3 | Meta | [Meta Llama 3](https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6)<br>[Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f)<br>[Llama 3.2](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf)<br>[Llama 3.3](https://huggingface.co/collections/meta-llama/llama-33-67531d5c405ec5d08a852000) | [Introducing Meta Llama 3: The most capable openly available LLM to date](https://ai.meta.com/blog/meta-llama-3/) | "Meta Llama 3" is Meta's third-generation Llama model, available in sizes such as 8B and 70B parameters. Includes instruction-tuned and quantized (e.g., FP8) variants. |
+Parallax supports a growing set of open model families. On Apple Silicon, many public Hugging Face IDs are mapped to MLX-optimized variants automatically. See [`src/backend/server/static_config.py`](https://github.com/GradientHQ/parallax/blob/main/src/backend/server/static_config.py) for the current model map.
+
+| Family | Example model IDs |
+|:--|:--|
+| Qwen | `Qwen/Qwen3-0.6B`, `Qwen/Qwen3-32B`, `Qwen/Qwen3-Next-80B-A3B-Instruct`, `Qwen/Qwen3.6-27B` |
+| DeepSeek | `deepseek-ai/DeepSeek-V3.2`, `deepseek-ai/DeepSeek-R1`, `deepseek-ai/DeepSeek-V3.1` |
+| Kimi-K2 | `moonshotai/Kimi-K2-Instruct`, `moonshotai/Kimi-K2-Instruct-0905`, `moonshotai/Kimi-K2-Thinking` |
+| MiniMax | `MiniMaxAI/MiniMax-M2`, `MiniMaxAI/MiniMax-M2.1`, `MiniMaxAI/MiniMax-M2.7` |
+| GLM / Z.ai | `zai-org/GLM-4.7`, `zai-org/GLM-4.7-Flash`, `zai-org/GLM-5.1` |
+| gpt-oss | `openai/gpt-oss-20b`, `openai/gpt-oss-120b`, `openai/gpt-oss-safeguard-20b` |
+| Llama | `nvidia/Llama-3.1-8B-Instruct-FP8`, `nvidia/Llama-3.3-70B-Instruct-FP8` |
+| StepFun | `stepfun-ai/Step-3.5-Flash` |
+
+## Contributing
+
+We warmly welcome contributions of all kinds. For guidelines on how to get involved, please refer to our [Contributing Guide](https://github.com/GradientHQ/parallax/blob/main/docs/CONTRIBUTING.md).
+
+## License
+
+Parallax is licensed under the [Apache 2.0 License](https://github.com/GradientHQ/parallax/blob/main/LICENSE).
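
The Quickstart added in this diff calls `/v1/chat/completions` with `"stream": true` but does not show how a client might read the response. The sketch below is not part of the patch. It assumes the endpoint follows the usual OpenAI streaming convention (server-sent `data: {...}` lines with text in `choices[0].delta.content`, terminated by `data: [DONE]`), which the diff does not confirm. The helper names `iter_deltas` and `stream_chat` are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Same URL as the curl example in the diff; adjust if the scheduler runs elsewhere.
API_URL = "http://localhost:3001/v1/chat/completions"


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed response format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt: str, url: str = API_URL, max_tokens: int = 256) -> Iterator[str]:
    """Send the same request body as the curl example and yield streamed text."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": True,
    }).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # The response object iterates over raw byte lines.
        yield from iter_deltas(line.decode("utf-8") for line in response)
```

With a scheduler running, `for piece in stream_chat("hello"): print(piece, end="", flush=True)` would print the reply as it arrives. Keeping the parsing in `iter_deltas` separate from the network call lets it be checked against canned lines without a cluster.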