Skip to content

docs: overhaul README — multi-device positioning, device-selection quickstart, runtime stack table#463

Open
iamagenius00 wants to merge 2 commits into
GradientHQ:mainfrom
iamagenius00:docs/readme-overhaul
Open

docs: overhaul README — multi-device positioning, device-selection quickstart, runtime stack table#463
iamagenius00 wants to merge 2 commits into
GradientHQ:mainfrom
iamagenius00:docs/readme-overhaul

Conversation

@iamagenius00
Copy link
Copy Markdown

What

A README overhaul that repositions Parallax around its full device story (Macs, Windows PCs, and GPU machines) and restructures the top of the doc for first-time users.

Why

The previous README led with a source git clone and surfaced Windows support only deep inside the Install section. New users land without an obvious "which path do I take for my machine" answer, and the model list — the main reason people evaluate Parallax — sat far down the page.

What's included

  • Repositioned tagline: "Distributed LLM serving across Macs, Windows PCs, and GPU machines" — promotes Windows to a first-class device alongside Mac and GPU hosts.
  • Device-selection Quickstart: step 1 is now a four-row table (Apple Silicon macOS / Linux GPU / Windows / Docker) mapping each device to its recommended install path, before falling back to the default source install.
  • Supported Models moved up (right after Quickstart) and refreshed: Qwen3.6/Qwen3.5 architecture, GLM-5.1, MiniMax-M2.7, Step-3.5-Flash, DeepSeek-V3.2, Kimi-K2 Thinking, gpt-oss safeguard.
  • New Runtime Stack table: pinned versions for SGLang, vLLM, vLLM Rust frontend, MLX-LM, MLX, Lattica, and Transformers, with what each is used for.
  • Expanded nav, Updates, and Requirements to reflect the above (Windows app guidance, runtime refresh notes).

Verification

All version numbers in the new Runtime Stack table and Recent Updates were checked against the current main:

  • sglang[all]==0.5.12, vllm==0.14.0, mlx-lm==0.31.3, mlx==0.31.2, lattica==1.0.21, transformers>=4.57.1, requires-python = ">=3.11,<3.14" — all match pyproject.toml.
  • vLLM Rust frontend v0.22.0 — matches VLLM_REF default in install.sh.
  • kernels<0.15 (GPU extra) and UV_PRERELEASE=allow — both present in source.

This PR touches README.md only.

@iamagenius00 iamagenius00 requested a review from a team May 31, 2026 15:26
Comment thread README.md Outdated
- Run one model across multiple worker nodes with pipeline-parallel layer sharding.
- Serve an OpenAI-compatible `/v1/chat/completions` endpoint from your own machines.
- Mix Apple Silicon Macs, Windows PCs, and Linux GPU hosts in the same cluster.
- Use GPU backends and the Mac MLX/oMLX path behind one API.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not wrap omlx

Comment thread README.md Outdated
curl http://localhost:3001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "parallax",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave model name as placeholder

Comment thread README.md Outdated
}'
```

For Qwen3 and gpt-oss style reasoning models, thinking is enabled by default. To disable it, add:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe remove this?

@iamagenius00 iamagenius00 force-pushed the docs/readme-overhaul branch from c4b3185 to c424714 Compare June 1, 2026 06:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants