mlx-engine - Apple MLX LLM Engine for LM Studio

Built with

  • mlx-lm - Apple MLX inference engine (MIT)
  • Outlines - Structured output for LLMs (Apache 2.0)
  • mlx-vlm - Vision model inferencing for MLX (MIT)

How to use in LM Studio

LM Studio 0.3.4 and newer for Mac ship with mlx-engine pre-bundled. Download LM Studio from https://lmstudio.ai


Standalone Demo

Prerequisites

  • macOS 14.0 (Sonoma) or newer
  • python3.11
    • The requirements.txt file is compiled specifically for python3.11, which is the Python version bundled within the LM Studio MLX runtime
    • brew install python@3.11 is a quick way to add python3.11 to your PATH without breaking your default Python setup

Install Steps

To run a demo of model load and inference:

  1. Clone the repository
     git clone https://github.com/lmstudio-ai/mlx-engine.git
     cd mlx-engine
  2. Create a virtual environment (optional)
     python3.11 -m venv .venv
     source .venv/bin/activate
  3. Install the required dependency packages
     pip install -U -r requirements.txt
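To sanity-check the environment, you can try importing the package from the repository root. This is an illustrative check, not part of the repo; it assumes the mlx_engine package directory is importable from the repo root (demo.py relies on the same layout):

# Run from the repository root, inside the activated virtual environment.
# Illustrative only: assumes `import mlx_engine` resolves from this directory.
python3.11 -c "import mlx_engine; print('mlx_engine imported successfully')"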

Text Model Demo

Download models with the lms CLI tool; documentation for lms can be found at https://lmstudio.ai/docs/cli. Then run the demo.py script with an MLX text generation model:

lms get mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
python demo.py --model mlx-community/Meta-Llama-3.1-8B-Instruct-4bit 

mlx-community/Meta-Llama-3.1-8B-Instruct-4bit - 4.53 GB

This command will use a default prompt. For a different prompt, pass a custom --prompt argument, for example:

lms get mlx-community/Mistral-Small-Instruct-2409-4bit
python demo.py --model mlx-community/Mistral-Small-Instruct-2409-4bit --prompt "How long will it take for an apple to fall from a 10m tree?"

mlx-community/Mistral-Small-Instruct-2409-4bit - 12.52 GB
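demo.py drives this engine, which is built on mlx-lm (see Built with above). For a rough picture of the underlying load-and-generate flow, here is a minimal standalone sketch using mlx-lm's load/generate helpers directly. Note this is illustrative, not mlx-engine's code path: mlx-lm resolves the model identifier through the Hugging Face cache, separately from models fetched with lms get, and the prompt and max_tokens values are arbitrary.

# minimal_generate.py -- illustrative sketch using mlx-lm directly, not demo.py.
# mlx-lm resolves/downloads the repo id via the Hugging Face hub cache.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
text = generate(
    model,
    tokenizer,
    prompt="How long will it take for an apple to fall from a 10m tree?",
    max_tokens=256,
)
print(text)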

Vision Model Demo

Run the demo.py script with an MLX vision model:

lms get mlx-community/pixtral-12b-4bit
python demo.py --model mlx-community/pixtral-12b-4bit --prompt "Compare these images" --images demo-data/chameleon.webp demo-data/toucan.jpeg

Currently supported vision models include:

  • Llama-3.2-Vision
    • lms get mlx-community/Llama-3.2-11B-Vision-Instruct-4bit
  • Pixtral
    • lms get mlx-community/pixtral-12b-4bit
  • Qwen2-VL
    • lms get mlx-community/Qwen2-VL-7B-Instruct-4bit
  • Llava-v1.6
    • lms get mlx-community/llava-v1.6-mistral-7b-4bit

Speculative Decoding Demo

Run the demo.py script with an MLX text generation model and a compatible --draft-model. In speculative decoding, the small draft model cheaply proposes tokens that the larger main model then verifies, which can speed up generation while matching the main model's output.

lms get mlx-community/Qwen2.5-7B-Instruct-4bit
lms get lmstudio-community/Qwen2.5-0.5B-Instruct-MLX-8bit
python demo.py \
    --model mlx-community/Qwen2.5-7B-Instruct-4bit \
    --draft-model lmstudio-community/Qwen2.5-0.5B-Instruct-MLX-8bit \
    --prompt "<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
Write a quick sort algorithm in C++<|im_end|>
<|im_start|>assistant
"

Development Setup

Pre-commit Hooks

We use pre-commit hooks to maintain code quality. Before contributing, please:

  1. Install pre-commit:
    pip install pre-commit && pre-commit install
  2. Run pre-commit:
    pre-commit run --all-files
  3. Fix any issues before submitting your PR

Testing

To run tests, run the following from the root of this repo:

python -m pip install pytest
python -m pytest tests/

To test specific vision models:

python -m pytest tests/test_vision_models.py -k pixtral

Attribution

Ernie 4.5 modeling code is sourced from Baidu.
