Chat with MLX is a high-performance macOS application that connects your local documents to a personalized large language model (LLM). By leveraging retrieval-augmented generation (RAG), open-source LLMs, and MLX for accelerated machine learning on Apple silicon, you can efficiently search, query, and interact with your documents without information ever leaving your device.
Our high-level features include:
- Query: load and search with document-specific prompts
- Converse: switch model interaction modes (converse vs. assist) in real time
- Instruct: provide personalization and response tuning
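To illustrate the RAG flow behind document queries, here is a minimal sketch: retrieve the chunk most relevant to a query, then prepend it to the prompt sent to the LLM. The bag-of-words embedding and the function names (`retrieve`, `build_prompt`) are illustrative only; the actual app uses a proper embedding model and an MLX-backed LLM.

```python
# Toy RAG sketch: bag-of-words retrieval + prompt construction.
# Illustrative only; not the app's real pipeline.
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    """Return the document chunk most similar to the query."""
    return max(chunks, key=lambda c: cosine(embed(query), embed(c)))

def build_prompt(query: str, chunks: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = retrieve(query, chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

chunks = [
    "MLX is an array framework for machine learning on Apple silicon.",
    "The contributing guide explains how to open pull requests.",
]
prompt = build_prompt("What is MLX?", chunks)
```

In the real application, the prompt built this way would be passed to the local model for generation, so no document content leaves the device.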
First, set up a Hugging Face access token to download models (request access to `google/gemma-7b-it`), then log in:

```shell
huggingface-cli login
```

Then install the npm and Python requirements:

```shell
cd app && npm install
pip install -r server/requirements.txt
```

Finally, start the application:

```shell
cd app && npm run dev
```

All contributions are welcome. Please take a look at the contributing guide.

