
SmolVLM Real-Time Webcam Demo using vLLM Backend

Real-time webcam demo using SmolVLM with a vLLM backend.

The app captures webcam images in the browser, sends them with a text prompt to a local vLLM server via its OpenAI-compatible API, and displays the model's visual-language response.
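Under the hood, each captured frame is base64-encoded and posted to the server's OpenAI-compatible chat completions endpoint. A rough sketch of such a request (assuming the vLLM server from the setup steps below is listening on its default port 8000; the exact prompt, parameters, and endpoint used by index.html may differ):

# Hypothetical request mirroring what the frontend sends.
# Replace <BASE64_JPEG> with a base64-encoded webcam frame.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "HuggingFaceTB/SmolVLM-500M-Instruct",
        "max_tokens": 100,
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "What do you see?"},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,<BASE64_JPEG>"}}
          ]
        }]
      }'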

How to Set Up

Follow these steps to get the real-time SmolVLM webcam demo running with a local vLLM server:


1. Clone the Repository

git clone https://github.com/yourusername/smolvlm-realtime-webcam-vllm.git
cd smolvlm-realtime-webcam-vllm

2. Set Up the vLLM Backend

vLLM Installation (GPU)

# (Recommended) Create a new conda environment.
conda create -n vllm python=3.12 -y
conda activate vllm

# Install vLLM
pip install vllm

Or see the full installation instructions in the vLLM documentation.

Start the vLLM Server

Run the vLLM server using the provided shell script:

bash vllm_backend.sh

Default model: HuggingFaceTB/SmolVLM-500M-Instruct

Tested models:

  • HuggingFaceTB/SmolVLM-500M-Instruct
  • HuggingFaceTB/SmolVLM-256M-Instruct
  • HuggingFaceTB/SmolVLM-Instruct

If you want to use a different model, pass the model name as an argument:

bash vllm_backend.sh your-org/your-model-name

Note ℹ️ Make sure the model is compatible with vLLM and supports the OpenAI Chat API format.

The script will automatically fall back to the default model if no argument is provided.
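For reference, vllm_backend.sh essentially wraps vLLM's OpenAI-compatible server. A minimal sketch of an equivalent script (an illustrative assumption; check the actual script in the repo for the exact flags it passes):

#!/usr/bin/env bash
# Sketch of a vLLM launcher: use the first argument as the model, or fall back to the default.
MODEL="${1:-HuggingFaceTB/SmolVLM-500M-Instruct}"

# Serve the model through vLLM's OpenAI-compatible API on the default port 8000.
vllm serve "$MODEL" --host 0.0.0.0 --port 8000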

3. Launch the Frontend

Open index.html in a browser and allow webcam access. Done!
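If your browser blocks webcam access when the page is opened directly from disk, serving it over localhost usually helps. One optional convenience (not required by the repo) is Python's built-in server:

# Serve the project directory locally, then open http://localhost:8080/index.html
python3 -m http.server 8080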

Tested on an RTX 4090.

