This web application, built on ONNX Runtime Web, implements YOLO's multi-task inference capabilities:
- 🔍 Object Detection - Precisely identify and locate various objects
- 👤 Pose Estimation - Track human keypoints and poses
- 🖼️ Instance Segmentation - Pixel-level object area identification
- ⚡ WebGPU Acceleration - Leverage the latest Web graphics API for enhanced performance
- 🧠 WASM (CPU) - Fall back to CPU execution on devices that don't support WebGPU
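The WebGPU-with-WASM fallback listed above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: `pickExecutionProviders` is a hypothetical helper, and it assumes that the presence of `navigator.gpu` is a sufficient signal for WebGPU support. The names `webgpu` and `wasm` are execution provider names accepted by ONNX Runtime Web.

```javascript
// Hypothetical helper: decide which ONNX Runtime Web execution providers to request.
// `nav` is a navigator-like object; passing it in keeps the function usable outside a browser.
function pickExecutionProviders(nav) {
  const hasWebGPU = Boolean(nav && nav.gpu);
  // Prefer WebGPU when the browser exposes it, and keep WASM (CPU) as the fallback.
  return hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

// In a browser this would be called as pickExecutionProviders(navigator).
const withGpu = pickExecutionProviders({ gpu: {} });
const withoutGpu = pickExecutionProviders({});
```

The returned array would typically be passed as the `executionProviders` option when creating an inference session.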
The application supports multiple input types for object detection:
Input Type | Format | Description | Use Case |
---|---|---|---|
📷 Image | JPG, PNG | Upload and analyze static images | 🔍 Single image analysis, batch processing |
📹 Video | MP4 | Upload and process video files | 🎬 Offline video analysis, content review |
📺 Live Camera | Real-time stream | Use device camera for live detection | 🚀 Real-time monitoring, interactive demos |
Model | Input Size | Params | mAP val 50-95 | Speed T4 TensorRT10 (ms) | Best For | License |
---|---|---|---|---|---|---|
YOLO11-N | 640 | 2.6M | 39.5 | 1.5 | 📱 Mobile devices & real-time applications | AGPL-3.0 (Ultralytics YOLO) |
YOLO11-S | 640 | 9.4M | 47.0 | 2.5 | 🖥️ Higher accuracy requirements | AGPL-3.0 (Ultralytics YOLO) |
YOLO12-N | 640 | 2.6M | 40.6 | 1.64 | 📱 Mobile devices & real-time applications | AGPL-3.0 (Ultralytics YOLO) |
YOLO12-S | 640 | 9.3M | 48.0 | 2.61 | 🖥️ Higher accuracy requirements | AGPL-3.0 (Ultralytics YOLO) |
- Clone this repository
git clone https://github.com/nomi30701/yolo-multi-task-onnxruntime-web.git
- cd to the project directory
cd yolo-multi-task-onnxruntime-web
- Install dependencies
yarn install
Start development server
yarn dev
Build the project
yarn build
To use a custom YOLO model, follow these steps:
Use Ultralytics or your preferred method to export your YOLO model to ONNX format. Be sure to set `opset=12` for WebGPU compatibility.
from ultralytics import YOLO
# Load your model
model = YOLO("path/to/your/model.pt")
# Export to ONNX
model.export(format="onnx", opset=12, dynamic=True)
You can either:
- 📁 Copy your ONNX model file to the `./public/models/` directory
- 🔄 Upload your model directly through the **Add model** button in the web interface
In App.jsx, add an option for your model to the model selector:
<label htmlFor="model-selector">Model:</label>
<select name="model-selector">
<option value="yolo12n">yolo12n-2.6M</option>
<option value="yolo12s">yolo12s-9.3M</option>
<option value="your-custom-model-name">Your Custom Model</option>
</select>
Replace `"your-custom-model-name"` with the filename of your ONNX model.
You have two options to define class labels for your custom model:
- Click the **Add Classes.json** button in the web interface
- Upload your custom `classes.json` file, OR
- Use the default COCO classes by selecting "Use Default Classes"
Alternatively, update the `src/utils/yolo_classes.json` file with the class names that your custom model uses. This file should contain a JSON object whose `"class"` key maps each class index (as a string) to its label.
For example:
{
"class": {
"0": "person",
"1": "bicycle",
"2": "car",
"3": "motorcycle",
"4": "airplane"
}
}
Make sure the classes match exactly with those used during training of your custom model.
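To illustrate the expected structure, here is a minimal sketch of how a parsed `classes.json` object could be checked and turned into an ordered list of labels. `parseClassLabels` is a hypothetical helper, not part of the application, and the requirement that indices run 0, 1, 2, ... without gaps is an assumption made for this example.

```javascript
// Hypothetical helper: turn the parsed classes.json object into an ordered array of labels.
function parseClassLabels(json) {
  if (!json || typeof json.class !== "object" || json.class === null) {
    throw new Error('classes.json must contain a top-level "class" object');
  }
  const indices = Object.keys(json.class).map(Number).sort((a, b) => a - b);
  return indices.map((index, position) => {
    const label = json.class[String(index)];
    // Assumption: indices are 0, 1, 2, ... with no gaps, and every label is a non-empty string.
    if (index !== position || typeof label !== "string" || label.length === 0) {
      throw new Error(`Invalid or missing label for class index ${position}`);
    }
    return label;
  });
}

const labels = parseClassLabels({
  class: { "0": "person", "1": "bicycle", "2": "car", "3": "motorcycle", "4": "airplane" },
});
```

With the example file above, `labels[2]` is `"car"`, so a predicted class index can be looked up directly in the array.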
💡 Tip: The web interface allows you to:
- 📤 Upload custom `classes.json` files for different models
- 🔄 Switch between default and custom class definitions
- ✅ Validate your class definitions before inference
🚀 WebGPU Support
Ensure you set `opset=12` when exporting ONNX models, as this is required for WebGPU compatibility.
The web application provides two options for handling input image sizes, controlled by the `imgsz_type` setting:
- Dynamic:
  - When selected, the input image is used at its original size without resizing.
  - Inference time may vary depending on the image resolution; larger images take longer to process.
- Zero Pad:
  - When selected, the input image is first padded with zero pixels to make it square (by adding padding to the right and bottom).
  - The padded image is then resized to 640x640 pixels.
  - This option provides a balance between accuracy and inference time, as it avoids extreme scaling while maintaining a predictable processing speed.
  - Use this option for real-time applications.
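The Zero Pad geometry described above can be sketched as follows. This is an illustration under stated assumptions rather than the application's implementation: `zeroPadGeometry` and `toOriginalCoords` are hypothetical names, and the box format (four pixel values in model-input coordinates) is assumed for the example. Because padding is added only to the right and bottom, the image stays anchored at the top-left corner, so mapping results back to the original image needs only a division by the scale factor.

```javascript
// Hypothetical helper: compute Zero Pad geometry for a source image.
function zeroPadGeometry(width, height, targetSize = 640) {
  // Pad to a square whose side is the longer edge; the image stays in the top-left corner.
  const side = Math.max(width, height);
  const padRight = side - width;
  const padBottom = side - height;
  // Uniform scale factor applied when the padded square is resized to targetSize x targetSize.
  const scale = targetSize / side;
  return { side, padRight, padBottom, scale };
}

// Map pixel values from model-input coordinates back to original-image pixels.
function toOriginalCoords(box, scale) {
  return box.map((value) => value / scale);
}

const geometry = zeroPadGeometry(1280, 720);
const box = toOriginalCoords([320, 180, 64, 32], geometry.scale);
```

For a 1280x720 frame, 560 rows of zero pixels are added at the bottom and the padded square is scaled by 0.5 to reach 640x640.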
✨ Dynamic input
This requires that the YOLO model was exported with `dynamic=True` to support variable input sizes.
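As a rough sketch of how the two `imgsz_type` options differ in the tensor shape fed to the model, consider the helper below. It is hypothetical: `inputShape` and the string values `"dynamic"` and `"zeroPad"` are placeholders rather than the application's actual identifiers, and the batch-channels-height-width layout is assumed from the usual Ultralytics ONNX export. Whether the application additionally rounds dimensions (for example to a multiple of the model stride) is not stated here, so this sketch does not.

```javascript
// Hypothetical helper: choose the model input tensor shape for a given imgsz_type.
// The values "dynamic" and "zeroPad" are placeholders, not the application's actual setting values.
function inputShape(imgszType, width, height, targetSize = 640) {
  if (imgszType === "dynamic") {
    // Original resolution; only valid for models exported with dynamic=True.
    return [1, 3, height, width];
  }
  if (imgszType === "zeroPad") {
    // Fixed square input after padding and resizing.
    return [1, 3, targetSize, targetSize];
  }
  throw new Error(`Unknown imgsz_type: ${imgszType}`);
}

const dynamicShape = inputShape("dynamic", 1280, 720);
const paddedShape = inputShape("zeroPad", 1280, 720);
```

A model exported without `dynamic=True` has a fixed input shape, so only the fixed-size path would apply to it.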