This Android application performs real-time wake word detection using ONNX models. The system integrates custom audio preprocessing, Mel-spectrogram generation, feature embedding, and wake word classification, all running directly on-device. The backend is implemented entirely in Java and optimized for the Android runtime.
- Real-Time Wake Word Detection: Detects custom wake words from live microphone input.
- ONNX Runtime Integration: Runs ONNX-based inference for Mel-spectrogram, embedding, and classification models.
- Custom Pipeline: Uses a modular Java pipeline: audio buffer → Mel-spectrogram → embedding → wake word prediction.
- YAMNet Classification: Optionally integrates TensorFlow Lite YAMNet to classify ambient audio.
- Thread-Safe Design: Runs audio recording, inference, and UI updates on separate threads so capture never stalls the interface (see the capture sketch after this list).
- Modular & Extensible: Easily replace models or adjust buffer/window configurations for different use cases.
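
A minimal capture loop for the first pipeline stage might look like the sketch below. It assumes the `RECORD_AUDIO` permission has already been granted (see the setup notes further down); the `AudioCaptureThread` class and the `onAudioFrame` hook are illustrative placeholders, not the project's actual API:

```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class AudioCaptureThread extends Thread {
    private static final int SAMPLE_RATE = 16000; // 16 kHz PCM, as the pipeline expects
    private static final int FRAME_SIZE = 1280;   // samples per sliding-buffer step (80 ms)

    private volatile boolean running = true;

    @Override
    public void run() {
        int minBufBytes = AudioRecord.getMinBufferSize(
                SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

        // Requires the RECORD_AUDIO runtime permission; construction fails without it.
        AudioRecord recorder = new AudioRecord(
                MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT,
                Math.max(minBufBytes, FRAME_SIZE * 2)); // 2 bytes per 16-bit sample

        short[] frame = new short[FRAME_SIZE];
        recorder.startRecording();
        try {
            while (running) {
                int read = recorder.read(frame, 0, FRAME_SIZE);
                if (read == FRAME_SIZE) {
                    // Hand the frame to the inference pipeline on this background
                    // thread; detection results reach the UI via callback.
                    onAudioFrame(frame);
                }
            }
        } finally {
            recorder.stop();
            recorder.release();
        }
    }

    public void shutdown() { running = false; }

    /** Placeholder hook; the real project wires this into its ONNX pipeline. */
    protected void onAudioFrame(short[] pcm) { }
}
```

Reading fixed 1280-sample frames (80 ms at 16 kHz) keeps capture aligned with the sliding buffer; each frame then flows through the pipeline below: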
```
AudioRecord (16 kHz PCM)
        ↓
1280-sample sliding buffer
        ↓
Mel-spectrogram (ONNX)
        ↓
Embedding model (ONNX)
        ↓
Wake word classifier (ONNX)
        ↓
Detection result (UI update via callback)
```
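
The diagram maps onto a chain of three `OrtSession` calls. The sketch below uses the ONNX Runtime Java API (`ai.onnxruntime`); the `WakeWordPipeline` class name, the flat `[1, N]` tensor shapes, and the single-probability output are assumptions for illustration, so check the real models' input/output signatures (e.g., with Netron) before reusing it:

```java
import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

import java.nio.FloatBuffer;
import java.util.Collections;

public class WakeWordPipeline {
    private final OrtEnvironment env = OrtEnvironment.getEnvironment();
    private final OrtSession melSession;
    private final OrtSession embeddingSession;
    private final OrtSession classifierSession;

    public WakeWordPipeline(byte[] melModel, byte[] embModel, byte[] clsModel) throws OrtException {
        melSession = env.createSession(melModel);
        embeddingSession = env.createSession(embModel);
        classifierSession = env.createSession(clsModel);
    }

    /** Runs one detection step on a window of PCM samples scaled to float. */
    public float detect(float[] audioWindow) throws OrtException {
        float[] mel = runStage(melSession, audioWindow);
        float[] embedding = runStage(embeddingSession, mel);
        float[] scores = runStage(classifierSession, embedding);
        return scores[0]; // assumed: a single wake-word probability
    }

    private float[] runStage(OrtSession session, float[] input) throws OrtException {
        String inputName = session.getInputNames().iterator().next();
        long[] shape = {1, input.length}; // assumed rank-2 [batch, features] input
        try (OnnxTensor tensor = OnnxTensor.createTensor(env, FloatBuffer.wrap(input), shape);
             OrtSession.Result result = session.run(Collections.singletonMap(inputName, tensor))) {
            // Assumes each model emits a rank-2 [1, N] float tensor.
            float[][] output = (float[][]) result.get(0).getValue();
            return output[0];
        }
    }
}
```

Chaining the stages through plain `float[]` arrays keeps each `OrtSession` independent, so any one model can be swapped out (per the modularity note above) without touching the other two.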
- Android Studio (latest stable version)
- Android device running Android 6.0 (API 23) or higher
- Internet access for downloading dependencies (e.g., ONNX Runtime Mobile)
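
ONNX Runtime Mobile (and TensorFlow Lite for the optional YAMNet path) are fetched from Maven Central through Gradle. The artifact coordinates below are the official ones, but the version numbers are placeholders; pin whichever stable releases are current:

```gradle
dependencies {
    // ONNX Runtime for Android: Mel-spectrogram, embedding, and classifier inference
    implementation 'com.microsoft.onnxruntime:onnxruntime-android:1.17.1'
    // TensorFlow Lite: only needed for the optional YAMNet sound classification
    implementation 'org.tensorflow:tensorflow-lite:2.14.0'
}
```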
- Place the following models in your `assets/` folder:
  - `melspectrogram.onnx`
  - `embedding_model.onnx`
  - `your_custom_wakeword_model.onnx` (e.g., `chaamiiya.onnx`)
  - `yamnet.tflite` and `yamnet_labels.txt` for sound classification
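
Files in `assets/` can be read into memory and handed straight to ONNX Runtime. A small helper along these lines (the `ModelLoader` name is illustrative):

```java
import android.content.Context;

import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public final class ModelLoader {
    private ModelLoader() {}

    /** Reads an asset (e.g., "melspectrogram.onnx") fully into a byte array. */
    public static byte[] readAsset(Context context, String name) throws IOException {
        try (InputStream in = context.getAssets().open(name);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Creates an ONNX Runtime session directly from an asset file. */
    public static OrtSession openSession(Context context, String name)
            throws IOException, OrtException {
        return OrtEnvironment.getEnvironment().createSession(readAsset(context, name));
    }
}
```

For example, `ModelLoader.openSession(context, "melspectrogram.onnx")` yields a ready-to-run session for the first pipeline stage.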
- Clone this repository.
- Open the project in Android Studio.
- Place the required .onnx and .tflite models in the assets/ folder.
- Build and run the app on a physical device or an emulator with microphone input enabled (microphone access is required; see the permission sketch after these steps).
- Speak the wake word and observe detection results in the UI.
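
Microphone access is a runtime permission on Android 6.0+ (API 23), so in addition to declaring `<uses-permission android:name="android.permission.RECORD_AUDIO" />` in the manifest, the app must request it before recording. A standard AndroidX request flow (the request code is arbitrary):

```java
import android.Manifest;
import android.content.pm.PackageManager;

import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;

public class MainActivity extends AppCompatActivity {
    private static final int REQ_RECORD_AUDIO = 1; // arbitrary request code

    private void ensureMicPermission() {
        if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
                != PackageManager.PERMISSION_GRANTED) {
            ActivityCompat.requestPermissions(
                    this, new String[]{Manifest.permission.RECORD_AUDIO}, REQ_RECORD_AUDIO);
        }
        // else: already granted, safe to start the capture thread immediately.
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (requestCode == REQ_RECORD_AUDIO
                && grantResults.length > 0
                && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            // Permission granted: safe to start the AudioRecord capture thread.
        }
    }
}
```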
- This project provides a fully on-device wake word detection pipeline using ONNX models; no server or internet connection is required after setup.
- The wake word model and embedding extractor are fully customizable: retrain them in Python, export to ONNX, and place the new files in `assets/`.