Reference implementation for using Hugging Face's (HF) `tokenizers` on Android.
The demo UI converts text to tokens in real time on Android via the `tokenizers` library. A recording is available at `demo/demo.mp4`.
- Find a model you want to test on HF, e.g., Google's `gemma-3-4b-it`
- Download the `tokenizer.json` and add it to `app/src/main/assets`, named `gemma-3-4b-it.json`
- Modify `SELECTED_TOKENIZER` in `app/build.gradle.kts`
- Run any of Hugging Face's (HF) tokenizers on-device on Android
- `rust` to `java` NDK bindings of HF's tokenizers in `rs-hfta`
- Use of JNI bindings between Rust and Android
- Parameterized instrumentation tests (run on-device)
- Compiler optimizations to reduce the library file size
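
The size optimizations mentioned above are normally configured in the release profile of the Rust crate's `Cargo.toml`. The project's actual settings are not listed here, so the fragment below is only a sketch of commonly used size-oriented options from Cargo's profile documentation; every value is an assumption.

```toml
# Cargo.toml (hypothetical values; the project's real profile may differ)
[profile.release]
opt-level = "z"     # optimize for binary size rather than speed
lto = true          # link-time optimization across all crates
codegen-units = 1   # single codegen unit: slower builds, more optimization
panic = "abort"     # no unwinding machinery; a panic aborts the process
strip = true        # strip symbols from the final binary
```

Note that `panic = "abort"` trades recoverability for size: a Rust panic would terminate the whole app process, so it may or may not suit a given build.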
Run any HF tokenizer on Android using the associated `tokenizer.json` from huggingface.co. To achieve this, the HF library is built with Rust into a shared library, which the Android app loads and calls through the Java Native Interface (JNI).
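
One detail of this setup is how the JVM finds a native function inside the shared library. Unless methods are registered explicitly with `RegisterNatives`, JNI looks up an exported symbol whose name is derived from the class and method name. The sketch below reproduces that name mangling in plain Rust. It is illustrative only: the class `com.example.hfta.Tokenizer` and the method `encode` used in the usage note are hypothetical, not taken from this project, and the signature suffix that JNI adds for overloaded native methods is omitted.

```rust
/// Escape one name component following the JNI native-method
/// name mangling rules (JNI specification, "Resolving Native Method Names").
fn mangle(component: &str) -> String {
    let mut out = String::new();
    for ch in component.chars() {
        match ch {
            // ASCII letters and digits are kept as-is.
            'a'..='z' | 'A'..='Z' | '0'..='9' => out.push(ch),
            // Package separators become a single underscore.
            '.' | '/' => out.push('_'),
            // A literal underscore is escaped so it cannot be
            // confused with a separator.
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            // Everything else is written as _0xxxx per UTF-16 code unit,
            // using lowercase hex (e.g. '$' becomes _00024).
            other => {
                let mut buf = [0u16; 2];
                for unit in other.encode_utf16(&mut buf).iter() {
                    out.push_str(&format!("_0{:04x}", *unit));
                }
            }
        }
    }
    out
}

/// Build the symbol name the JVM searches for, given a fully qualified
/// class name and a (non-overloaded) native method name.
fn jni_symbol(class: &str, method: &str) -> String {
    format!("Java_{}_{}", mangle(class), mangle(method))
}
```

For the hypothetical class above, `jni_symbol("com.example.hfta.Tokenizer", "encode")` yields `Java_com_example_hfta_Tokenizer_encode`, which is the name a Rust function would need to be exported under (with `#[no_mangle]` and the `system` ABI) for the default lookup to succeed.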
- Hugging Face's `tokenizers` library
- Qualcomm's Genie Library has a Rust to C++ static library implementation of HF's tokenizers at `qairt/2.34.0.250424/examples/Genie/Genie/src/qualla/tokenizers/rust`
- Shubham Panchal's `Sentence-Embeddings-Android`
- Rust's `profile` docs