This Rust crate is a binding for the sentencepiece unsupervised text tokenizer. The crate documentation is available online.
This crate depends on the sentencepiece C++ library. By default,
this dependency is treated as follows:
- If sentencepiece could be found with pkg-config, the crate will link against the library found through pkg-config. Warning: dynamic linking only works correctly with sentencepiece 0.1.95 or later, due to a bug in earlier versions.
- Otherwise, the crate's build script will do a static build of the sentencepiece library. This requires that cmake is available.
If you wish to override this behavior, the sentencepiece-sys crate
offers two features:
- system: always attempt to link to the sentencepiece library found with pkg-config.
- static: always do a static build of the sentencepiece library and link against that.
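
The selection rules described above can be summarized as a small decision function. This is only an illustrative sketch using the standard library, not the crate's actual build script: the names `Override`, `LinkStrategy`, and `choose_strategy` are hypothetical, and the error returned when system linking is requested but pkg-config finds nothing is an assumption, since the text only says that linking is attempted.

```rust
/// Hypothetical model of the feature selected on sentencepiece-sys.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Override {
    /// No feature enabled: use the default behavior.
    Default,
    /// The `system` feature.
    System,
    /// The `static` feature.
    Static,
}

/// Hypothetical outcome of the decision.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LinkStrategy {
    /// Link against the library found through pkg-config.
    PkgConfig,
    /// Build sentencepiece statically (requires cmake).
    StaticBuild,
}

/// Picks a link strategy from the selected override and from whether
/// pkg-config located sentencepiece.
fn choose_strategy(
    selected: Override,
    pkg_config_found: bool,
) -> Result<LinkStrategy, String> {
    match selected {
        // `static`: always build from source, ignoring pkg-config.
        Override::Static => Ok(LinkStrategy::StaticBuild),
        // `system`: always go through pkg-config.
        Override::System if pkg_config_found => Ok(LinkStrategy::PkgConfig),
        // Assumption: a failed lookup is reported as an error.
        Override::System => {
            Err("sentencepiece was not found with pkg-config".to_string())
        }
        // Default: prefer pkg-config, otherwise fall back to a static build.
        Override::Default if pkg_config_found => Ok(LinkStrategy::PkgConfig),
        Override::Default => Ok(LinkStrategy::StaticBuild),
    }
}

fn main() {
    for selected in [Override::Default, Override::System, Override::Static] {
        for found in [true, false] {
            println!(
                "{:?}, pkg-config found = {}: {:?}",
                selected,
                found,
                choose_strategy(selected, found)
            );
        }
    }
}
```

The sketch treats the two features as mutually exclusive; what happens when both are enabled at once is not covered here.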