Releases: openvinotoolkit/openvino_tokenizers

2024.1.0.2

10 May 09:42
c754503

What's Changed

Full Changelog: 2024.1.0.1...2024.1.0.2

2024.1.0.1

08 May 15:59
37d20ce

What's Changed

  • Llama3 Tokenizer Support
  • Add --not-add-special-tokens flag to the CLI conversion tool
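
As a rough illustration of how a negative flag like this usually behaves, here is a minimal argparse sketch. The parser below is hypothetical and is not the conversion tool's source; the flag spelling `--not-add-special-tokens` and the `add_special_tokens` destination name are assumptions based on the release note.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for a tokenizer conversion CLI (illustration only)."""
    parser = argparse.ArgumentParser(prog="convert_tokenizer_sketch")
    parser.add_argument("name", help="model id or local path (placeholder)")
    # Negative flag: special tokens are added unless the flag is passed.
    # store_false makes the default True and flips it to False when present.
    parser.add_argument(
        "--not-add-special-tokens",
        dest="add_special_tokens",
        action="store_false",
        help="do not add special tokens during tokenization",
    )
    return parser


default_args = build_parser().parse_args(["some-model"])
flagged_args = build_parser().parse_args(["some-model", "--not-add-special-tokens"])
```

With this shape, omitting the flag keeps the existing behavior, so scripts that already call the tool would not need to change.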

Full Changelog: 2024.1.0.0...2024.1.0.1

2024.1.0.0

25 Apr 13:04
ad37623

What's Changed

  • New operations:
    • TrieTokenizer
    • VocabEncoder
    • EqualStr
    • RaggedToSparse
    • RaggedToRagged
    • FuzeRagged
  • Update existing operations:
    • Add max_splits argument to RegexSplit
    • Add encoding argument to CaseFold
  • Add new TensorFlow translators and update existing ones for partial support of the TextVectorization layer.
  • RWKV tokenizer support.
  • New way to get OpenVINO Tokenizers: build from files. Supports the RWKV tokenizer.
  • Update the tokenizer operation caching mechanism to support OpenVINO model caching
  • SentencePiece tokenizer changes and fixes:
    • Update to version 0.2.0
    • Use constant 0 as mask hide token by @as-suvorov in #90
    • SentencePiece BOS token detection
  • Fix multi-input model merging by @yas-sim in #53
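
To give a feel for what a split limit such as the new max_splits argument typically means, the sketch below uses Python's standard `re.split` as an analogy. It does not call the RegexSplit operation. The helper name `split_with_limit` and the convention that a negative value means "no limit" are assumptions made for illustration.

```python
import re


def split_with_limit(text: str, pattern: str, max_splits: int = -1) -> list[str]:
    """Split text on a regex, performing at most max_splits splits.

    Assumed convention (not confirmed by the release notes):
    a negative max_splits means unlimited.
    """
    if max_splits == 0:
        # No splits allowed: return the input untouched.
        return [text]
    # re.split uses maxsplit=0 for "unlimited", so map negatives onto 0.
    maxsplit = 0 if max_splits < 0 else max_splits
    return re.split(pattern, text, maxsplit=maxsplit)


unlimited = split_with_limit("a b c d", r"\s+")
limited = split_with_limit("a b c d", r"\s+", max_splits=2)
```

Here `limited` stops after two splits and leaves the remainder `"c d"` as one piece, which is the usual reason to cap splitting: the tail of the input stays intact for a later processing step.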

Full Changelog: 2024.0.0.0...2024.1.0.0

2024.0.0.0

21 Mar 14:38
aa0587d

What's Changed

  • Improve regex support: filter out lookarounds, which re2 does not support
  • Improve model coverage: T5 tokenizers, QWEN2
  • Add tokenizer metadata to rt_info: EOS token id
  • Support TensorFlow Text MUSE model conversion and inference
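
The lookaround item can be illustrated with a small detector. re2 rejects lookahead and lookbehind constructs, so patterns such as the `\s+(?!\S)` fragment found in some pre-tokenizer regexes need special handling before they reach it. The sketch below only finds lookaround openers in a pattern string. It is not the library's filtering logic, the helper name `find_lookarounds` is hypothetical, and it does not account for openers that appear inside character classes.

```python
import re

# First alternative consumes any escaped character (so "\(" is skipped);
# the captured alternative matches the four lookaround openers:
#   (?=  (?!  (?<=  (?<!
_LOOKAROUND = re.compile(r"\\.|(\(\?(?:=|!|<=|<!))")


def find_lookarounds(pattern: str) -> list[str]:
    """Return the lookaround openers found in a regex pattern, in order."""
    return [m.group(1) for m in _LOOKAROUND.finditer(pattern) if m.group(1)]


with_lookahead = find_lookarounds(r"\s+(?!\S)|\s+")
without = find_lookarounds(r"(?i:'s|'t)|\p{L}+")
```

A converter could use a check like this to decide whether a pattern can be passed through unchanged or needs rewriting first.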
