Code Review
This pull request updates the llama.cpp dependency to version b8851 and introduces support for Gemma 4 models. Key architectural changes include refactoring the Llama tool-calling parser to use a stateful C++ PEG parser context and updating the MLX client to use MLXHuggingFace for tokenizer loading. Feedback focuses on a high-severity thread-safety issue where the shared PEG parser arena in LlamaClient could cause race conditions during concurrent generations, as well as a recommendation to use stricter versioning for the swift-jinja dependency to avoid breaking changes.
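The thread-safety concern can be illustrated with a minimal sketch. The types below (`PegParserContext`, `GenerationSession`, `on_token`) are hypothetical and not taken from this PR; the point is the pattern the review recommends: give each generation its own parser context rather than sharing one arena, and serialize any access that must be shared.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a stateful PEG parser context; the real type
// in the PR lives in C++ and accumulates partial tool-call parse state.
struct PegParserContext {
    std::vector<std::string> chunks;          // partial parse state
    void feed(const std::string& chunk) { chunks.push_back(chunk); }
};

// One generation, one context: concurrent generations never touch each
// other's arena. The mutex guards the context against callbacks arriving
// on different threads within the same generation.
class GenerationSession {
public:
    void on_token(const std::string& chunk) {
        std::lock_guard<std::mutex> lock(mutex_);  // serialize mutation
        context_.feed(chunk);
    }
    std::size_t chunks() {
        std::lock_guard<std::mutex> lock(mutex_);
        return context_.chunks.size();
    }
private:
    std::mutex mutex_;
    PegParserContext context_;  // per-session, never shared
};
```

With this shape, two overlapping generations each own a `GenerationSession`, so the race the review flags cannot occur regardless of callback timing.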
Force-pushed from 8d6a2b4 to 559f51b
- Added symlinks for new headers: unicode.h, mtmd-debug.h, jinja, nlohmann, and module.modulemap.
- Updated the llama.cpp submodule to a new commit.
- Enhanced utils.h and utils.cpp with new functions for handling chat parameters and tool inputs.
- Removed the obsolete minja symlink.
- Introduced new mtmd-image files for improved image handling.
- Updated Context.swift and Utility.swift for better integration with MLX.
- Modified glob patterns in Globs.swift to include the new jinja files.
- Removed the outdated LlamaToolCallParserTests and refactored LlamaToolTests for improved functionality.
- Added new tests for message processing and tool-call parsing.
- Updated update_dependencies.sh to verify symlink integrity after submodule updates.
The tests pass in my local environment, but for some reason they seem to fail on GitHub Actions. I would like to investigate this issue separately. Also, if anyone can fix it, please feel free to send a PR.
Wow, that was fast! 🚀 Thanks for the quick turnaround on supporting Gemma 4. Amazing work!
Support Gemma 4