Add an optional HF persistent cache configuration #55
base: main
Conversation
Pull Request Overview
This PR adds optional persistent Hugging Face model caching support to prevent models from being re-downloaded on every script execution, particularly useful for manual testing scenarios. The implementation creates a Docker volume for cache persistence when configured and passes the cache directory to model execution scripts.
Key Changes:
- Added logic to check for and utilize the `persistent_hf_cache_dir` configuration from context
- Implemented Docker volume creation and mounting for persistent HF cache storage
- Refactored model execution command building to append the `--hf_cache_dir` argument when persistent cache is enabled
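A minimal sketch of how these pieces could fit together follows; the helper name, volume name, and context key are illustrative assumptions rather than the PR's actual code.

```python
import subprocess

def build_docker_options(ctx: dict, docker_options: str = "") -> str:
    """Mount a named Docker volume for the HF cache when the context requests one."""
    persistent_hf_cache_dir = ctx.get("persistent_hf_cache_dir")
    if persistent_hf_cache_dir:
        volume_name = "hf_models_cache_volume"  # assumed name, mirroring the PR description
        # `docker volume create` is idempotent: it succeeds if the volume already exists.
        subprocess.run(["docker", "volume", "create", volume_name], check=True)
        docker_options += f" -v {volume_name}:{persistent_hf_cache_dir} "
    return docker_options
```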
```python
docker_options += self.get_mount_arg(mount_datapaths)

if persistent_hf_cache_dir:
    docker_options += f" -v {hf_models_cache_volume_name}:{persistent_hf_cache_dir} "
```
Copilot AI (Nov 21, 2025):
The persistent_hf_cache_dir value from context is used directly in the Docker mount without validation. This could lead to arbitrary directory mounting if the context contains malicious input. Add validation to ensure the path is safe and doesn't contain dangerous characters or paths like /, /etc, or similar sensitive directories.
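A validation helper along these lines could address this; the function name, the blocked-path list, and how it would hook into the existing code are assumptions for illustration only.

```python
from pathlib import PurePosixPath

# Hypothetical deny-list of mount targets that should never be overlaid.
_SENSITIVE_MOUNT_TARGETS = {"/", "/etc", "/bin", "/usr", "/var", "/root"}

def validate_hf_cache_dir(persistent_hf_cache_dir: str) -> str:
    """Reject relative paths, path traversal, and mounts over sensitive directories."""
    path = PurePosixPath(persistent_hf_cache_dir)
    if not path.is_absolute():
        raise ValueError(f"HF cache dir must be absolute: {persistent_hf_cache_dir!r}")
    if ".." in path.parts:
        raise ValueError(f"HF cache dir must not contain '..': {persistent_hf_cache_dir!r}")
    if str(path) in _SENSITIVE_MOUNT_TARGETS:
        raise ValueError(f"Refusing to mount over {persistent_hf_cache_dir!r}")
    return str(path)
```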
```python
model_args = self.context.ctx.get("model_args", info["args"])

if persistent_hf_cache_dir:
    model_args += f" --hf_cache_dir {persistent_hf_cache_dir} "
```
Copilot AI (Nov 21, 2025):
The persistent_hf_cache_dir is concatenated directly into the command string without shell escaping. If the path contains spaces or special characters, it could break the command or introduce command injection vulnerabilities. Use proper shell escaping (e.g., shlex.quote()) when building the command string.
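A hedged sketch of the suggested escaping; the helper is illustrative, but `shlex.quote()` is the standard-library call the comment refers to.

```python
import shlex
from typing import Optional

def append_hf_cache_arg(model_args: str, persistent_hf_cache_dir: Optional[str]) -> str:
    """Append --hf_cache_dir with the path quoted so spaces or shell metacharacters stay in one argument."""
    if persistent_hf_cache_dir:
        model_args += f" --hf_cache_dir {shlex.quote(persistent_hf_cache_dir)}"
    return model_args

# Example: a path containing a space survives as a single argument.
print(append_hf_cache_arg("python run_model.py", "/data/hf cache"))
# -> python run_model.py --hf_cache_dir '/data/hf cache'
```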
LGTM
Motivation
This PR introduces support for persistent Hugging Face model caching within Docker containers. This prevents models from being re-downloaded every time the script runs, which is especially useful during manual testing on private machines from Conductor. By allowing cache reuse, it reduces network overhead and speeds up repeated test cycles.
Technical Details
- Checks for `persistent_hf_cache_dir` in the context.
- Creates and mounts a Docker volume (`hf_models_cache_volume`) when `persistent_hf_cache_dir` is provided.
- Appends the `--hf_cache_dir` argument when persistent cache is enabled.

Should be merged after:
Test Plan
Validated with manual job: https://mlseqa-ci.amd.com:9090/view/Experimental/job/Test-MAD-akochin/118/
Test Result
Submission Checklist