Fix self-host DatasetMeta serialization and offline base-model alias resolution #231
kevssim wants to merge 2 commits
Code Review
This pull request introduces support for model ID aliases by adding a new `model_alias` module, integrating alias resolution into the model downloading process, and propagating the alias mapping via environment variables. It also refactors `DatasetMeta` serialization to dynamically filter fields. The review feedback highlights several optimization opportunities: caching the dataclass fields as a module-level constant to avoid dynamic lookup overhead, using `strip('/')` to robustly handle trailing slashes in route prefixes, and avoiding redundant dictionary copies when resolving aliases.
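The field filtering and the module-level caching suggested by the review could look roughly like the sketch below. It is not the code from this PR: the `DatasetMeta` here is a hypothetical stand-in with made-up fields, and `serialize_meta`, `deserialize_meta`, and `_DATASET_META_FIELDS` are illustrative names.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class DatasetMeta:
    # Hypothetical stand-in; the real DatasetMeta declares different fields.
    dataset_id: str
    subset_name: str = "default"
    split: str = "train"

    def __post_init__(self) -> None:
        # Runtime-only attribute: stored in __dict__ but not a declared field.
        self._uid = id(self)


# Declared field names, computed once at import time rather than per call.
_DATASET_META_FIELDS = frozenset(f.name for f in fields(DatasetMeta))


def serialize_meta(meta: DatasetMeta) -> Dict[str, Any]:
    # Emit only declared dataclass fields, so _uid never reaches the wire.
    return {name: getattr(meta, name) for name in _DATASET_META_FIELDS}


def deserialize_meta(payload: Dict[str, Any]) -> DatasetMeta:
    # Drop unknown or private keys so stale payloads with _uid still load.
    clean = {k: v for k, v in payload.items() if k in _DATASET_META_FIELDS}
    return DatasetMeta(**clean)
```

Filtering on both sides means a newer server can still accept payloads from an older client that serializes `__dict__`.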
PR type
PR information
This PR fixes two self-host client/server issues.
`DatasetMeta` HTTP serialization failure.

`DatasetMeta` was serialized from `__dict__`, which included the runtime-only `_uid` field. When the processor service deserialized that payload and rebuilt `DatasetMeta`, server-side processor creation failed with an error. This PR changes the serialization path to include only declared dataclass fields and filters unexpected or private fields during deserialization. That fixes both newly generated payloads and stale payloads that may still contain `_uid`.

Offline base-model alias resolution.

Previously, `dataset.set_template(..., model_id='Qwen/Qwen3.5-4B')` failed in offline self-host deployments because template initialization treated `model_id` as a real local path or a hub model id. In practice, the client may only know the public model name, while the server is configured to load the actual model from a local path such as `/nas/disk1/Qwen3-8B`.

This PR adds server-side model alias resolution:

- Build a `public model name -> real model_id/path` alias map from the model deployment config.
- Propagate the map through `TWINKLE_MODEL_ID_ALIASES`.
- Resolve aliases in `HubOperation.download_model()` before local path or hub resolution.

With this change, client code can continue using the public model name in calls such as `dataset.set_template(...)`, as long as the server deployment is configured with a matching route and real local model path.
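The alias flow described above can be sketched as follows. This is an illustration under assumptions, not the contents of `model_alias.py`: the deployment entry shape (`route_prefix`, `model_id`), the derivation of the public name from the route prefix, the JSON encoding of the environment variable, and all function names are hypothetical. Only the variable name `TWINKLE_MODEL_ID_ALIASES` comes from the PR description.

```python
import json
import os
from typing import Dict, Iterable, Mapping, MutableMapping

ALIAS_ENV_VAR = "TWINKLE_MODEL_ID_ALIASES"


def build_alias_map(deployments: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map public model names to the real model id or local path."""
    aliases: Dict[str, str] = {}
    for dep in deployments:
        # Assumption: the public name is the route prefix without slashes
        # at either end, so "/Qwen/Qwen3.5-4B/" and "Qwen/Qwen3.5-4B" agree.
        public_name = dep["route_prefix"].strip("/")
        if public_name:
            aliases[public_name] = dep["model_id"]
    return aliases


def export_aliases(
    aliases: Mapping[str, str],
    env: MutableMapping[str, str] = os.environ,
) -> None:
    # Assumption: the map is passed to worker processes as a JSON string.
    env[ALIAS_ENV_VAR] = json.dumps(dict(aliases))


def resolve_model_id(
    model_id: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the aliased path if one is configured, else model_id unchanged."""
    raw = env.get(ALIAS_ENV_VAR)
    if not raw:
        return model_id
    return json.loads(raw).get(model_id, model_id)
```

Resolving the alias first, and falling back to the unchanged `model_id`, keeps existing local-path and hub lookups working for names that have no alias.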
Files changed:
- `src/twinkle_client/common/serialize.py`
- `src/twinkle/hub/model_alias.py`
- `src/twinkle/hub/hub.py`
- `src/twinkle/hub/__init__.py`
- `src/twinkle/server/launcher/server_launcher.py`
- `src/twinkle/server/launcher/env_propagation.py`