Migrate backend from raw REST API to PythonCall.jl + CondaPkg.jl #79
Description
Historically, MLFlowClient.jl has operated as a pure Julia wrapper around the MLflow REST API. However, keeping pace with MLflow's upstream development has become unsustainable.
While MLflow technically provides an OpenAPI/Swagger specification, its backend relies heavily on Protocol Buffers. In practice, this means the REST API is prone to undocumented changes, volatile internal endpoints (e.g., /ajax-api/), and inconsistent payload typing. Maintaining a manual HTTP/JSON mapping against a fast-moving, fundamentally Python-first ecosystem requires constant reverse-engineering and patching, which is no longer feasible for this package's long-term health.
I propose we move the core architecture of MLFlowClient.jl to act as an idiomatic Julia bridge to the official Python mlflow client, utilizing PythonCall.jl paired with CondaPkg.jl.
Workflow:
- CondaPkg.jl will automatically provision an isolated, package-managed Python environment and install the mlflow package the first time MLFlowClient.jl is loaded. The user does not need to manage Python themselves
- PythonCall.jl will handle data conversion between Julia types and the underlying Python library
- End-users will still write pure Julia code (e.g., using do blocks instead of Python's with statements)
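The workflow above could look roughly like the following. This is a minimal sketch, not a proposed final API: the module name MLFlowBridgeSketch and the functions with_run, log_param, and log_metric are hypothetical, and it assumes a CondaPkg.toml next to Project.toml that lists mlflow under [deps]. The PythonCall functions used (pynew, pycopy!, pyimport, pywith, pyconvert) and the Python calls mlflow.start_run, mlflow.log_param, and mlflow.log_metric are existing APIs.

```julia
# Assumed CondaPkg.toml (version constraint left open here):
#   [deps]
#   mlflow = ""
module MLFlowBridgeSketch  # hypothetical name, for illustration only

using PythonCall

# Empty placeholder for the Python `mlflow` module. It is filled in at load
# time so the Julia module can be precompiled without a live Python handle.
const mlflow = PythonCall.pynew()

function __init__()
    PythonCall.pycopy!(mlflow, pyimport("mlflow"))
end

# Hypothetical wrapper: run `f` inside an MLflow run, mirroring Python's
# `with mlflow.start_run(...) as run:`. `pywith` calls the context manager's
# enter/exit methods around `f`.
function with_run(f; run_name=nothing)
    pywith(mlflow.start_run(run_name=run_name)) do run
        f(run)
    end
end

# Thin hypothetical wrappers that hide the Python objects from the caller.
log_param(key::AbstractString, value) = (mlflow.log_param(key, string(value)); nothing)
log_metric(key::AbstractString, value::Real) = (mlflow.log_metric(key, value); nothing)

end # module
```

From the user's side this reads as ordinary Julia:

```julia
using .MLFlowBridgeSketch: with_run, log_param, log_metric
using PythonCall: pyconvert

run_id = with_run(run_name="demo") do run
    log_param("learning_rate", 0.01)
    log_metric("loss", 0.42)
    pyconvert(String, run.info.run_id)  # convert the Python str to a Julia String
end
```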
Considerations
Pros
- Parity: When MLflow releases new features (like recent LLM tracking or Model Registry updates), Julia users get access to them immediately without waiting for us to reverse-engineer the REST payloads
- Artifact logging: Uploading complex artifacts or models becomes significantly easier than with raw HTTP multipart/form-data POST requests (a common issue in the history of the package)
- Reduction in tech debt: Less JSON serialization and HTTP boilerplate code to maintain
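To illustrate the artifact-logging point, here is a hedged sketch of how an upload might be delegated to the Python client instead of building a multipart request by hand. The helper name log_text_artifact is hypothetical. mlflow.log_artifact(local_path, artifact_path=None) is the existing Python function, which logs to the active run (and starts one if none is active).

```julia
using PythonCall

# Script-style import; inside a package this would live in `__init__`.
mlflow = pyimport("mlflow")

# Hypothetical helper: write `contents` to a temporary file and let the
# Python client handle the upload to the configured tracking server.
function log_text_artifact(filename::AbstractString, contents::AbstractString;
                           artifact_path=nothing)
    mktempdir() do dir
        path = joinpath(dir, filename)
        write(path, contents)
        # `nothing` is passed to Python as None, the function's default.
        mlflow.log_artifact(path, artifact_path=artifact_path)
    end
    return nothing
end

log_text_artifact("notes.txt", "trained with default settings")
```

The Julia side only prepares a local file. Transport details such as the storage backend and request encoding stay inside the Python client.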
Cons
- No longer pure Julia: Python becomes a dependency
- Startup latency: The Python environment will increase the package initialization time