fix(python): cap substrait below 0.85.0 #7153
Signed-off-by: Xu Che <chrisxuche@gmail.com>
I found an important mistake in this PR, and I want to correct it here. I was misled by the way the Substrait Python packaging is split. I also only did a quick check in an existing project before opening the PR, rather than testing the install in a clean environment from scratch, so I missed the problem. Most of my comments in the issue/PR are still correct:
However, I misunderstood how Substrait is packaged:
So this PR was wrong. I wrote the bound as if the PyPI [...]

I have now tested the new fix in a clean virtual environment, and I confirmed that this corrected bound works. For this issue, I will send a follow-up PR to correct the dependency version bound.
## Summary

I made a mistake in #7153; for details about the mistake, see this comment: #7153 (comment)

This PR correctly fixes the `substrait-python` dependency problem.

## Testing

- In a clean Python 3.12 virtual environment, `pip install vortex-data==0.64.0` resolved to `substrait==0.29.0`, `substrait-protobuf==0.85.0`, and `substrait-extensions==0.85.0`, and `import vortex` failed.
- In a clean Python 3.12 virtual environment, installing `vortex-data` from this PR branch resolved to `substrait==0.28.0`, `substrait-protobuf==0.79.0`, and `substrait-extensions==0.79.0`, and `import vortex` succeeded.

Signed-off-by: Xu Che <chrisxuche@gmail.com>
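The version numbers in the test results above suggest why a cap written as `substrait<0.85.0` would not have excluded the failing install. The sketch below is only an illustration: it assumes the cap is compared against the `substrait` distribution's own version number, and it uses a deliberately simplified dotted-integer comparison rather than a full PEP 440 parser. The helper names `parse_version` and `below_cap` are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as "0.29.0" into a tuple of ints.

    Simplified on purpose: no pre-releases, epochs, or local versions.
    """
    return tuple(int(part) for part in text.split("."))


def below_cap(candidate: str, cap: str) -> bool:
    """Return True if `candidate` satisfies an exclusive upper bound `<cap`."""
    return parse_version(candidate) < parse_version(cap)


# Versions taken from the failing install reported in the Testing section.
print(below_cap("0.29.0", "0.85.0"))  # `substrait` distribution: True
print(below_cap("0.85.0", "0.85.0"))  # `substrait-protobuf` distribution: False
```

Under these assumptions, `substrait==0.29.0` still satisfies a `<0.85.0` cap, while the 0.85.0 version number belongs to the separately versioned `substrait-protobuf` and `substrait-extensions` distributions.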
Summary
Fixes #7152.
In Substrait 0.85.0, `SimpleExtensionURI` was removed. The Vortex Python bindings have not been updated to use the new implementation, and they also do not set an appropriate upper bound on the dependency version.

This change caps the Python dependency at `substrait<0.85.0` until the code is updated for the URN schema.

This workaround is used because it is currently difficult to make Vortex support the latest Substrait. Vortex currently consumes Substrait expressions produced by PyArrow. However, the current Arrow tree still uses Substrait v0.44.0. Until Arrow is updated, our dependency on Arrow makes it difficult to support the latest Substrait.
See:
- PyArrow uses `v0.44.0`: https://github.com/apache/arrow/blob/f9315d4e7fb61ac85a77c651dcee84dbfad88472/cpp/thirdparty/versions.txt#L109
- We convert PyArrow Expression into Substrait type: vortex/vortex-python/python/vortex/arrow/expression.py, lines 47 to 50 in f200823
Testing
Did not run tests. This is a dependency metadata change only.
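
Even for a metadata-only change, it can help to confirm which distributions the resolver actually picked in a clean virtual environment. A minimal sketch using only the standard library follows; the helper name `installed_versions` is hypothetical, and the distribution names are the ones mentioned in this conversation.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Run inside the freshly created environment after installing vortex-data.
    wanted = ["substrait", "substrait-protobuf", "substrait-extensions"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running this after the install, followed by `import vortex`, mirrors the kind of manual check reported in the follow-up comment.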