-
Notifications
You must be signed in to change notification settings - Fork 3.9k
GH-46375: [C++] Add adapters/orc directory to Meson #46906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
1181676
to
6e1df95
Compare
I don't believe the AppVeyor failure is related |
I'm not familiar with Meson internals but it looks reasonable to me. cc @kou |
cpp/subprojects/orc.wrap
Outdated
|
||
[wrap-file] | ||
directory = orc-15d86dfcad6fc80ee24cf785419a820a55db103b | ||
source_url = https://github.com/apache/orc/archive/15d86dfcad6fc80ee24cf785419a820a55db103b.tar.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a released version instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet - a release with the Meson configuration hasn't happened (it's brand new)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we merge this PR after the new ORC release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure we can wait. I know originally the next release for Orc was planned far out, but there have been discussions for it happening sooner (next month or so).
Will keep tabs on that discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually it would be good if we reconsider for now. The orc implementation here is a dependency for the dataset work, which in turn can unlock substrait work, which is the last component needed before we can more seriously look at the Python build system again.
There are of course ways to work around that, but I think going in a linear fashion is the easiest way to avoid missing anything compared against CMake
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. But can we do this after 21.0.0 feature freeze?
We should not release a version that depends on non "official" Apache product releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @kou - looks like the wrapdb merged apache-orc today, so this uses the official 2.2.0 release
Hmm looks like the build is failing due to some UB:
I don't remember this from testing against a development version of orc, so its possibly a regression in the 2.2.0 release? Have to investigate more |
@wgtmac may help us. |
Let me try to reproduce it on my side. Is it a stable failure on your side? @WillAyd |
It is reproducible. If you take this branch and do:
Then you can either run the test suite with
or run the individual test in However, I erred in my comment before that this did not appear previously. It looks like even back to orc commit 15d86dfcad6fc80ee24cf785419a820a55db103b where this PR started that UB can be detected; we may have just gotten lucky with CI not failing |
It seems that I cannot execute |
Are you getting any error? |
I got the above error. If I removed them, I see some compiling error from boost like below:
Let me investigate it. |
Ah that's interesting. Maybe just doing If not, will look further into what the macOS linker is doing |
Maybe the sanitizers aren't supported on arm macs? This homebrew conversation is a few years old but I don't see a resolution: |
I have successfully reproduced it without meson. Simply run
|
Could you please apply this patch? |
I believe the core of the issue is that orc does not distribute pkgconfig information like most (possibly all) of the other conda-installed libraries, which place their respective files in |
Unfortunately I'm not familiar with |
Meson can use use not only pkg-config package ( |
The CMake config could be used as well but it is also not part of the conda installation |
Can we improve conda package to include the CMake config? |
I found |
Ah nice find! Unfortunately, the conda installation seems to be pinning orc to the 1.9.0 installation (I noticed this also in #47455 (comment)). The 1.9.0 conda package doesn't seem to distribute that same file, so we might have to dissect what is causing that version pin |
4b3c200
to
8771237
Compare
8771237
to
b16a0db
Compare
OK so the issue is definitely the fact that there are multiple versions of Orc installed, and the conda-forge installed version of 1.9.0 is too old to distribute any cmake/pkgconfig information that makes it detectable. Looking at ci/conda_env_cpp.txt, it looks like the There are a lot of potential options to fix this, but I think the cleanest (and easiest?) would be to get an updated conda package for grpc_cpp somehow |
It seems that |
Nice find! That helped us get unpinned. The newest issue appears to be something with the cmake files that orc distributes. The dependency call currently fails with the following warning:
Within the conda environment, I see orc distributes its own files for CMake package resolution:
Renaming or removing any of these files seems to let CMake find the required packages. Hopefully there's some type of setting for CMake we can use to ignore these? |
@wgtmac Can Apache ORC also try finding CMake config packages? For example, can https://github.com/apache/orc/blob/main/cmake_modules/FindSnappy.cmake call arrow/cpp/cmake_modules/FindSnappyAlt.cmake Lines 34 to 66 in d2dace9
FindSnappyAlt.cmake doesn't need explicit CONIFG because it uses different name (not using FindSnappy.cmake ).)
|
Can we try https://github.com/apache/orc/blob/main/cmake_modules/ThirdpartyToolchain.cmake#L269 |
Hmm I'm not sure - that file is not distributed with the conda install, so it may not have an effect? I'm not sure of all the rules CMake uses, but from Meson I tried to do: orc_dep = dependency('orc', cmake_args: ['-DORC_PACKAGE_KIND=conan']) I can confirm in the logs that argument is getting forwarded to cmake:
But it appears to make no difference |
Let's improve Apache ORC before this. |
I'm not sure if I can find some time to work on this recently. I'm also a little bit concerned of breaking compatibility of downstream... |
FWIW I see this starting back with the orc 2.1.0 release. To be clear, I don't think it is stricly a Meson <> Cmake bridge issue, as it seems like the orc distributed FindXXX.cmake files are interefering with CMake's normal dependency resolution mechanism(s) (see the failing non-Meson jobs in the above CI run) |
Let me know if apache/orc#2416 can help fix the issue here. |
Rationale for this change
Apache ORC now has native support for Meson, so we can rather easily add support now in Arrow
What changes are included in this PR?
This integrates the adapters/orc directory into the Meson configuration
Are these changes tested?
Yes
Are there any user-facing changes?
No