[XLA:GPU] Use DeviceDescription instead of hard-coding warp size as 32 #2938
base: r2.18-rocm-enhanced
tensorflow/tf-build-actions@600513b [ROCm] Fix flaky gpu compiler test when building with rocm
tensorflow/tf-build-actions@a35cf48 [XLA:GPU] Use DeviceDescription instead of hard-coding warp size as 32
xla@e849446 [ROCm] Pass correct warp size to Triton pipeline
xla@3e7b0fe cherry-picked warp size passing to triton calls, and globally enabled warpsize=64
xla@750ad89 Fixes.
c76562c to 5e84717
PR is very large; it changes 90 files.
This is the original commit tensorflow@a35cf48, which modifies 75 files. Yeah, this is definitely quite a large one for us already. And since the latest XLA is quite different from the one in tensorflow r2.18, the number of modified files increased during backporting. Maybe I can split it into the pure original commit (with conflicts) and the modifications made during backporting (to resolve conflicts and make it compile), but this would still leave us with a commit that modifies at least 75 files plus a relatively smaller patch. Do you think this would help in this case?
Probably not too much; I will try to review it as it is.
That would be tough work, sorry.
Is it possible to get that second commit (fixing errors) somewhere?
Just looked through my local branches but had no luck. But no worries, I believe the rework won't take too long. Let me do it; otherwise it might not be well organized for you to review.
Done. Please help review the reorganized PR: #2962
We should query the hardware to discover its warp size.
PiperOrigin-RevId: 700787004
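
To illustrate the idea in the commit message, here is a minimal sketch of deriving warp-dependent values from a per-device description rather than from a hard-coded 32. `DeviceInfo`, `RoundUpToWarps`, and `WarpsNeeded` are hypothetical names made up for this example; they are not the XLA `DeviceDescription` API and not the code changed in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a device description. In a real backend this
// value would be filled in by querying the hardware or the driver.
struct DeviceInfo {
  int64_t threads_per_warp;  // e.g. 32 on NVIDIA GPUs, 64 on many AMD GPUs
};

// Number of warps needed to cover `num_threads` threads on `device`.
// The warp size comes from the device, not from a constant.
int64_t WarpsNeeded(int64_t num_threads, const DeviceInfo& device) {
  assert(device.threads_per_warp > 0);
  const int64_t warp = device.threads_per_warp;
  return (num_threads + warp - 1) / warp;  // ceiling division
}

// Rounds `num_threads` up to a whole number of warps on `device`.
int64_t RoundUpToWarps(int64_t num_threads, const DeviceInfo& device) {
  return WarpsNeeded(num_threads, device) * device.threads_per_warp;
}
```

With a warp size of 32, 100 threads need 4 warps; with a warp size of 64, the same 100 threads need only 2. A hard-coded 32 would give the wrong warp count on the 64-wide device, which is the kind of mismatch the change is meant to avoid.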