WebGPU correctness tests are failing on new buildbot #7886

Open
abadams opened this issue Oct 9, 2023 · 5 comments
@abadams
Member

abadams commented Oct 9, 2023

Perhaps Dawn has changed and we need to account for it:

https://buildbot.halide-lang.org/master/#/builders/26/builds/32

HL_JIT_TARGET=host-webgpu
HL_TARGET=host-webgpu
#CTEST_RESOURCE_GROUP_COUNT=
/Users/halidenightly/build_bot/worker/halide-nightly-main-llvm16-x86-64-osx-cmake/halide-build/test/correctness/correctness_argmax
Error: Requested timedWaitAnyMaxCount is not supported
    at Initialize (/Users/halidenightly/dawn/src/dawn/native/EventManager.cpp:98)
    at Initialize (/Users/halidenightly/dawn/src/dawn/native/Instance.cpp:189)
Required regular expression not found. Regex=[Success!]
@shoaibkamil
Contributor

This looks like it is being caused by a mismatch between Halide's WebGPU headers and Dawn's. I can't seem to find any documentation as to why Dawn has changed their headers, and Dawn's seem to differ from the other major implementations. @jrprice may know?

@jrprice
Contributor

jrprice commented Oct 31, 2023

> I can't seem to find any documentation as to why Dawn has changed their headers, and Dawn's seem to differ from the other major implementations.

The WebGPU native headers are not yet stable so you are still likely to hit incompatibilities if you're using a version of Dawn that doesn't match the ABI of the headers you're using. If you've just built the latest version of Dawn on the new buildbot then this would explain it.

I'd recommend either:

  1. Downgrading the version of Dawn on the new buildbot to match whichever Dawn commit is being used on the other buildbots.
  2. Upgrading all Dawn versions and mini_webgpu.h to the latest versions.

If you want to do option 2 then I can help find the most recent compatible versions of Dawn and the WebGPU headers.
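
To illustrate the kind of ABI mismatch described above, here is a minimal sketch. The struct and field names are hypothetical stand-ins, not the real declarations from Dawn or mini_webgpu.h; the assumption is only that a newer header added members to a descriptor struct that the caller still allocates with the older layout. In that situation the library would read fields the caller never wrote, which is one plausible way to end up with an error about an unsupported value.

```cpp
#include <cstddef>

// Hypothetical stand-ins for two revisions of the same header.
// These are NOT the real WebGPU declarations.
namespace old_header {
struct InstanceDescriptor {
    const void *next_in_chain;
};
}  // namespace old_header

namespace new_header {
struct InstanceFeatures {
    const void *next_in_chain;
    bool timed_wait_any_enable;
    std::size_t timed_wait_any_max_count;
};
struct InstanceDescriptor {
    const void *next_in_chain;
    InstanceFeatures features;  // Member that the old layout does not have.
};
}  // namespace new_header

// True only if both revisions agree on the size of the descriptor.
constexpr bool descriptor_layouts_match() {
    return sizeof(old_header::InstanceDescriptor) ==
           sizeof(new_header::InstanceDescriptor);
}

// A caller compiled against old_header hands the library fewer bytes than
// a library compiled against new_header expects to read.
static_assert(!descriptor_layouts_match(),
              "the two hypothetical header revisions differ in layout");
```

A compile-time check like this only works when both declarations are visible at once; in practice the header copy and the Dawn build have to be kept in sync by pinning them to matching versions.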

@steven-johnson
Contributor

(briefly emerges from the depths...)

IMHO option 2 is the better answer; upgrading Dawn isn't hard.

(submerges once again, bloop)

@shoaibkamil
Contributor

Agreed, option 2 is the best option.

@jrprice If you point me to the most recent compatible version of Dawn, I can create a PR. I have something working with tip-of-tree Dawn (which now seems to support wgpuInstanceProcessEvents() so possibly we can eliminate one set of hacks). I also tested with the wgpu implementation, but that does not support overrides, so I'll add a note in README_webgpu.md saying we don't support it.

@jrprice
Contributor

jrprice commented Nov 3, 2023

> If you point me to the most recent compatible version of Dawn, I can create a PR.

I went ahead and made the PR since I had to fix up Halide in order to test ToT Dawn anyway.

I also documented the process of updating mini_webgpu.h, which I promised to do many months ago.

> tip-of-tree Dawn (which now seems to support wgpuInstanceProcessEvents() so possibly we can eliminate one set of hacks).

This might work on the Dawn side, but unfortunately Emscripten still does not support wgpuInstanceProcessEvents, so we'll still need the native vs. Emscripten split.
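
The native vs. Emscripten split mentioned above could look roughly like the following sketch. Only the `__EMSCRIPTEN__` macro is a real, compiler-defined name; `process_events_native`, `yield_to_browser`, and `pump_webgpu_events` are hypothetical stubs invented for illustration and do not reflect how Halide's runtime is actually structured.

```cpp
#include <string>

// Hypothetical stub: on native Dawn this is where a call such as
// wgpuInstanceProcessEvents() would go.
inline std::string process_events_native() {
    return "native";
}

// Hypothetical stub: under Emscripten, pending callbacks only run once
// control returns to the browser event loop, so this is where the code
// would yield instead.
inline std::string yield_to_browser() {
    return "emscripten";
}

// Select the event-pumping strategy at compile time.
inline std::string pump_webgpu_events() {
#ifdef __EMSCRIPTEN__
    return yield_to_browser();
#else
    return process_events_native();
#endif
}
```

Keeping the branch inside one helper means callers that wait on asynchronous WebGPU operations do not need their own `#ifdef`.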
