Error handling and panic in observablehq notebook #165

jmatsushita · 2024-03-04T09:57:13Z

Hi,

Sorry for opening the issue directly on github, but I cannot see this repo in the dropdown menu in https://antv-issue-helper.surge.sh

I created a reproducer in ObservableHQ for a panic when using a specific shader:
https://observablehq.com/d/bece57b91ebd5a40

I'm using a compute-toys notebook fork which pulls in the latest version @antv/[email protected]

The setup works for the origami shader, however when using shader code that compiles and runs fine in compute.toys (see https://compute.toys/view/1079) I can get the following stack trace in the console:

glsl_wgsl_compiler_bg.wasm:0x23257e Uncaught (in promise) RuntimeError: unreachable
    at __rust_start_panic (glsl_wgsl_compiler_bg.wasm:0x23257e)
    at rust_panic (glsl_wgsl_compiler_bg.wasm:0x2318d1)
    at std::panicking::rust_panic_with_hook::hdb857edf4e61fe11 (glsl_wgsl_compiler_bg.wasm:0x216e3e)
    at std::panicking::begin_panic_handler::{{closure}}::hef75e0b11b6899ae (glsl_wgsl_compiler_bg.wasm:0x21dd09)
    at std::sys_common::backtrace::__rust_end_short_backtrace::h2c4c1ce3acc2944e (glsl_wgsl_compiler_bg.wasm:0x2321dc)
    at rust_begin_unwind (glsl_wgsl_compiler_bg.wasm:0x228274)
    at core::panicking::panic_fmt::hf20f9922c43dc05c (glsl_wgsl_compiler_bg.wasm:0x22ca20)
    at core::result::unwrap_failed::h92d50af5668640d2 (glsl_wgsl_compiler_bg.wasm:0x21f486)
    at wgslcomposer_wgsl_compile (glsl_wgsl_compiler_bg.wasm:0x180ba4)
    at Rr.wgsl_compile (glsl_wgsl_compiler.js:211:12)
$__rust_start_panic    @    glsl_wgsl_compiler_bg.wasm:0x23257e
$rust_panic    @    glsl_wgsl_compiler_bg.wasm:0x2318d1
$std::panicking::rust_panic_with_hook::hdb857edf4e61fe11    @    glsl_wgsl_compiler_bg.wasm:0x216e3e
$std::panicking::begin_panic_handler::{{closure}}::hef75e0b11b6899ae    @    glsl_wgsl_compiler_bg.wasm:0x21dd09
$std::sys_common::backtrace::__rust_end_short_backtrace::h2c4c1ce3acc2944e    @    glsl_wgsl_compiler_bg.wasm:0x2321dc
$rust_begin_unwind    @    glsl_wgsl_compiler_bg.wasm:0x228274
$core::panicking::panic_fmt::hf20f9922c43dc05c    @    glsl_wgsl_compiler_bg.wasm:0x22ca20
$core::result::unwrap_failed::h92d50af5668640d2    @    glsl_wgsl_compiler_bg.wasm:0x21f486
$wgslcomposer_wgsl_compile    @    glsl_wgsl_compiler_bg.wasm:0x180ba4
wgsl_compile    @    glsl_wgsl_compiler.js:211
(anonymous)    @    compute-toys.js?v=4&…7b91ebd5a40@314:485
(anonymous)    @    compute-toys.js?v=4&…7b91ebd5a40@314:478
eval    @    observablehq-17:43
await in eval (async)        
eval    @    observablehq-295:4
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
ea    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
value    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
Pr    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
define    @    compute-toys.js?v=4&…7b91ebd5a40@314:698
value    @    worker-488e6463.js:2
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
Ea    @    worker-488e6463.js:2
define    @    worker-488e6463.js:2
rs    @    worker-488e6463.js:2
(anonymous)

Additional context

Chrome on MacOS Version 122.0.6261.94 (Official Build) (arm64)

I have noticed the following in regard to errors and error handling in general:

Since I've made the run function take in the shader string, there are cases where in particular reusing the prelude without binding for instance time.elapsed will result in these errors being constantly spammed in the console.

GPUValidationError {message: '[Invalid CommandBuffer] is invalid.\n - While calling [Queue].Submit([[Invalid CommandBuffer]])\n'}
GPUValidationError {message: '[Invalid BindGroup] is invalid.\n - While encoding …r].SetBindGroup(0, [Invalid BindGroup], 0, ...).\n'}

Using a trick such as:

var ignore = time.elapsed / time.elapsed; // to avoid rewriting run without the time bindings.
var uv = (fragCoord * 2. - resolution.xy) / resolution.y * ignore;

Works to bypass that problem and avoid having to reinitialise each cell with a different set of bindings.

Is there a better way to enable using a notebook to progressively build a compute shader? Maybe some variables can be initialised and made available to the page first, like device and only running createProgram() and device.createComputePipeline() on each cell? If you had a example notebook with this approach that would be great to be able to use observablehq to explain shaders step by step.
Some code that runs on compute.toys doesn't work with g-device-api. This seems to have to do with type checking.

let a = hash1(p+vec2(0,0));

compute.toys seems able to infer the types but this will spam this error message with g-device-api:

Device.ts:210 
GPUValidationError {message: '[Invalid CommandBuffer] is invalid.\n - While calling [Queue].Submit([[Invalid CommandBuffer]])\n'}

The fix is easy enough, but the error handling and lack of error messages makes it difficult to use.

let a = hash1(p+float2(0,0)); 
// or let a = hash1(p+vec2(0.,0.));

In general, when there is an error that spams the console, even when the code is fixed in the same session/notebook, the messages continue to appear in the console (I'm not sure if there's still a context behind the scenes that is not cleaned up, or if the messages have been buffered). It requires reloading the notebook for the messages to stop.

Thank you very much for the great library, I'm really looking forward to using it more in observablehq and I hope my remarks are helpful to improve the error handling experience!

Cheers,

Jun

The text was updated successfully, but these errors were encountered:

xiaoiver · 2024-03-11T05:52:04Z

Just try to change const to let:

// before
const tmax = 2000.0;

// after
let tmax = 2000.0;

Or declare it in module scope:

const tmax = 2000.0;

fn main_image() {}

I will check if it's a bug relative to naga, since according to WGSL Spec, a const-declaration can be declared in function scope:
https://www.w3.org/TR/WGSL/#const-decls

jmatsushita · 2024-03-16T17:59:06Z

Thank you. Indeed the change you suggested fixes the error:

// after
let tmax = 2000.0;

However, should this be a panic (sometimes just swallowed or displaying RuntimeError: unreachable) would it be possible to have a more granular error message to help troubleshoot the shader next time?

xiaoiver · 2024-03-19T02:12:06Z

You're right, I should throw the compiling error message from naga.

xiaoiver · 2024-03-19T06:40:51Z

I update both naga & naga-oil to the latest version and throw error message from Rust side.

// before
naga = { version = "0.14.1", features = ["glsl-in", "wgsl-in", "wgsl-out"] }
naga_oil = "0.11.0"

// after
naga = { version = "0.19.2", features = ["glsl-in", "wgsl-in", "wgsl-out"] }
naga_oil = "0.13.0"

Now the error message can be shown correctly. For example, in the tmax case, naga-oil will complain the following message:

make_naga_module Composer error: expected assignment or increment/decrement, found 'tmax'

But we cannot use alias for now: bevyengine/naga_oil#79
If we try to prepend alias int = i32; to our shader chunk:

Composable module identifiers must not require substitution according to naga writeback rules: `int`

So we have to do some alias work when borrowing shaders from compute-toys. Here is the complete alias map used in compute-toys:

alias int = i32;
alias uint = u32;
alias float = f32;
alias int2 = vec2<i32>;
alias int3 = vec3<i32>;
alias int4 = vec4<i32>;
alias uint2 = vec2<u32>;
alias uint3 = vec3<u32>;
alias uint4 = vec4<u32>;
alias float2 = vec2<f32>;
alias float3 = vec3<f32>;
alias float4 = vec4<f32>;
alias bool2 = vec2<bool>;
alias bool3 = vec3<bool>;
alias bool4 = vec4<bool>;
alias float2x2 = mat2x2<f32>;
alias float2x3 = mat2x3<f32>;
alias float2x4 = mat2x4<f32>;
alias float3x2 = mat3x2<f32>;
alias float3x3 = mat3x3<f32>;
alias float3x4 = mat3x4<f32>;
alias float4x2 = mat4x2<f32>;
alias float4x3 = mat4x3<f32>;
alias float4x4 = mat4x4<f32>;

Or use predeclares instead, eg.

vec2f https://www.w3.org/TR/WGSL/#vec2f
mat3x3f https://www.w3.org/TR/WGSL/#mat3x3f

* fix: throw error message from naga #165 * chore: commit changeset * fix: remove new URL

* Fix 165 (#169) * fix: throw error message from naga #165 * chore: commit changeset * fix: remove new URL * chore(release): bump version (#170) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

xiaoiver · 2024-03-19T11:11:17Z

Just use the latest @antv/[email protected] & WASM:

import { WebGPUDeviceContribution } from '@antv/g-device-api';

const deviceContribution = new WebGPUDeviceContribution({
  shaderCompilerPath: '/glsl_wgsl_compiler_bg.wasm',
  // From CDN
  // shaderCompilerPath: 'https://unpkg.com/@antv/[email protected]/rust/pkg/glsl_wgsl_compiler_bg.wasm',
});

Calls begin/endFrame() at the beginning and end of computePass：

device.beginFrame();

const computePass = device.createComputePass();
computePass.setPipeline(computePipeline);
computePass.setBindings(bindings);
computePass.dispatchWorkgroups(1);
device.submitPass(computePass);

device.endFrame();

xiaoiver closed this as completed Mar 11, 2024

xiaoiver reopened this Mar 11, 2024

xiaoiver self-assigned this Mar 13, 2024

xiaoiver added the bug Something isn't working label Mar 13, 2024

xiaoiver added a commit that referenced this issue Mar 19, 2024

fix: throw error message from naga #165

c30e2d9

xiaoiver mentioned this issue Mar 19, 2024

Fix 165 #169

Merged

14 tasks

xiaoiver added a commit that referenced this issue Mar 19, 2024

Fix 165 (#169)

5cfa31c

* fix: throw error message from naga #165 * chore: commit changeset * fix: remove new URL

xiaoiver closed this as completed Apr 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error handling and panic in observablehq notebook #165

Error handling and panic in observablehq notebook #165

jmatsushita commented Mar 4, 2024

xiaoiver commented Mar 11, 2024 •

edited

Loading

jmatsushita commented Mar 16, 2024

xiaoiver commented Mar 19, 2024

xiaoiver commented Mar 19, 2024 •

edited

Loading

xiaoiver commented Mar 19, 2024 •

edited

Loading

Error handling and panic in observablehq notebook #165

Error handling and panic in observablehq notebook #165

Comments

jmatsushita commented Mar 4, 2024

Additional context

xiaoiver commented Mar 11, 2024 • edited Loading

jmatsushita commented Mar 16, 2024

xiaoiver commented Mar 19, 2024

xiaoiver commented Mar 19, 2024 • edited Loading

xiaoiver commented Mar 19, 2024 • edited Loading

xiaoiver commented Mar 11, 2024 •

edited

Loading

xiaoiver commented Mar 19, 2024 •

edited

Loading

xiaoiver commented Mar 19, 2024 •

edited

Loading