Panic from Rust is not caught properly when trying to reset with an invalid pci id #56

Open
nhuang-tt opened this issue Nov 15, 2024 · 0 comments

nhuang-tt commented Nov 15, 2024

Running tt-smi -r 3 when 3 is not a valid PCI index is not handled properly by tt-smi. I can see that tt_smi/tt_smi_backend.py:590 has code to catch Python exceptions, but the real error is a panic coming from a Rust library.

The system is a Wormhole B0 running tt-smi v3.0.2.
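A likely reason the existing handler misses this: PyO3 surfaces Rust panics as pyo3_runtime.PanicException, which derives from BaseException rather than Exception, so a plain `except Exception` never sees it. The sketch below reproduces that behaviour without hardware. `PanicException` and `open_chip` here are hypothetical stand-ins for the real exception type and for `PciChip(pci_interface=pci_idx)`; this is not the actual tt-smi code.

```python
class PanicException(BaseException):
    """Stand-in for pyo3_runtime.PanicException (assumed to subclass BaseException)."""


def open_chip(pci_idx):
    """Hypothetical stand-in for PciChip(pci_interface=pci_idx) on a bad index."""
    raise PanicException(f"DeviceOpenFailed {{ id: {pci_idx} }}")


def reset_catching_exception(pci_idx):
    """Mirrors a handler that only catches Exception: the panic escapes."""
    try:
        open_chip(pci_idx)
    except Exception as e:
        return f"handled: {e}"
    return "ok"


def reset_catching_panic(pci_idx):
    """Catches BaseException, but only swallows the panic type."""
    try:
        open_chip(pci_idx)
    except BaseException as e:
        # Re-raise KeyboardInterrupt, SystemExit, and anything else unrelated.
        if type(e).__name__ != "PanicException":
            raise
        return f"handled: {e}"
    return "ok"
```

Matching on the type name is a workaround for the sketch; the cleaner fix is probably on the Rust side, returning an error from the binding instead of calling unwrap().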

thread '<unnamed>' panicked at crates/pyluwen/src/lib.rs:557:70:
called `Result::unwrap()` on an `Err` value: DeviceOpenFailed { id: 3, source: Os { code: 2, kind: NotFound, message: "No such file or directory" } }
stack backtrace:
   0:     0x7610996fa85b - std::backtrace_rs::backtrace::libunwind::trace::h3926e05c1d1f3b6d
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
   1:     0x7610996fa85b - std::backtrace_rs::backtrace::trace_unsynchronized::h9f5691494ac25ae6
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7610996fa85b - std::sys_common::backtrace::_print_fmt::h7e6bb7b81bf214f4
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x7610996fa85b - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hcf688c88e28c91b4
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x76109972e490 - core::fmt::rt::Argument::fmt::h59a542682908b618
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/fmt/rt.rs:142:9
   5:     0x76109972e490 - core::fmt::write::hce91e70849a27dee
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/fmt/mod.rs:1120:17
   6:     0x7610996f0bbd - std::io::Write::write_fmt::h0bba58d3b1b495e9
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/io/mod.rs:1762:15
   7:     0x7610996fa644 - std::sys_common::backtrace::_print::hf3a4f110a22f16df
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x7610996fa644 - std::sys_common::backtrace::print::h0450d1fd5fc83f73
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x76109971736a - std::panicking::default_hook::{{closure}}::hee7ec73fab21a529
  10:     0x76109971700d - std::panicking::default_hook::he65be6b11b67d1e4
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/panicking.rs:292:9
  11:     0x7610997176a8 - std::panicking::rust_panic_with_hook::h9e4f07a5a69c9caf
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/panicking.rs:779:13
  12:     0x7610996fac3e - std::panicking::begin_panic_handler::{{closure}}::h69a9732dd2e7007d
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/panicking.rs:657:13
  13:     0x7610996faa76 - std::sys_common::backtrace::__rust_end_short_backtrace::hf159dc40d4738bc4
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/sys_common/backtrace.rs:170:18
  14:     0x7610997173d2 - rust_begin_unwind
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/std/src/panicking.rs:645:5
  15:     0x761099658ac5 - core::panicking::panic_fmt::hf38ef33e65607e17
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/panicking.rs:72:14
  16:     0x7610996591d3 - core::result::unwrap_failed::h93afb55b612add5a
                               at /build/rustc-60UC9b/rustc-1.75.0+dfsg0ubuntu1~bpo0/library/core/src/result.rs:1653:5
  17:     0x76109966107f - pyluwen::PciChip::new::h18a5d5b21bdf257b
  18:     0x761099687e8f - pyluwen::_::_::__INVENTORY::trampoline::h7d04eee8a05a4bd1
  19:           0x5e7173 - _PyObject_MakeTpCall
  20:           0x56247d - _PyEval_EvalFrameDefault
  21:           0x55abda - _PyEval_EvalCodeWithName
  22:           0x5e6c43 - _PyFunction_Vectorcall
  23:           0x55dafb - _PyEval_EvalFrameDefault
  24:           0x5e6a66 - _PyFunction_Vectorcall
  25:           0x55c91d - _PyEval_EvalFrameDefault
  26:           0x55abda - _PyEval_EvalCodeWithName
  27:           0x68bfe7 - PyEval_EvalCode
  28:           0x67d831 - <unknown>
  29:           0x67d8af - <unknown>
  30:           0x67d951 - <unknown>
  31:           0x67e5e7 - PyRun_SimpleFileExFlags
  32:           0x6b5732 - Py_RunMain
  33:           0x6b5abd - Py_BytesMain
  34:     0x76109b7a3083 - __libc_start_main
  35:           0x5eb5ee - _start
  36:                0x0 - <unknown>
Traceback (most recent call last):
  File "/home/nhuang/.local/bin/tt-smi", line 8, in <module>
    sys.exit(main())
  File "/home/nhuang/.local/lib/python3.8/site-packages/tt_smi/tt_smi.py", line 779, in main
    pci_board_reset(args.reset, reinit=True)
  File "/home/nhuang/.local/lib/python3.8/site-packages/tt_smi/tt_smi_backend.py", line 590, in pci_board_reset
    chip = PciChip(pci_interface=pci_idx)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: DeviceOpenFailed { id: 3, source: Os { code: 2, kind: NotFound, message: "No such file or directory" } }
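Another option is to validate the requested index before the Rust library is ever called. The following is a minimal sketch, assuming the driver exposes one numbered device node per PCI index under /dev/tenstorrent (an assumption consistent with the NotFound error above, not confirmed by this issue). The helper names `available_pci_indices` and `check_reset_targets` are hypothetical and do not exist in tt-smi.

```python
from pathlib import Path


def available_pci_indices(dev_dir="/dev/tenstorrent"):
    """Return the numeric device nodes found in dev_dir (assumed layout)."""
    root = Path(dev_dir)
    if not root.is_dir():
        return []
    return sorted(int(p.name) for p in root.iterdir() if p.name.isdigit())


def check_reset_targets(requested, dev_dir="/dev/tenstorrent"):
    """Raise a regular ValueError for indices that have no device node."""
    available = available_pci_indices(dev_dir)
    invalid = [i for i in requested if i not in available]
    if invalid:
        raise ValueError(
            f"Invalid PCI index(es) {invalid}; available: {available}"
        )
    return list(requested)
```

Calling something like `check_reset_targets(args.reset)` before `pci_board_reset` would turn the panic into an ordinary Python error that the existing handler can report.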