Release-7: I2C comms failure with Si5324 on Kasli v1.1 #2567

Open
b-bondurant opened this issue Aug 30, 2024 · 6 comments
@b-bondurant
Contributor

Bug Report

One-Line Summary

Newer release-7 gateware/firmware fails to initialize Si5324 on Kasli v1.1, reportedly because of an I2C failure.

Issue Details

We have a Kasli v1.1 running hardware-based unit tests for DAX. I recently updated its gateware (no change in major version, just a newer rev) and was met with the following:

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2022 M-Labs Limited

Bootloader CRC passed
Gateware ident 7.8208.38c72fd;tester_11
Initializing SDRAM...
Read leveling scan:
Module 1:
00000000000111111111100000000000
Module 0:
00000000000111111111110000000000
Read leveling: 15+-5 16+-5 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000013s]  INFO(runtime): ARTIQ runtime starting...
[     0.003930s]  INFO(runtime): software ident 7.8208.38c72fd;tester_11
[     0.010295s]  INFO(runtime): gateware ident 7.8208.38c72fd;tester_11
[     0.016653s]  INFO(runtime): log level set to INFO by default
[     0.022390s]  INFO(runtime): UART log level set to INFO by default
[     0.028778s]  WARN(runtime::rtio_clocking): rtio_clock setting not recognised. Falling back to default.
[     0.037967s]  INFO(runtime::rtio_clocking): using internal 125MHz RTIO clock
[     0.314469s]  INFO(board_artiq::si5324): waiting for Si5324 lock...
panic at runtime/rtio_clocking.rs:246:55: cannot initialize Si5324: "Si5324 failed to ack write address"
backtrace for software version 7.8208.38c72fd;tester_11:
0x4002ebbc
0x400083ac
0x40007cd0
0x4002d64c
0x40005b88
0x4001f0f4
0x4001f09c
0x4002dd48
halting.
use `artiq_coremgmt config write -s panic_reset 1` to restart instead

I reproduced the same behavior on a second Kasli v1.1. I haven't checked any newer hardware, but I assume it isn't affected since no one has reported this yet. I haven't checked release-8 yet either.

Searching backward through the release-7 commits, it looks like 2534678 is where things break.

Previous commit, c812801:

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2022 M-Labs Limited

Bootloader CRC passed
Gateware ident 7.8193.c812801;tester_11
Initializing SDRAM...
Read leveling scan:
Module 1:
00000000000111111111100000000000
Module 0:
00000000000111111111110000000000
Read leveling: 15+-5 16+-5 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000013s]  INFO(runtime): ARTIQ runtime starting...
[     0.003929s]  INFO(runtime): software ident 7.8193.c812801;tester_11
[     0.010292s]  INFO(runtime): gateware ident 7.8193.c812801;tester_11
[     0.016650s]  INFO(runtime): log level set to INFO by default
[     0.022386s]  INFO(runtime): UART log level set to INFO by default
[     0.028775s]  WARN(runtime::rtio_clocking): rtio_clock setting not recognised. Falling back to default.
[     0.037965s]  INFO(runtime::rtio_clocking): using internal 125MHz RTIO clock
[     0.314464s]  INFO(board_artiq::si5324): waiting for Si5324 lock...
[     4.538995s]  INFO(board_artiq::si5324):   ...locked
[     4.568118s]  INFO(runtime): network addresses: MAC=54-10-ec-34-dd-65 IPv4=192.168.1.70 IPv6-LL=fe80::5610:ecff:fe34:dd65 IPv6=no configured address
[     4.581919s]  INFO(runtime::mgmt): management interface active
[     4.594070s]  INFO(runtime::session): accepting network sessions
[     4.607324s]  INFO(runtime::session): running startup kernel
[     4.611780s]  INFO(runtime::session): no startup kernel found
[     4.617601s]  INFO(runtime::session): no connection, starting idle kernel
[     4.624432s]  INFO(runtime::session): no idle kernel found

@ 2534678:

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2022 M-Labs Limited

Bootloader CRC passed
Gateware ident 7.8194.2534678;tester_11
Initializing SDRAM...
Read leveling scan:
Module 1:
00000000000111111111100000000000
Module 0:
00000000000111111111110000000000
Read leveling: 15+-5 16+-5 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000013s]  INFO(runtime): ARTIQ runtime starting...
[     0.003932s]  INFO(runtime): software ident 7.8194.2534678;tester_11
[     0.010297s]  INFO(runtime): gateware ident 7.8194.2534678;tester_11
[     0.016655s]  INFO(runtime): log level set to INFO by default
[     0.022392s]  INFO(runtime): UART log level set to INFO by default
[     0.028779s]  WARN(runtime::rtio_clocking): rtio_clock setting not recognised. Falling back to default.
[     0.037969s]  INFO(runtime::rtio_clocking): using internal 125MHz RTIO clock
[     0.314469s]  INFO(board_artiq::si5324): waiting for Si5324 lock...
panic at runtime/rtio_clocking.rs:246:55: cannot initialize Si5324: "Si5324 failed to ack write address"
backtrace for software version 7.8194.2534678;tester_11:
0x4002ebbc
0x40008530
0x40007cd0
0x4002d64c
0x40005b88
0x4001f0f4
0x4001f09c
0x4002dd48
halting.
use `artiq_coremgmt config write -s panic_reset 1` to restart instead

Full logs with each revision tested: https://pastebin.com/fyDyV8vA
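When stepping through revisions like this, sorting each captured boot log into pass or fail can be automated. The following is a minimal sketch, assuming each boot has been saved as text. The function name `classify_boot_log` is hypothetical, and the patterns are taken only from the log lines quoted in this issue.

```python
import re

# Patterns copied from the boot logs quoted in this issue (assumption:
# other firmware versions may word these lines differently).
IDENT_RE = re.compile(r"software ident (\S+)")
PANIC_RE = re.compile(r"^panic at (\S+): (.*)$", re.MULTILINE)
LOCKED_MARKER = "...locked"


def classify_boot_log(text):
    """Return (ident, status, detail) for one captured boot log.

    status is "panic", "locked", or "unknown" when neither marker is present.
    """
    ident_match = IDENT_RE.search(text)
    ident = ident_match.group(1) if ident_match else None

    panic = PANIC_RE.search(text)
    if panic:
        # group(2) is the panic message after the source location.
        return ident, "panic", panic.group(2).strip()
    if LOCKED_MARKER in text:
        return ident, "locked", ""
    return ident, "unknown", ""
```

A log that contains neither a panic line nor the Si5324 "...locked" line is reported as "unknown" rather than guessed at, so it still needs to be read by hand.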

Steps to Reproduce

  1. $ nix develop 'git+https://github.com/m-labs/artiq?ref=release-7&rev=<rev-to-test>'
  2. $ python -m artiq.gateware.targets.kasli_generic tester_11.json (json here)
  3. $ artiq_flash --srcbuild -d artiq_kasli/tester_11/
  4. Monitor serial output on boot
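To repeat steps 1 to 3 for several revisions, the command lines can be generated per revision. This is a sketch only: `repro_commands` is a hypothetical helper, and running each step through `nix develop ... --command` instead of an interactive shell is an assumption, not something done in this report. It prints the commands and does not execute them.

```python
import shlex

# Flake reference from step 1; only the rev changes between runs.
FLAKE_TEMPLATE = "git+https://github.com/m-labs/artiq?ref=release-7&rev={rev}"


def repro_commands(rev, description="tester_11.json",
                   build_dir="artiq_kasli/tester_11/"):
    """Build the argv lists for the build and flash steps of one revision.

    rev is passed through unchanged (assumption: Nix generally expects a
    full commit hash here, not an abbreviated one).
    """
    flake = FLAKE_TEMPLATE.format(rev=rev)
    prefix = ["nix", "develop", flake, "--command"]
    build = prefix + ["python", "-m", "artiq.gateware.targets.kasli_generic",
                      description]
    flash = prefix + ["artiq_flash", "--srcbuild", "-d", build_dir]
    return [build, flash]


if __name__ == "__main__":
    # Print shell-quoted commands for review; nothing is run.
    for argv in repro_commands("<rev-to-test>"):
        print(shlex.join(argv))
```

Step 4, watching the serial console on boot, still has to be done separately after each flash.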

Expected Behavior

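The system initializes: the Si5324 locks and the runtime goes on to bring up the network and management interfaces, as in the c812801 log above.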

Actual (undesired) Behavior

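The system doesn't initialize: the runtime panics with "Si5324 failed to ack write address" while waiting for the Si5324 lock, then halts.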

Your System (omit irrelevant parts)

  • Operating System:
$ distro
Name: Fedora Linux 40 (Workstation Edition)
Version: 40
Codename:
$ uname -a
Linux brad-desktop 6.10.3-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Aug  5 14:30:00 UTC 2024 x86_64 GNU/Linux
  • ARTIQ version: last working version: 7.8193.c812801
  • Version of the gateware and runtime loaded in the core device: same
  • Hardware involved: Kasli v1.1
  • Vivado version: 2022.2
@dnadlinger
Collaborator

Commit 2534678 looks like a false positive; a change to the compiler shouldn't have any effect on the Rust runtime.

@b-bondurant
Contributor Author

b-bondurant commented Aug 30, 2024 via email

@dnadlinger
Collaborator

Is the gateware bitstream/firmware build even different at all? I guess with gateware there is always the chance of two non-deterministic optimisation runs resulting in subtly different outcomes…

@b-bondurant
Contributor Author

b-bondurant commented Aug 30, 2024 via email

@b-bondurant
Contributor Author

Bitstream:

$ diff -q artiq_kasli_7.8193.c812801/tester_11/gateware/top.bit artiq_kasli_7.8194.2534678/tester_11/gateware/top.bit
Files artiq_kasli_7.8193.c812801/tester_11/gateware/top.bit and artiq_kasli_7.8194.2534678/tester_11/gateware/top.bit differ

Runtime:

$ diff -q artiq_kasli_7.8193.c812801/tester_11/software/runtime/runtime.bin artiq_kasli_7.8194.2534678/tester_11/software/runtime/runtime.bin
Files artiq_kasli_7.8193.c812801/tester_11/software/runtime/runtime.bin and artiq_kasli_7.8194.2534678/tester_11/software/runtime/runtime.bin differ

Building in a stricter environment, nix develop ... --sandbox --pure-eval --ignore-environment --keep HOME (sandboxing should be on by default, but just in case; HOME is required to make Vivado happy):

$ diff -q artiq_kasli_7.8193.c812801_pure/tester_11/gateware/top.bit artiq_kasli_7.8194.2534678_pure/tester_11/gateware/top.bit
Files artiq_kasli_7.8193.c812801_pure/tester_11/gateware/top.bit and artiq_kasli_7.8194.2534678_pure/tester_11/gateware/top.bit differ

$ diff -q artiq_kasli_7.8193.c812801_pure/tester_11/software/runtime/runtime.bin artiq_kasli_7.8194.2534678_pure/tester_11/software/runtime/runtime.bin
Files artiq_kasli_7.8193.c812801_pure/tester_11/software/runtime/runtime.bin and artiq_kasli_7.8194.2534678_pure/tester_11/software/runtime/runtime.bin differ

No clue why 🤷‍♂️
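One thing `diff -q` cannot show is how much the files differ. The ident string printed at boot embeds the revision (7.8193.c812801 vs 7.8194.2534678), so the two builds would be expected to differ at least there; a .bit file may also carry build metadata in its header. The sketch below, with a hypothetical helper name `compare_binaries`, counts differing bytes so that a handful of changed bytes can be told apart from a wholesale difference. It says nothing about whether a difference is functionally relevant.

```python
import hashlib
from pathlib import Path


def compare_binaries(path_a, path_b):
    """Summarize how two binary files differ, byte for byte."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    common = min(len(a), len(b))

    # Offsets within the shared length where the bytes disagree.
    differing = [i for i in range(common) if a[i] != b[i]]

    if differing:
        first = differing[0]
    elif len(a) != len(b):
        first = common  # one file is a strict prefix of the other
    else:
        first = None  # files are identical

    return {
        "sha256_a": hashlib.sha256(a).hexdigest(),
        "sha256_b": hashlib.sha256(b).hexdigest(),
        "size_a": len(a),
        "size_b": len(b),
        "first_diff_offset": first,
        # Bytes past the shorter file's end count as differing.
        "differing_bytes": len(differing) + abs(len(a) - len(b)),
    }
```

For example, `compare_binaries` could be pointed at the two `runtime.bin` paths from the `diff -q` commands above; a small `differing_bytes` count near the embedded ident would suggest the firmware is otherwise the same.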

@b-bondurant
Contributor Author

Latest release-8 works fine:

 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2024 M-Labs Limited

Bootloader CRC passed
Gateware ident 8.8955+0ac9e77;tester_11
Initializing SDRAM...
Read leveling scan:
Module 1:
00000001111111110000000000000000
Module 0:
00000011111111111000000000000000
Read leveling: 11+-4 11+-5 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000012s]  INFO(runtime): ARTIQ runtime starting...
[     0.003899s]  INFO(runtime): software ident 8.8955+0ac9e77;tester_11
[     0.010245s]  INFO(runtime): gateware ident 8.8955+0ac9e77;tester_11
[     0.016594s]  INFO(runtime): log level set to INFO by default
[     0.022312s]  INFO(runtime): UART log level set to INFO by default
[     0.028683s]  WARN(runtime::rtio_clocking): rtio_clock setting not recognised. Falling back to default.
[     0.037850s]  INFO(runtime::rtio_clocking): Clocking has already been set up.
[     0.070364s]  INFO(runtime): network addresses: MAC=54-10-ec-34-dd-65 IPv4=10.236.88.210/0 IPv6-LL=fe80::5610:ecff:fe34:dd65/10 IPv6=no configured address
[     0.083182s]  WARN(runtime::rtio_mgt): error reading device map (key not found), device names will not be available in RTIO error messages
[     0.095441s]  INFO(runtime::rtio_mgt): SED spreading disabled by default
[     0.103423s]  INFO(runtime::mgmt): management interface active
[     0.114721s]  INFO(runtime::session): accepting network sessions
[     0.119457s]  INFO(runtime::session): running startup kernel
[     0.125114s]  INFO(runtime::session): no startup kernel found
[     0.130832s]  INFO(runtime::session): no connection, starting idle kernel
[     0.145124s]  INFO(runtime::session): no idle kernel found
