intel-ipu6: CSE authenticate_run failed + error -EIO: FW authentication failed #306

Open
nagmat84 opened this issue Nov 30, 2024 · 11 comments

Comments

@nagmat84

I get the following error on kernel 6.12.1 with a custom kernel config

intel-ipu6 0000:00:05.0: enabling device (0000 -> 0002)
intel-ipu6 0000:00:05.0: IPU6 in secure mode touch 0x80000000 mask 0x0
Loading firmware: intel/ipu/ipu6epmtl_fw.bin
intel-ipu6 0000:00:05.0: FW version: 20230925
intel-ipu6 0000:00:05.0: Found supported sensor OVTI08F4:00
intel-ipu6 0000:00:05.0: Connected 1 cameras
intel-ipu6 0000:00:05.0: Sending BOOT_LOAD to CSE
intel-ipu6 0000:00:05.0: Sending AUTHENTICATE_RUN to CSE
intel-ipu6 0000:00:05.0: expected resp: 0x2, IPC response: 0xd20 
intel-ipu6 0000:00:05.0: CSE authenticate_run failed
intel-ipu6 0000:00:05.0: error -EIO: FW authentication failed
intel-ipu6 0000:00:05.0: probe with driver intel-ipu6 failed with error -5

Expected behaviour: AUTHENTICATE_RUN should succeed.

A necessary kernel option might be missing in the custom kernel configuration, because FW loading works with the distribution-provided kernel. However, if that is indeed the reason, it would indicate an issue with the Kconfig for the IPU6 driver, as Kconfig should ensure that the kernel includes all necessary options.
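One way to narrow this down is to list every option that the distribution configuration enables but the custom one leaves out. The following is only a minimal sketch: the function names are made up for illustration, and the sample strings in the comments stand in for the two real `.config` files.

```python
import re

# Matches "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value ('y', 'm', a string, ...); unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def missing_in_custom(distro_text, custom_text):
    """Options that are y or m in the distribution config but off or absent in the custom one."""
    distro = parse_config(distro_text)
    custom = parse_config(custom_text)
    return sorted(
        name
        for name, value in distro.items()
        if value in ("y", "m") and custom.get(name, "n") == "n"
    )


# Usage with real files (both paths are placeholders):
#   with open("distro.config") as d, open("custom.config") as c:
#       for name in missing_in_custom(d.read(), c.read()):
#           print(name)
```

The resulting list is usually long for a stripped-down configuration, so it helps to filter it for driver families that look related (for example names containing `MEI`, `IPU` or `DRM`) before bisecting.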

Pastebins:

@nagmat84
Author

nagmat84 commented Dec 2, 2024

@bingbucao Any tips on what I could do myself to narrow down the issue? Unfortunately, I am not much of a kernel programmer, so I have no idea how to investigate what might cause the failing authentication.

@manwegit

manwegit commented Dec 3, 2024

Do you have SecureBoot enabled?
I had to add intel keys: https://dgpu-docs.intel.com/driver/configuring-secure-boot.html

@nagmat84
Author

nagmat84 commented Dec 3, 2024

Yes, secure boot is enabled. I replaced the platform key (PK) with my own key, re-signed the Microsoft KEKs (2011 and 2023) and the Canonical KEK with my own PK, and imported the DB keys from Microsoft (as I need dual boot with Windows) and Canonical, as well as my own DB key to sign my custom Linux kernel.

Where do I get the Intel keys?

@nagmat84
Author

nagmat84 commented Dec 3, 2024

Silly me, the download link for the certificate is on the page you linked. However, I built and signed the IPU6 kernel module myself, so the Intel certificate is not required to authenticate the kernel module. Do I also need the certificate to verify the firmware? I thought the CSE uses fixed hash values fused into the hardware to authenticate the firmware.

I am asking, because firmware loading works with a self-built and self-signed kernel based on a distribution-provided configuration even if the Intel certificate is not installed in my EFI vars.

Hence, I was under the impression that the Intel keys are only necessary if one uses the pre-built Intel kernel module.

@manwegit

manwegit commented Dec 3, 2024

Honestly I do not know. And I do not know how far the hardware itself checks those things.
Importing the linked intel pub key to secureboot DB removed that error message.

But my camera is still not working =(

@nagmat84
Author

nagmat84 commented Dec 3, 2024

I added the Intel keys to the UEFI firmware. It didn't help (as expected).

For my custom-built kernel, the CSE still fails to authenticate the firmware. It works with a self-compiled kernel based on the distribution-provided kernel configuration.
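When switching back and forth between kernels, it can help to check the outcome mechanically instead of eyeballing `dmesg`. This is a rough sketch that only looks for the message strings that appear in the logs in this thread; the helper names are invented for illustration and nothing here is part of the driver.

```python
import re

# Message format taken from the intel-ipu6 log lines quoted in this issue.
_IPC = re.compile(r"expected resp: (0x[0-9a-fA-F]+), IPC response: (0x[0-9a-fA-F]+)")


def auth_outcome(log_text):
    """Return 'done', 'failed' or 'unknown' for the last AUTHENTICATE_RUN attempt."""
    outcome = "unknown"
    for line in log_text.splitlines():
        if "intel-ipu6" not in line:
            continue
        if "CSE authenticate_run done" in line:
            outcome = "done"
        elif "CSE authenticate_run failed" in line:
            outcome = "failed"
    return outcome


def ipc_response(log_text):
    """Return (expected, actual) IPC response codes as integers, or None if not logged."""
    m = _IPC.search(log_text)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


# Usage: feed it the output of `dmesg` saved to a file (path is a placeholder):
#   with open("dmesg-custom.txt") as f:
#       text = f.read()
#   print(auth_outcome(text), ipc_response(text))
```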

@bingbucao

Which UEFI firmware version are you using? It looks like the CSE firmware does not match the signed IPU firmware.

@nagmat84
Author

nagmat84 commented Dec 4, 2024

  • UEFI: LENOVO 21KCCTO1WW/21KCCTO1WW, BIOS N3YET71W (1.36 ) 09/04/2024
  • IPU6 FW: 20230925

But please keep in mind that FW loading and authentication works with a distribution kernel, but not with the self-configured kernel on the same machine using the same firmware. So it shouldn't be a hardware or firmware problem.

@bingbucao

>   • UEFI: LENOVO 21KCCTO1WW/21KCCTO1WW, BIOS N3YET71W (1.36 ) 09/04/2024
>   • IPU6 FW: 20230925
>
> But please keep in mind that FW loading and authentication works with a distribution kernel, but not with the self-configured kernel on the same machine using the same firmware. So it shouldn't be a hardware or firmware problem.

Are you sure you are using the same IPU firmware binary as with the distribution kernel? I see that the distribution kernel is using a built-in firmware binary.

CONFIG_EXTRA_FIRMWARE="intel/ibt-0180-0041.ddc intel/ibt-0180-0041.sfi i915/mtl_guc_70.bin i915/mtl_huc_gsc.bin i915/mtl_gsc_1.bin i915/mtl_dmc.bin intel/ipu/ipu6epadln_fw.bin intel/ipu/ipu6epmtl_fw.bin intel/ipu/ipu6ep_fw.bin intel/vpu/vpu_37xx_v0.0.bin intel/sof-ipc4/mtl/sof-mtl.ri intel-ucode/06-aa-01 intel-ucode/06-aa-02 intel-ucode/06-aa-04 iwlwifi-ma-b0-gf-a0-89.ucode iwlwifi-ma-b0-gf-a0.pnvm regulatory.db regulatory.db.p7s rtl_nic/rtl8153b-2.fw intel/sof-ace-tplg/sof-hda-generic-2ch.tplg"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
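To rule out a firmware mismatch, one could hash every file named in `CONFIG_EXTRA_FIRMWARE` and compare the digests between the two build environments. A minimal sketch, assuming the config text is available as a string and the firmware directory is the one given by `CONFIG_EXTRA_FIRMWARE_DIR`; the function names are illustrative only.

```python
import hashlib
import re
from pathlib import Path


def extra_firmware(config_text):
    """Return the list of firmware names from CONFIG_EXTRA_FIRMWARE (empty if unset)."""
    m = re.search(r'^CONFIG_EXTRA_FIRMWARE="([^"]*)"', config_text, re.MULTILINE)
    return m.group(1).split() if m else []


def firmware_digests(config_text, firmware_dir="/lib/firmware"):
    """Map each embedded firmware name to its SHA-256 hex digest, or None if the file is missing."""
    digests = {}
    for name in extra_firmware(config_text):
        path = Path(firmware_dir) / name
        if path.is_file():
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            digests[name] = None
    return digests


# Usage (the config path is a placeholder):
#   with open(".config") as f:
#       for name, digest in firmware_digests(f.read()).items():
#           print(digest, name)
```

If the digest of `intel/ipu/ipu6epmtl_fw.bin` is identical on both sides, the embedded binary and the one loaded from `/lib/firmware` at run-time are the same file contents.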

@nagmat84
Author

nagmat84 commented Dec 4, 2024

> Are you sure you are using the same IPU firmware binary as with the distribution kernel? I see that the distribution kernel is using a built-in firmware binary.

Yes, I am. Once again: Both kernels are self-compiled. The kernels are built on the target machine, and both kernels include the same, identical firmware file from /lib/firmware. The only difference between the kernels is that one kernel is based on the distribution-provided configuration, while the other kernel uses a customized, stripped-down configuration to save compile time.

Quite obviously, I disabled too much, i.e. left out a necessary kernel option or module.

@nagmat84
Author

nagmat84 commented Dec 9, 2024

I have experimented a bit and at least was able to achieve some different behavior. (But nothing good though.)

Summary (TLDR)

There seem to be at least three issues here, but I do not know whether they are related or independent:

  1. Issue 1: Successful CSE authentication for IPU6 requires at least INTEL_MEI_GSC to be enabled, and maybe also INTEL_MEI_VSC_HW and INTEL_MEI_VSC, but the dependencies of the IPU6 option do not enforce that.
  2. Issue 2: CSE authentication only works if the IPU6 driver, the i915 driver and the options from issue 1 are built as modules. If the drivers are statically linked into the kernel, CSE authentication still fails.
  3. Issue 3 (for kernel version 6.12.2): If the drivers and options above are built as modules, CSE authentication finally succeeds, but the kernel then shows a DMA warning and a kernel trace.

Links to Pastebins:

Note: All kernels have been self-compiled on the target machine using the same tool chain. If firmware is embedded into the kernel, this firmware is identical to the firmware which is dynamically loaded at run-time from /lib/firmware.

Details

Issue 1: Successful CSE authentication for IPU6 requires at least CONFIG_INTEL_MEI_GSC, CONFIG_INTEL_MEI_VSC_HW and CONFIG_INTEL_MEI_VSC

It seems that indeed some kernel options were missing. The relevant options are CONFIG_INTEL_MEI_GSC, CONFIG_INTEL_MEI_VSC_HW and CONFIG_INTEL_MEI_VSC. The help texts for those options are

CONFIG_INTEL_MEI_GSC:

Intel auxiliary driver for GSC devices embedded in Intel graphics devices.
An MEI device here called GSC can be embedded in an Intel graphics devices,
to support a range of chassis tasks such as graphics card firmware update
and security tasks.


CONFIG_INTEL_MEI_VSC_HW:

Intel SPI transport driver between host and Intel visual sensing controller
(IVSC) device. This driver can also be built as a module. If so, the module
will be called mei-vsc-hw.


CONFIG_INTEL_MEI_VSC:

Intel MEI over SPI driver for Intel visual sensing controller (IVSC) device
embedded in IA platform. It supports camera sharing between IVSC for
context sensing and IPU for typical media usage. Select this config should
enable transport layer for IVSC device. This driver can also be built as a
module. If so, the module will be called mei-vsc.

I believe the crucial points here are "firmware update", "security tasks" and "IPU". My original trimmed-down configuration for 6.12.1 disabled those options completely. The distribution-provided default configuration for 6.12.1 enables those options as modules.

It seems we have the following dependency chain here:

  • FW authentication for the IPU6 depends on INTEL_MEI_GSC (GSC stands for graphics security controller, I believe), and
  • INTEL_MEI_GSC in turn depends on DRM_I915.

However, the first dependency is not reflected by the kernel options.

Maybe one solution could be to update Kconfig such that VIDEO_INTEL_IPU6 depends on INTEL_MEI_GSC? I don't know how the other two options (CONFIG_INTEL_MEI_VSC_HW and CONFIG_INTEL_MEI_VSC) come into play.
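Until something like that is enforced by Kconfig, the suspected chain can be checked by hand against a `.config`. The sketch below encodes only the hypothesis from this comment (IPU6 needs INTEL_MEI_GSC, which needs DRM_I915; the two VSC options are "maybe" candidates). It is not an authoritative statement of what the driver requires, and the names are illustrative.

```python
# Hypothesised chain from this comment: IPU6 -> INTEL_MEI_GSC -> DRM_I915.
CHAIN = ["INTEL_MEI_GSC", "DRM_I915"]
# Options whose role is unclear; reported separately.
MAYBE = ["INTEL_MEI_VSC_HW", "INTEL_MEI_VSC"]


def option_state(config_text, name):
    """Return 'built-in', 'module' or 'disabled' for CONFIG_<name> in a .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"CONFIG_{name}=y":
            return "built-in"
        if line == f"CONFIG_{name}=m":
            return "module"
    return "disabled"


def check_ipu6_dependencies(config_text):
    """Return (missing, uncertain) option names for a config that enables IPU6."""
    if option_state(config_text, "VIDEO_INTEL_IPU6") == "disabled":
        return [], []
    missing = [n for n in CHAIN if option_state(config_text, n) == "disabled"]
    uncertain = [n for n in MAYBE if option_state(config_text, n) == "disabled"]
    return missing, uncertain


# Usage (path is a placeholder):
#   with open(".config") as f:
#       missing, uncertain = check_ipu6_dependencies(f.read())
#   print("missing:", missing, "possibly needed:", uncertain)
```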

Issue 2: Successful CSE authentication only works with dynamic module loading, not with statically linked-in drivers

If one builds a monolithic kernel which statically links all drivers and embeds the necessary firmware, one still gets the error CSE authenticate_run failed as before. However, if one builds a modular kernel which dynamically loads the graphics driver DRM_I915, the camera driver VIDEO_INTEL_IPU6 and the management engine drivers INTEL_MEI_*, then CSE authentication succeeds.

The expected behavior is that CSE authentication should also work for a monolithic kernel.

I don't know the reason for that different behavior. Maybe it is a problem with the order of device initialization, or a timing or power-sequencing issue, which surfaces if everything is built in but miraculously vanishes if modules are loaded dynamically? (However, I am only speculating here.)
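To confirm which of the two setups a running kernel actually uses, the `modules.builtin` file under `/lib/modules/<release>/` lists the drivers that were linked in statically. A small sketch, assuming the module names `intel_ipu6`, `i915` and `mei_gsc` (the first two appear in the trace in this thread; `mei_gsc` is my assumption for the INTEL_MEI_GSC driver); the helper names are made up.

```python
from pathlib import PurePosixPath

# Module names to look for; mei_gsc is an assumed name, see the text above.
DRIVERS = ["intel_ipu6", "i915", "mei_gsc"]


def builtin_modules(modules_builtin_text):
    """Names of statically linked modules, with '-' normalised to '_'."""
    names = set()
    for line in modules_builtin_text.splitlines():
        line = line.strip()
        if line:
            # Each line is a path such as kernel/drivers/.../i915.ko
            names.add(PurePosixPath(line).stem.replace("-", "_"))
    return names


def linkage(modules_builtin_text, drivers=DRIVERS):
    """Map each driver to 'built-in' or 'not built-in' (a module, or disabled)."""
    builtin = builtin_modules(modules_builtin_text)
    return {d: ("built-in" if d in builtin else "not built-in") for d in drivers}


# Usage on the running system:
#   import platform
#   path = f"/lib/modules/{platform.release()}/modules.builtin"
#   with open(path) as f:
#       print(linkage(f.read()))
```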

Issue 3: DMA bug and kernel trace for 6.12.2 after successful CSE authentication

Finally, there is another issue with kernel 6.12.2 when CONFIG_INTEL_MEI_GSC, CONFIG_INTEL_MEI_VSC_HW and CONFIG_INTEL_MEI_VSC are built as modules. While CSE authentication finally works, there is a kernel warning and trace:

intel-ipu6 0000:00:05.0: enabling device (0000 -> 0002)
intel-ipu6 0000:00:05.0: IPU6 in secure mode touch 0x80000000 mask 0x0
intel-ipu6 0000:00:05.0: FW version: 20230925
intel-ipu6 0000:00:05.0: Found supported sensor OVTI08F4:00
intel-ipu6 0000:00:05.0: Connected 1 cameras
intel-ipu6 0000:00:05.0: Sending BOOT_LOAD to CSE
intel-ipu6 0000:00:05.0: Sending AUTHENTICATE_RUN to CSE
intel-ipu6 0000:00:05.0: CSE authenticate_run done
intel-ipu6 0000:00:05.0: IPU6-v4[7d19] hardware version 6
------------[ cut here ]------------
WARNING: CPU: 5 PID: 564 at kernel/dma/mapping.c:597 dma_alloc_attrs+0x35/0x40
Modules linked in: intel_ipu6_isys(+) videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_sof_pci_intel_mtl snd_sof_intel_hda_generic snd_sof_pci snd_sof_xtensa_dsp iwlmvm(+) snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda snd_sof mac80211 snd_intel_dspcfg libarc4 snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec snd_sof_intel_hda_mlink snd_hda_ext_core btusb snd_hda_core btrtl btintel mei_gsc_proxy iwlwifi btbcm intel_ipu6 cfg80211 ipu_bridge xe drm_ttm_helper gpu_sched drm_suballoc_helper drm_gpuvm drm_exec i915 drm_buddy ttm intel_gtt drm_display_helper intel_vpu cec drm_shmem_helper
CPU: 5 UID: 0 PID: 564 Comm: (udev-worker) Not tainted 6.12.2-gentoo-r1-modularized #3
Hardware name: LENOVO 21KCCTO1WW/21KCCTO1WW, BIOS N3YET71W (1.36 ) 09/04/2024
RIP: 0010:dma_alloc_attrs+0x35/0x40
Code: 00 74 27 f7 c1 00 00 04 00 75 16 83 e1 f8 f6 87 dc 02 00 00 40 75 05 e9 89 12 00 00 e9 14 08 57 00 0f 0b 31 c0 c3 cc cc cc cc <0f> 0b eb d5 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffa9d682e17aa0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9e66d5db0028 RCX: 0000000000000cc0
RDX: ffffa9d682e17aa8 RSI: 0000000000000190 RDI: ffff9e66c9e3e000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff9e66d5db20a8 R11: ffffffffb4f29420 R12: ffff9e66d5db20c0
R13: 0000000000000014 R14: ffff9e66c9e3e000 R15: ffff9e66d5db0028
FS:  00007f5aa77d6840(0000) GS:ffff9e6e00140000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc61d163cc0 CR3: 00000001067fe003 CR4: 0000000000f70ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn.cold+0x90/0x9e
 ? dma_alloc_attrs+0x35/0x40
 ? report_bug+0xfa/0x140
 ? handle_bug+0x53/0x90
 ? exc_invalid_op+0x17/0x70
 ? asm_exc_invalid_op+0x1a/0x20
 ? dma_alloc_attrs+0x35/0x40
 alloc_fw_msg_bufs+0x4a/0x180 [intel_ipu6_isys]
 ? pm_qos_update_target+0xcc/0x190
 isys_probe+0x363/0x930 [intel_ipu6_isys]
 ? kernfs_add_one+0x13c/0x150
 ? __pfx_isys_probe+0x10/0x10 [intel_ipu6_isys]
 auxiliary_bus_probe+0x41/0x80
 ? driver_sysfs_add+0x52/0xb0
 really_probe+0xd1/0x270
 ? pm_runtime_barrier+0x4f/0x90
 ? __pfx___driver_attach+0x10/0x10
 __driver_probe_device+0x6e/0xe0
 driver_probe_device+0x1a/0xf0
 __driver_attach+0x83/0x180
 bus_for_each_dev+0x76/0xd0
 bus_add_driver+0xe3/0x1c0
 driver_register+0x6d/0xc0
 __auxiliary_driver_register+0x69/0xd0
 ? __pfx_isys_driver_init+0x10/0x10 [intel_ipu6_isys]
 do_one_initcall+0x46/0x1c0
 ? __kmalloc_cache_noprof+0x164/0x1f0
 do_init_module+0x5b/0x1f0
 init_module_from_file+0x81/0xc0
 idempotent_init_module+0x103/0x300
 __x64_sys_finit_module+0x57/0x90
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x71/0x79
RIP: 0033:0x7f5aa732b89d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5b a5 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffda0d3d4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 00005575b6dda570 RCX: 00007f5aa732b89d
RDX: 0000000000000000 RSI: 00007f5aa7163379 RDI: 000000000000003e
RBP: 0000000000000000 R08: 0000000000000000 R09: 00005575b6ce8a40
R10: 00007f5aa73f6ac0 R11: 0000000000000246 R12: 00007f5aa7163379
R13: 0000000000020000 R14: 00005575b6dc6bc0 R15: 00005575b6dbd6e0
 </TASK>
---[ end trace 0000000000000000 ]---

I haven't yet tested whether some other kernel options are still missing or whether this is a regression compared to 6.12.1. Hence, I am not sure whether this error is caused by additional missing dependencies or is an independent problem. Given that it seems to be somehow related to DMA, it might also be a result of the commits 6ac269ab and 11b0543e, which were recently back-ported into 6.12.2 to fix a DMA issue.
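For comparing traces between kernel versions, it may be enough to extract the warning site and the reliable frames that belong to a module. The following sketch parses the text format of the trace above; the regular expressions are written against that one sample, so treat them as assumptions rather than a general oops parser.

```python
import re

# "WARNING: CPU: 5 PID: 564 at kernel/dma/mapping.c:597 dma_alloc_attrs+0x35/0x40"
_WARNING = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) ([\w.]+)\+")
# " ? symbol+0x10/0x20 [module]" where the leading "?" and the module are optional.
_FRAME = re.compile(r"^\s*(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")


def warning_site(trace_text):
    """Return (source file, line number, function) of the first WARNING, or None."""
    m = _WARNING.search(trace_text)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), m.group(3)


def module_frames(trace_text):
    """Return (function, module) for frames not marked '?' that belong to a module."""
    frames = []
    for line in trace_text.splitlines():
        m = _FRAME.match(line)
        if m and not m.group(1) and m.group(3):
            frames.append((m.group(2), m.group(3)))
    return frames


# Usage (path is a placeholder):
#   with open("dmesg-6.12.2.txt") as f:
#       text = f.read()
#   print(warning_site(text))
#   print(module_frames(text))
```

For the trace above this points at `alloc_fw_msg_bufs` and `isys_probe` in `intel_ipu6_isys` as the callers leading into `dma_alloc_attrs`.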


3 participants