[Tracking] arm64 kola test failures #474
@pothos What's the path forward on those, then? Are we sunk unless we're testing those on actual hardware?
We can tweak the QEMU invocation, but otherwise it's like on amd64, where some tests have to be rerun, too.
Alright, I have updated the list to reflect this.
For completeness, here is the paste of failure messages from the 2955.0.0 release: The
Same problem for etcd: the quay.io/coreos/etcd:v3.3.25 image is not a multiarch image, and we need to update to quay.io/coreos/etcd:v3.5.0.
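A minimal sketch of how one could check whether an image reference is multiarch before using it in a test. It assumes the manifest has already been fetched and parsed into a dict (for example from the JSON printed by `docker manifest inspect`); the function names `supported_architectures` and `supports_arch` are hypothetical and not part of kola or mantle.

```python
def supported_architectures(manifest):
    """Return the set of architectures listed in a manifest list / image index.

    A single-architecture image manifest has no "manifests" array, so the
    result is an empty set in that case.
    """
    archs = set()
    for entry in manifest.get("manifests") or []:
        arch = (entry.get("platform") or {}).get("architecture")
        if arch:
            archs.add(arch)
    return archs


def supports_arch(manifest, arch="arm64"):
    """True if the manifest list explicitly advertises the given architecture."""
    return arch in supported_architectures(manifest)
```

A single-architecture manifest yields an empty set here, which is the situation described for the v3.3.25 etcd image; whether a given tag actually lists `arm64` still has to be checked against the registry.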
AFAIK SELinux things are not installed on
Yes, I started a PR for installing SELinux some time ago and it needs to be done again for the current
I also got a temporary verity test failure and filed a PR to make it more robust: flatcar/mantle#202
@pothos @wrl for Sometimes, Comparing the two
And comparing
// success
$ cat _kola_temp/qemu-2021-08-17-1538-10573/docker.userns/99f84f72-844c-4c59-8df9-be4b180e164c/journal.txt | grep -i torcx
...
Aug 17 15:40:26.839093 /usr/lib/systemd/system-generators/torcx-generator[755]: time="2021-08-17T13:40:26Z" level=info msg="store skipped" err="open /usr/share/oem/torcx/store: no such file or directory" path=/usr/share/oem/torcx/store
Aug 17 15:40:26.843044 /usr/lib/systemd/system-generators/torcx-generator[755]: time="2021-08-17T13:40:26Z" level=info msg="store skipped" err="open /var/lib/torcx/store/2942.0.0: no such file or directory" path=/var/lib/torcx/store/2942.0.0
Aug 17 15:40:26.846258 /usr/lib/systemd/system-generators/torcx-generator[755]: time="2021-08-17T13:40:26Z" level=info msg="store skipped" err="open /var/lib/torcx/store: no such file or directory" path=/var/lib/torcx/store
Aug 17 15:41:27.152038 /usr/lib/systemd/system-generators/torcx-generator[755]: time="2021-08-17T13:41:27Z" level=debug msg="image unpacked" image=docker path=/run/torcx/unpack/docker reference=com.coreos.cl
...
// failure
$ cat _kola_temp/qemu-latest/docker.userns/9fae9ad7-40ac-4c98-8ced-9f209d2c507d/journal.txt | grep -i torcx
...
Aug 17 16:06:53.561320 /usr/lib/systemd/system-generators/torcx-generator[756]: time="2021-08-17T14:06:53Z" level=info msg="store skipped" err="open /usr/share/oem/torcx/store: no such file or directory" path=/usr/share/oem/torcx/store
Aug 17 16:06:53.593780 /usr/lib/systemd/system-generators/torcx-generator[756]: time="2021-08-17T14:06:53Z" level=info msg="store skipped" err="open /var/lib/torcx/store/2942.0.0: no such file or directory" path=/var/lib/torcx/store/2942.0.0
Aug 17 16:06:53.595867 /usr/lib/systemd/system-generators/torcx-generator[756]: time="2021-08-17T14:06:53Z" level=info msg="store skipped" err="open /var/lib/torcx/store: no such file or directory" path=/var/lib/torcx/store
⬆️ on the success we have the
EDIT: by connecting to a machine with a failing Docker, we can see that the
The issue might be in the sealing part, then...
EDIT (bis): I suspect we have an error in the unpack / sealing part, but it is not caught or displayed in the logs. According to the systemd-generator doc, generators should not even rely on
I'll try to forward errors to
EDIT (ter): I compared the working docker:
failing docker:
and even the
FINAL EDIT: we increased the
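The journal comparison above (the successful run logs `msg="image unpacked"` from `torcx-generator`, the failing run does not) can be sketched as a small check over a `journal.txt` file. This is only an illustration based on the two excerpts quoted in this comment; the helper names `torcx_messages` and `torcx_unpacked` are hypothetical.

```python
import re

# Matches the msg="..." field of the key=value log lines shown in the excerpts.
_MSG_RE = re.compile(r'msg="([^"]*)"')


def torcx_messages(journal_text):
    """Return the msg fields of all torcx-generator lines, in log order."""
    messages = []
    for line in journal_text.splitlines():
        if "torcx-generator" not in line:
            continue
        match = _MSG_RE.search(line)
        if match:
            messages.append(match.group(1))
    return messages


def torcx_unpacked(journal_text):
    """True if torcx-generator reported at least one unpacked image."""
    return "image unpacked" in torcx_messages(journal_text)
```

Reading a file from `_kola_temp/.../journal.txt` and passing its contents to `torcx_unpacked` would flag runs where the unpack message never appears.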
I found one more case where the tests with verity would go wrong when not using QEMU: flatcar/mantle#218
I created a new PR to install the SELinux tools and enable SELinux on arm64: flatcar-archive/coreos-overlay#1245 (replacing flatcar-archive/coreos-overlay#135).
The tests now pass with SELinux enabled:
SELinux-related (see PR: flatcar-archive/coreos-overlay#1245):
arm64 architecture mantle#209)
arm64 architecture mantle#209)
Missing multiarch images (see PR: flatcar-archive/coreos-overlay#1179):
v2 support for various tests mantle#216)
v2 support for various tests mantle#216)
v2 support for various tests mantle#216)
v2 support for various tests mantle#216)
v2 support for various tests mantle#216)
arm64: fix kubeadm.* tests mantle#217)
Polkit related (see PR: flatcar-archive/coreos-overlay#1263):
kolet: Process exited with status 1
according to @pothos, these are related to "kernel soft lockup coming from slow QEMU emulation".
"waiting for UPDATE_STATUS_UPDATED_NEED_REBOOT: time limit exceeded"
"cluster failed starting machines: connect: connection refused"
Also ref #470 to normalise the test running environment. As it stands now, kola is happy to run its test suite without verifying that the host system is configured properly (and what constitutes "properly" is still somewhat hazy).