Never boots, saying rpool is in use from other system #2195
Comments
The problem is that the hostid for the pool in the initramfs does not match the hostid of the actual pool. Run |
I've rebuilt the initramfs numerous times. The hostid command from within the initramfs, the spl message from mere seconds after boot in dmesg, and the hostid command from the running system, all show the same hostid as well. zpool set cachefile= rpool, by itself, does nothing. If I then rebuild the initramfs, things progress a little further. However, it still doesn't boot, with this message:
followed by the same hints. This time, I have to run just exit to boot. One other note. Something is changing cachefile back to none on this pool. It does not stay at the empty string. |
@FransUrbo You might also be interested in this discussion. |
The command I provided should regenerate zpool.cache. Which distribution and initramfs generator are you using? It looks like the software is not able to handle verbatim import from the cachefile. In that case, making an empty cachefile would work. I believe that I solved this problem in Gentoo by modifying genkernel to autodetect verbatim import from the cachefile and skip that step. |
This is Debian with its default initramfs generator (hence the @FransUrbo cc). It was working fine with the 0.6.2 support; the 0.6.3 dailies have shown this breakage.
How could I go about debugging hostid issues? |
In the current released version, the initrd would try to mount it without using force. If that failed, it would then try a forced import. In my dailies, I've removed that (because it's considered, if not 'evil', then at least 'bad practice'). This unfortunately has the effect that more and more people are reporting import failures. It only affects people who have their root on ZFS.

The reason for this is that the pool isn't exported properly when the system shuts down. The new init scripts in the dailies do try to do this correctly (and that code is sound!). Unfortunately, since one is booting from the filesystem/pool, it cannot be exported (because it is in use by the very script that tries to export it!). Technically, this is not a problem, because the filesystem is mounted read-only a couple of moments earlier. It's only a problem for the next import (the pool will be reported as 'in use by another system', because it wasn't exported properly).

Unfortunately, I have no really good way of solving this :(. @behlendorf mentioned somewhere (a couple of years ago) that eventually the hostid 'stuff' will be removed (because it no longer serves a purpose, if I remember correctly). Then this might go away. Until then, adding the 'zfsforce=yes' option to the kernel command line will help. It is not a good and proper way, but it will work. And if you ONLY use your pool on one single computer with only one OS (as opposed to importing it on multiple computers with many different operating systems), then there won't be any problem.
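To confirm that the workaround above actually reached the kernel, one can check the command line the system booted with. This is a minimal sketch, not part of any ZFS package: the function name `has_zfsforce` is made up for illustration, and it only tests whether a `zfsforce=` option is present, not what value it carries.

```shell
# Hypothetical helper: succeed if the given kernel command line
# contains a zfsforce= option (any value, e.g. zfsforce=yes or zfsforce=1).
has_zfsforce() {
    # Pad with spaces so the option is matched as a whole word,
    # including when it is the first item on the line.
    case " $1 " in
        *" zfsforce="*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system, something like `has_zfsforce "$(cat /proc/cmdline)" && echo "zfsforce is set"` would report whether the option was passed at boot.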
Yes, I'd like to remove the existing hostid implementation in favor of proper multi-mount protection. That work is described in #745 and should resolve this issue, however we haven't scheduled anyone yet to do that work. |
Thanks everyone for your help. The way I see it, there are at least these three open questions:
@FransUrbo have you seen that third problem anywhere before? I can readily duplicate all of these. |
Are there any updates on this issue? This still seems to be relevant with release 0.6.3 and I only manage to boot from ZFS without errors (the same as reported by jgoerzen) if I build initramfs without zpool.cache (and zfsforce=1). It even gives the error when I write an explicit /etc/hostid and include that file also in initramfs. Like jgoerzen, I double checked that this hostid matches the pool's hostid and the hostid used by SPL during boot. |
@agijsberts For reasons like this, in 0.6.3 we've set the default hostid to 0, which disables the hostid check. What you're going to want to do is make sure the hostid for your system gets set to 0 on boot by removing your /etc/hostid file. Then force-import the pool and export it. At this point your pool should no longer contain a specific hostid and will no longer perform this check. You can verify that's the case by running If you need to run a failover configuration in the future, you'll need to explicitly create /etc/hostid files to enable this support. See openzfs/spl#224.
@behlendorf Thanks for your suggestions, they sound like the (default) setup I had originally. To be sure I removed /etc/hostid and rebuilt initramfs. After export/import the pools no longer have any hostid attached and SPL reports hostid=00000000. Unfortunately, I still get the following error (the same as the second one reported by jgoerzen):
At this point the pools are actually mounted and I can resume system boot simply by exiting the emergency shell (CTRL-D), so to my untrained eye it appears that it tries to import the pool twice. So far the only ways I have found to avoid this error are either (1) to build initramfs without zpool.cache or (2) to explicitly set /etc/hostid. It might be a user error somewhere, but I'm drawing a blank on what it could be.
I just did an install this morning using Linux Mint Debian ( Mate ) 64 bit rolling release with a ZFS (0.6.3) root and I'm experiencing this problem as well. I have tried removing /etc/hostid, adding zfsforce=1, and zpool set cachefile= as Ryao had suggested and nothing is working. Every boot I'm forced to type zpool import -f -N rpool ; exit to get it started. I tried zdb -l | grep hostid and I see nothing. I did notice that hostname was set to '(none)', could this be a problem? Anyone else have any other suggestions? |
@FuzzySunshine This refers to a different problem than discussed here, but it seems you try to boot from the ZFS root dataset: did you try to remove the trailing slash from the cmdline in grub.cfg? (see: zfsonlinux/grub#15). Also make sure to try and rebuild initramfs without zpool.cache. |
@agijsberts Thank you for replying :) I looked at the bug you linked and I don't think it applies because I did end up writing my own grub.cfg line " linux /vmlinuz-3.11-2-amd64 bootfs=rpool/ROOT/debian-live-1 boot=zfs ro ". I did try to delete zpool.cache and recreated initramfs (update-initramfs -u -k all ) with no luck. I did manage to figure out that when boot did bork, that if I force import and then export rpool and then reboot instead of exit that the next boot completes just fine without any interaction. Any suggestions? |
@FuzzySunshine You're right, in your case the cmdline issue does not apply. Make sure, though, to include zfsforce=1 in the cmdline; this is absolutely required (see FransUrbo's comment above). I'm not a ZFS developer, so unfortunately I can merely suggest the things that helped in my case. As a temporary solution, you can also try to write /etc/hostid explicitly. Then export and import your pools, double-check with zdb that it has been set (iirc zdb converts the bytes to decimal), and recreate initramfs. ZFS is moving away from this reliance on hostid, but at least this might help you right now (it worked in my case). If it doesn't, you might want to move the issue to the mailing list, where more people might see it.
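Because the hostid command prints hexadecimal while zdb (per the comment above) shows the value in decimal, comparing the two by eye is error-prone. A small sketch of the conversion follows; the function name `hostid_hex_to_dec` is hypothetical and the example value is made up, not taken from this thread.

```shell
# Hypothetical helper: convert a hexadecimal hostid, as printed by the
# hostid command, to decimal so it can be compared with a decimal value.
hostid_hex_to_dec() {
    # printf interprets a 0x-prefixed argument as hexadecimal.
    printf '%d\n' "0x$1"
}

hostid_hex_to_dec 007f0101   # prints 8323329
```

On a live system, `hostid_hex_to_dec "$(hostid)"` would give the number to look for in the pool label.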
Likely fixed by #2766 if Dracut is used. |
Make use of Dracut's ability to restore the initramfs on shutdown and pivot to it, allowing for a clean unmount and export of the ZFS root. No need to force-import on every reboot anymore. Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#2195 Issue openzfs#2476 Issue openzfs#2498 Issue openzfs#2556 Issue openzfs#2563 Issue openzfs#2575 Issue openzfs#2600 Issue openzfs#2755 Issue openzfs#2766
Closing, this was fixed in master. |
On every boot on this system, I get this message:
Running:
lets the system boot.
This issue was first reported on the mailing list at https://groups.google.com/a/zfsonlinux.org/d/topic/zfs-discuss/RggMKyj-64A/discussion but no resolution was reached. I am unsure if it is a zfs or zfs-pkg bug. I do not appear to have hostid issues. The disk is never in use by another system.