Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tidy up runsvdir symlinks during runit phase 1, so that the preparing… #52

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

datenwolf
Copy link

… execution of runsvchdir in phase 2 doesn't choke, if those symlinks are broken.


I just stumbled over my system not being able to fully enter runit phase 2, due to me messing around with the service directories and leaving behind broken current and default symlinks. This little patch adds extra cleanup that tidies up by always entering phase 2 with current → default and previous being removed.

… execution of runsvchdir in phase 2 doesn't choke, if those symlinks are broken
@ericonr
Copy link
Member

ericonr commented Aug 1, 2020

Is there a chance this sort of issue could happen in usual usage? More importantly, how does this work if the file system is read only?

@datenwolf
Copy link
Author

datenwolf commented Aug 3, 2020

I had a lengthy response typed out, in the browser, then wanted to test something with runit, which promptly killed my user session, and the stuff I wrote. Don't want to type it out again, so I'll keep it brief, feel free to ask for clarification…

Is there a chance this sort of issue could happen in usual usage?

Yes, if you're cleaning up old runlevels and are clumsy about it. Happens usually in already stressfull situations and you want to remove that potential footgun. Keep in mind that this kind of situation will also prevent the selection of the destination runlevel via the kernel command line. This is a problem, if that's your default path of recovering from boot time issues that happen after the kernel launched PID=1.

More importantly, how does this work if the file system is read only?

It will lead to a couple of error messages at boot and that's it. Zero negative impact. However it should be noted that runsvchdir itself can't operate if /etc/runit/runsvdir resides on a readonly filesystem (so having / readonly will also prevent kernel command line target runlevel selection!).

Strictly speaking not initializing /etc/runit/runsvdir to a well defined state upon boot makes graceful error recovery unneccesarily hard. This in fact also applies to the read-only situation. The actual solution would be to have runsvchdir create the symlinks directly in /run/runit/runsvdir dereferencing to the directories in /etc/runit/runsvdir. That would also in the same stroke deal with the broken/dangling current/previous symlink problem I seeked to address.

Maybe we can do some weird bind-mounting of symlinks. But the more I think about it, the more I'd rather see runsvchdir to be modified.

@sgn
Copy link
Member

sgn commented Aug 3, 2020

I don't have a strong opinion on this.

But, punish everyone with unlink(2) and link(2) on
every startup because some people messed up is not nice.

And this will break stateless /etc

@ahesford
Copy link
Member

ahesford commented Aug 3, 2020

This isn't the right fix. What you ought to do is

ln -sf "${runlevel}" /etc/runit/runsvdir/current

in 2 in place of the runsvchdir "${runlevel}" that is there now. We aren't really changing the runlevel in 2, we are setting it, so the migration that is imposed in runsvchdir is inappropriate. Also, runsvchdir verifies the existence of the target runlevel directory, but we've already done that in the cmdline loop in 2.

If you really care about a broken previous symlink, you could add rm -f /etc/runit/runsvdir/previous (there is no need to test for existence and remove, just force remove) right before the current symlink is made, but why bother? A broken previous symlink is irrelevant.

@datenwolf
Copy link
Author

But, punish everyone with unlink(2) and link(2)

How is that in any way punishing? Those syscalls are dirt cheap when done on symlinks.

And this will break stateless /etc

It doesn't take my patch for do this, it's already happening as part of core-services:

rm -f /etc/nologin /forcefsck /forcequotacheck /fastboot

@sgn
Copy link
Member

sgn commented Aug 3, 2020 via email

@sgn
Copy link
Member

sgn commented Aug 4, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants