Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ctrl-c is the wrong key to trigger cancellation #16

Open
jaywonchung opened this issue Apr 22, 2022 · 4 comments
Open

Ctrl-c is the wrong key to trigger cancellation #16

jaywonchung opened this issue Apr 22, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@jaywonchung
Copy link
Owner

Pressing ctrl-c when Pegasus is running will also send ctrl-c to the entire foreground process group, thus also sending SIGINT to the ssh processes.

Instead of waiting for ctrl-c with the tokio ctcl-c catcher, find another way to catch the user's intent to cancel.

Possible solutions:

  • Listen to stdin for something random. Like q<Enter>.
  • Create a pegasus stop command. This should make sure to identify the specific pegasus process that is running on the current working directory. This will probably require a pid file.
    • It's okay to assume that there is only one instance of pegasus per directory (if not, that means more than one pegasus processes are manipulating queue.yaml and consumed.yaml).
@jaywonchung jaywonchung added the enhancement New feature or request label Apr 22, 2022
@NobodyXu
Copy link
Contributor

NobodyXu commented Apr 22, 2022

I think the best way to fix this is to double fork and create a new session, or run pegasus as a daemon since many users have the habit of using ctrl+c

@jaywonchung
Copy link
Owner Author

@NobodyXu Thanks for the advice! I gave it some more thought. One possible solution that popped up is for the ssh command builder to include this:

pre_exec(|| unsafe { nix::libc::setsid() })

Single-forking would still be okay of ssh does not try to open("/dev/console") somewhere.

I think this will make the ssh processes ctrl-c-proof while still allowing Pegasus to be ergonomic (accepts ctrl-c). But probably this cannot be done without introducing changes in the openssh crate. WDYT?

@NobodyXu
Copy link
Contributor

@NobodyXu Thanks for the advice! I gave it some more thought. One possible solution that popped up is for the ssh command builder to include this:

pre_exec(|| unsafe { nix::libc::setsid() })

Single-forking would still be okay of ssh does not try to open("/dev/console") somewhere.

I think this will make the ssh processes ctrl-c-proof while still allowing Pegasus to be ergonomic (accepts ctrl-c). But probably this cannot be done without introducing changes in the openssh crate. WDYT?

Yes that requires modifications to openssh, however I don't think this is a great solution.

That would mean the ssh multiplex master won't exit and release the resource (including the TCP connection).

Next invocation of pegasus would create another ssh multiplex master and TCP connection.

Since pegasus is designed to run tasks on clusters, I think it is more reasonable to run it as a daemon that keep running tasks on the remote until you explicitly terminate it.

@jaywonchung
Copy link
Owner Author

Ah, I get your point. It's not easy to guarantee that all the ssh masters terminate when pegasus dies/terminates/is killed/panics.

Making Pegaus a daemon seems good. 👍 I think I'll implement a stop command for termination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants