shell: fix lost log messages during initialization #6578
LGTM, just that one typo
@@ -220,4 +220,18 @@ test_expect_success 'flux-shell: stdout/err from task.exec works' '
grep "^this is stderr" print.err &&
commit message typo "erorrs"
Thanks! Fixed and will set MWP.
Problem: The `struct flux_shell` object is allocated in main() on the stack without initialization of all fields to 0, but several shell components check for non-NULL members of this structure for safety. Ensure all fields of the struct are initialized to zero via a call to memset(3).
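The zero-initialization described in this commit can be sketched as follows. This is a minimal illustration, not the actual flux-core code: `struct demo_shell`, its fields, and both helper functions are hypothetical stand-ins. It also assumes a platform where all-zero bytes represent a NULL pointer, which holds on mainstream systems.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct flux_shell; the real struct differs. */
struct demo_shell {
    void *info;    /* set later during initialization */
    void *handle;  /* set later during initialization */
    int verbose;
};

/* Zero every field so that later "is this member set?" checks are
 * reliable even when the struct lives on the stack. */
void demo_shell_zero (struct demo_shell *shell)
{
    memset (shell, 0, sizeof (*shell));
}

/* Return 1 if every member is still in its cleared state, else 0. */
int demo_shell_is_clear (const struct demo_shell *shell)
{
    return shell->info == NULL
        && shell->handle == NULL
        && shell->verbose == 0;
}
```

Without the `memset` call, a stack-allocated `struct demo_shell` holds indeterminate values, so a non-NULL check on `info` could pass even though nothing was ever assigned.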
Problem: `flux_shell_raise(3)` fails to check for a non-NULL `shell->info` object, then proceeds to dereference it, causing a potential segfault. Add a check for non-NULL `shell->info` along with the other checks in `flux_shell_raise(3)`.
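The guard described in this commit follows a common pattern: validate every pointer before dereferencing it. The sketch below uses hypothetical names (`struct raise_shell`, `struct raise_info`, `demo_raise`) and does not reproduce the real `flux_shell_raise(3)` signature; the choice of `EINVAL` as the error code is likewise an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the flux-core definitions. */
struct raise_info {
    int jobid;
};

struct raise_shell {
    struct raise_info *info;  /* NULL until initialization completes */
};

/* Copy the job id out of shell->info, refusing to touch shell->info
 * when it has not been set yet. Returns 0 on success, -1 with errno
 * set to EINVAL on invalid arguments. */
int demo_raise (struct raise_shell *shell, const char *type, int *jobid_out)
{
    if (!shell || !shell->info || !type || !jobid_out) {
        errno = EINVAL;
        return -1;
    }
    *jobid_out = shell->info->jobid;
    return 0;
}
```

Calling `demo_raise` before `info` is populated then yields an error return instead of a segfault.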
Problem: The shell calls `exit` instead of `shell_die` during some early initialization errors, but this doesn't give the logging system a chance to detect a fatal error when the shell abruptly exits. Switch the `exit (1)` calls to `shell_die()`. Fixes flux-framework#6576
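To illustrate why a bare `exit (1)` can lose messages, the sketch below routes a fatal error through the logger before the process terminates. All names here (`struct demo_log`, `demo_log_fatal`, `demo_die`) are hypothetical, and the real `shell_die()` may work differently; the point is only that the logging layer sees the fatal error first.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical logger state; not the flux-core shell logger. */
struct demo_log {
    int fatal_seen;   /* set once a fatal error has been recorded */
    char last[128];   /* most recent formatted message */
};

/* Record a fatal message so the logger can report it before exit. */
void demo_log_fatal (struct demo_log *log, const char *fmt, ...)
{
    va_list ap;

    va_start (ap, fmt);
    vsnprintf (log->last, sizeof (log->last), fmt, ap);
    va_end (ap);
    log->fatal_seen = 1;
}

/* Unlike a bare exit (code), give the logger a chance to observe and
 * emit the fatal error before the process terminates. */
void demo_die (struct demo_log *log, int code, const char *msg)
{
    demo_log_fatal (log, "FATAL: %s", msg);
    fprintf (stderr, "%s\n", log->last);
    exit (code);
}
```

A call site that previously did `exit (1)` would instead call something like `demo_die (&log, 1, "failed to parse jobspec")`, so the message is recorded and written before termination.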
Problem: No test in the testsuite ensures that early job shell errors, such as a failure to parse jobspec, are displayed in `flux job attach` output. Add a test to t2608-job-shell-log.t that submits a jobspec with version=2 and ensures the expected error message is emitted.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #6578 +/- ##
==========================================
+ Coverage 79.44% 79.45% +0.01%
==========================================
Files 531 531
Lines 88311 88312 +1
==========================================
+ Hits 70157 70168 +11
+ Misses 18154 18144 -10
This PR (hopefully) fixes the issue reported in #6576. Early shell errors are lost because, for an unknown reason, the shell was calling `exit (1);` instead of `shell_die()`.
This fixes that issue and adds a test.