child_process.execFile may resolve with incomplete output #56430
Comments
It feels like your unspoken assumption here is:
That assumption is wrong though: your child processes sleep for at least 0.9 seconds but they can definitely sleep longer; and at that point there's intrinsically a race between waking up and getting killed.
Yes, node stops reading stdio when it receives the "child exited" signal from the operating system. What would you have it do instead? Sit around waiting for output that may never come? Even though the child process exited, its children may still be alive - and keeping the stdio alive. In the limit, the stdout and stderr streams may never see EOF.
Unless you have a specific suggestion how to improve execFile(), I think the conclusion here should be that there's nothing fundamentally wrong with it; it's working as designed. Maybe it's the wrong tool for the job for you but that's why spawn() exists.
|
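To make the distinction between "the process exited" and "the stdio streams reached EOF" concrete, here is a minimal sketch using spawn(). It uses the current node binary as the child, and the helper name runAndTrace is hypothetical. For a ChildProcess, the 'exit' event fires when the process ends, while 'close' fires only after the process has ended and its stdio streams have closed.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: runs a small node script as a child process and
// records the order in which 'exit' and 'close' are emitted.
function runAndTrace(script) {
  return new Promise((resolve, reject) => {
    const events = [];
    let stdout = '';
    const child = spawn(process.execPath, ['-e', script]);

    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { stdout += chunk; });

    child.on('error', reject);
    // 'exit': the process has ended, but the pipes may still hold unread data.
    child.on('exit', () => events.push('exit'));
    // 'close': the process has ended and all stdio streams have closed.
    child.on('close', (code, signal) => {
      events.push('close');
      resolve({ events, stdout, code, signal });
    });
  });
}
```

Output collected up to 'close' is complete as far as the pipe is concerned; output inspected at 'exit' may not be. If a grandchild keeps the pipe open, 'close' is delayed until that grandchild releases it.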
If the child process is killed by the signal, we will see the reject path instead.
This is not correct, each stream increases
This is what other parts of node do,
In my opinion, node should either
|
Well, of course. spawn() gives you streams that you can close (or not) at your leisure, exec() and execFile() give you buffers. Apples and oranges.
I had a look at child_process.md just now and I don't think it's particularly unclear but if you think it can be improved, go ahead and send a pull request.
Backwards incompatible change so DOA, and since you're the first one AFAIK asking for this, and node doesn't add features until there's broad demand, I'd say a new option is unlikely to get accepted.
It does. It rejects the promise and has
Maybe your confusion is caused by child processes catching and handling SIGTERM? Try sending SIGKILL instead.
|
I'm a bit confused, do you mean that "timeout triggers" is the same as "child process killed by the signal"? The promise may resolve even if the timeout is triggered, that's why the output may be incomplete. |
All
SIGTERM usually terminates processes but can be caught or ignored. SIGKILL is special in that it cannot be caught and always terminates the recipient immediately. But keep in mind the time window between "timeout expires" and "process receives signal" wherein the target process may choose to exit of its own accord.
|
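The difference between the two signals can be observed with a small experiment. The sketch below assumes a POSIX system (signal handling differs on Windows) and uses a node child that installs a SIGTERM handler; the names CHILD_SCRIPT and compareSignals and the 200 ms grace period are hypothetical choices for illustration.

```javascript
const { spawn } = require('node:child_process');

// Child script: ignores SIGTERM, keeps running, and announces readiness
// only after the handler has been installed.
const CHILD_SCRIPT = `
  process.on('SIGTERM', () => {});
  setInterval(() => {}, 1000);
  console.log('ready');
`;

// Hypothetical helper (POSIX only): sends SIGTERM, checks whether the child
// is still running after graceMs, then sends SIGKILL.
function compareSignals(graceMs = 200) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', CHILD_SCRIPT], {
      stdio: ['ignore', 'pipe', 'ignore'],
    });
    let survivedSigterm = false;
    let killedFlagAfterSigterm = false;

    child.on('error', reject);
    child.stdout.once('data', () => {
      child.kill('SIGTERM');
      // child.killed only records that a signal was sent successfully.
      killedFlagAfterSigterm = child.killed;
      setTimeout(() => {
        survivedSigterm = child.exitCode === null && child.signalCode === null;
        child.kill('SIGKILL'); // cannot be caught or ignored
      }, graceMs);
    });
    child.on('exit', (code, signal) => {
      resolve({ survivedSigterm, killedFlagAfterSigterm, signal });
    });
  });
}
```

Note that child.killed becomes true as soon as a signal has been delivered successfully, even when the child handles the signal and keeps running, so it is not evidence that the process was terminated by that signal.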
This is not related to the provided example as python does not catch SIGTERM by default.
I guess what you mean is:
, which is similar to the first scenario described in the report. In this case, promise resolves with
In the second scenario, promise resolves with
|
@bnoordhuis I believe the issue lies in the following behavior:
Under certain race conditions, the promise can be resolved with
Why I Think It's a Bug
A user of
Both approaches produce false-negative results. From the user's perspective, there’s no reliable way to determine whether the result is "correct," unless the implementation properly rejects the promise in such cases.
Simplified Reproducible Example:
// Example usage: `node example.js 1000`
import cp from 'child_process';
import util from 'util';
const count = Number(process.argv[2]);
const execFile = util.promisify(cp.execFile);
await Promise.allSettled(
Array.from({ length: count }, async () => {
const p = execFile(
'python3',
['-c', 'import time; time.sleep(0.9); print(\'Hello, World!\')'],
{ timeout: 1000, killSignal: 'SIGKILL' },
);
const { stdout } = await p;
if (!stdout && !p.child.killed) {
console.log(
'OH NO! The promise resolved, child seems not killed, but stdout is empty!'
);
}
}),
);
This example demonstrates how the promise may resolve successfully while leaving stdout incomplete, despite the process not being marked as killed.
Would you mind taking a closer look? Thank you!
|
The issue also occurs with the non-promisified version of execFile.
Example:
// Example usage: `node example.js 1000`
import { execFile } from 'child_process';
const count = Number(process.argv[2]);
for (let i = 0; i < count; i++) {
const child = execFile(
'python3',
['-c', 'import time; time.sleep(0.9); print(\'Hello, World!\')'],
{ timeout: 1000, killSignal: 'SIGKILL' },
(err, stdout, stderr) => {
if (err === null && child.killed === false && stdout === '') {
console.log('OH NO! No error, child not killed, but stdout is empty!');
}
},
);
}
|
Version
v20.18.0
Platform
Subsystem
child_process
What steps will reproduce the bug?
script.js:
In the command line, run:
How often does it reproduce? Is there a required condition?
The bug is always reproducible on a 32-core computer with low background CPU usage.
What is the expected behavior? Why is that the expected behavior?
Assuming no promise is rejected, nothing should be logged to the terminal, and the main program should exit with code 0, indicating all subprocesses resolved successfully with complete stdout output.
What do you see instead?
Many subprocesses resolve with an empty stdout, and their child.killed attributes are logged in the terminal. The logged output consists of multiple consecutive true values, followed by multiple consecutive false values.
Additional information
Related Issue
The root cause behind the bug is the same as in https://github.com/orgs/nodejs/discussions/47062. That discussion contained several mistakes, which led to a different conclusion.
Root Cause
The internal kill function of execFile destroys the stdout and stderr streams, leading to a race condition when the following requirement is met:
When this occurs, the race condition manifests as one of two scenarios:
1. The timeout triggers before child._handle.onexit completes:
   - kill function destroys the IO streams.
   - child._handle.onexit triggers with exitCode 0.
   - stdout, and child.killed is true since it is valid to send a signal to a zombie process.
2. child._handle.onexit completes before the timeout triggers:
   - child._handle.onexit triggers with exitCode 0, but the close event is not emitted since streams are still open.
   - kill function destroys the IO streams, the child.kill function is effectively a no-op because child._handle is cleared.
   - stdout, and child.killed is false since child.kill function did nothing.
Why I think this is a bug
execFile is different from spawn: the latter does not destroy the output streams, which means that it is possible to have a valid implementation that resolves the promise correctly. (Actually, this is our current workaround.)
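For readers who want to see what a spawn-based approach could look like, here is a minimal sketch. It is not the reporter's actual workaround, which is not shown in this thread; the helper name execFileStrict and its option handling are hypothetical. The idea is to treat the timeout as authoritative: once the timer fires, the promise always rejects, so a resolved promise implies that the child exited with code 0 and its streams were read to EOF.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: resolves only if the child exits with code 0 and the
// timeout never fired; rejects in every other case.
function execFileStrict(file, args = [], { timeout = 0, killSignal = 'SIGTERM' } = {}) {
  return new Promise((resolve, reject) => {
    const child = spawn(file, args, { stdio: ['ignore', 'pipe', 'pipe'] });
    let stdout = '';
    let stderr = '';
    let timedOut = false;
    let timer = null;

    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });

    if (timeout > 0) {
      timer = setTimeout(() => {
        // Record the timeout before doing anything else, so the 'close'
        // handler can never mistake this run for a clean one.
        timedOut = true;
        child.kill(killSignal);
        // Stop waiting for output; grandchildren may hold the pipes open.
        child.stdout.destroy();
        child.stderr.destroy();
      }, timeout);
    }

    child.on('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });

    child.on('close', (code, signal) => {
      clearTimeout(timer);
      if (timedOut || code !== 0) {
        const err = new Error(
          timedOut
            ? `timed out after ${timeout} ms`
            : `exited with code ${code}, signal ${signal}`
        );
        Object.assign(err, { code, signal, timedOut, stdout, stderr });
        reject(err);
      } else {
        resolve({ stdout, stderr });
      }
    });
  });
}
```

The trade-off is that a child which finishes its work just as the timer fires is reported as a failure even if its output happened to be complete. Like execFile, this sketch can still wait indefinitely if the child ignores killSignal.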