Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: fix parseEnv handling of invalid lines #56778

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
105 changes: 70 additions & 35 deletions src/node_dotenv.cc
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,11 @@ Local<Object> Dotenv::ToObject(Environment* env) const {
return result;
}

// Removes leading and trailing spaces from a string_view
// Returns an empty string_view if the input is empty
// Example:
// trim_spaces(" hello ") -> "hello"
// trim_spaces("") -> ""
std::string_view trim_spaces(std::string_view input) {
if (input.empty()) return "";
if (input.front() == ' ') {
Expand All @@ -130,50 +135,81 @@ void Dotenv::ParseContent(const std::string_view input) {

while (!content.empty()) {
// Skip empty lines and comments
// Example:
// # This is a comment
//
// KEY=value
if (content.front() == '\n' || content.front() == '#') {
auto newline = content.find('\n');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If content.front() is \n this will always return 0 (the first character). I wonder why we did it like this...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simply replace line 144 with if (content.front() == '\n') and remove the content.find('\n') check I guess...

if (newline != std::string_view::npos) {
content.remove_prefix(newline + 1);
// Trim spaces after skipping comments or empty lines to handle
// cases where there might be trailing whitespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example to this statement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure but:

Suggested change
// cases where there might be trailing whitespace
// cases where there might be trailing whitespace
// Example: " # comment\n KEY=value"

content = trim_spaces(content);
continue;
} else {
// If no newline is found, we've reached the end of content
AugustinMauroy marked this conversation as resolved.
Show resolved Hide resolved
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here. Explain it as much as you can. Possibly with examples so it is more readable/understandable to reviewers/contributors.

}
}

// If there is no equal character, then ignore everything
auto equal = content.find('=');
if (equal == std::string_view::npos) {
// Find the next equals sign or newline in a single pass
// This optimizes the search by avoiding multiple iterations
auto pos = content.find_first_of("=\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change the variable name to be something more readable?

Suggested change
auto pos = content.find_first_of("=\n");
auto equal_or_newline = content.find_first_of("=\n");


// If we found nothing or found a newline before equals, the line is invalid
if (pos == std::string_view::npos || content[pos] == '\n') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (pos == std::string_view::npos || content[pos] == '\n') {
if (pos == std::string_view::npos || content.at(pos) == '\n') {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is diff bettween [xxx] and .at(xxx) ?

if (pos != std::string_view::npos) {
content.remove_prefix(pos + 1);
content = trim_spaces(content);
continue;
}
break;
}

key = content.substr(0, equal);
content.remove_prefix(equal + 1);
// We found an equals sign, extract the key
key = content.substr(0, pos);
content.remove_prefix(pos + 1);
key = trim_spaces(key);
content = trim_spaces(content);

// Skip lines with empty keys after trimming spaces
// Examples of invalid keys that would be skipped:
// =value
// " "=value
if (key.empty()) {
auto newline = content.find('\n');
if (newline != std::string_view::npos) {
content.remove_prefix(newline + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if newline is the last character, this will literally clear the content and we don't need to call trim_spaces afterwards.

content = trim_spaces(content);
continue;
}
break;
}

// Remove export prefix from key
// Remove export prefix from key and ensure proper spacing
// Example: export FOO=bar -> FOO=bar
if (key.starts_with("export ")) {
key.remove_prefix(7);
// Trim spaces after removing export prefix to handle cases like:
// export FOO=bar
key = trim_spaces(key);
AugustinMauroy marked this conversation as resolved.
Show resolved Hide resolved
}

// SAFETY: Content is guaranteed to have at least one character
// If content is empty after the equals sign, store empty value and break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is unneccessary. It explains the code, not how it got end up here. I recommend this article: https://swimm.io/learn/code-collaboration/comments-in-code-best-practices-and-mistakes-to-avoid

Specifically:

Explain the reasoning behind the code, not just what the code is doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it

if (content.empty()) {
// In case the last line is a single key without value
// Example: KEY= (without a newline at the EOF)
store_.insert_or_assign(std::string(key), "");
break;
}

// Expand new line if \n it's inside double quotes
// Example: EXPAND_NEWLINES = 'expand\nnew\nlines'
// Handle different types of value formats (quoted, multi-line, etc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert this code change. The previous comment was better for understanding where we are at the parser.

if (content.front() == '"') {
auto closing_quote = content.find(content.front(), 1);
if (closing_quote != std::string_view::npos) {
value = content.substr(1, closing_quote - 1);
std::string multi_line_value = std::string(value);

// Replace \n with actual newlines in double-quoted strings
size_t pos = 0;
while ((pos = multi_line_value.find("\\n", pos)) !=
std::string_view::npos) {
Expand All @@ -190,58 +226,57 @@ void Dotenv::ParseContent(const std::string_view input) {
}
}

// Check if the value is wrapped in quotes, single quotes or backticks
if ((content.front() == '\'' || content.front() == '"' ||
content.front() == '`')) {
// Handle quoted values (single quotes, double quotes, backticks)
if (content.front() == '\'' || content.front() == '"' ||
content.front() == '`') {
auto closing_quote = content.find(content.front(), 1);

// Check if the closing quote is not found
// Example: KEY="value
if (closing_quote == std::string_view::npos) {
// Check if newline exist. If it does, take the entire line as the value
// Example: KEY="value\nKEY2=value2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert comments for the changes that are unrelated to your code? This change for example doesn't make things better.

// The value pair should be `"value`
// No closing quote - take until newline
auto newline = content.find('\n');
if (newline != std::string_view::npos) {
value = content.substr(0, newline);
store_.insert_or_assign(std::string(key), value);
content.remove_prefix(newline);
} else {
// No newline - take rest of content
value = content;
store_.insert_or_assign(std::string(key), value);
break;
}
} else {
// Example: KEY="value"
// Found closing quote - take content between quotes
value = content.substr(1, closing_quote - 1);
store_.insert_or_assign(std::string(key), value);
// Select the first newline after the closing quotation mark
// since there could be newline characters inside the value.
auto newline = content.find('\n', closing_quote + 1);
if (newline != std::string_view::npos) {
content.remove_prefix(newline);
} else {
break;
}
}
} else {
// Regular key value pair.
// Example: `KEY=this is value`
// Handle unquoted values
auto newline = content.find('\n');

if (newline != std::string_view::npos) {
value = content.substr(0, newline);
auto hash_character = value.find('#');
// Check if there is a comment in the line
// Example: KEY=value # comment
// The value pair should be `value`
if (hash_character != std::string_view::npos) {
value = content.substr(0, hash_character);
value = value.substr(0, hash_character);
}
value = trim_spaces(value);
store_.insert_or_assign(std::string(key), std::string(value));
content.remove_prefix(newline);
} else {
// In case the last line is a single key/value pair
// Example: KEY=VALUE (without a newline at the EOF)
value = content.substr(0);
// Last line without newline
value = content;
value = trim_spaces(value);
store_.insert_or_assign(std::string(key), std::string(value));
break;
}

value = trim_spaces(value);
store_.insert_or_assign(std::string(key), value);
}

content = trim_spaces(content);
}
}

Expand Down
8 changes: 8 additions & 0 deletions test/fixtures/dotenv/invalid-syntax.env
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
foo

bar
baz=whatever
VALID_AFTER_INVALID=test
multiple_invalid
lines_without_equals
ANOTHER_VALID=value
25 changes: 25 additions & 0 deletions test/parallel/test-dotenv-edge-cases.js
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
'use strict';

Check failure on line 1 in test/parallel/test-dotenv-edge-cases.js

View workflow job for this annotation

GitHub Actions / test-linux

--- stdout --- Test failure: 'should handle invalid syntax in .env file' Location: test/parallel/test-dotenv-edge-cases.js:186:3 AssertionError [ERR_ASSERTION]: Expected values to be strictly equal: + actual - expected + '[eval]:3\n' + + ' const input = fs.readFileSync(\'/home/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + + " x Expected ',', got 'åß'\n" + + ' ,-[3:1]\n' + + " 1 | const { parseEnv } = require('node:util');\n" + + ' 2 | \n' + + ' 3 | const input = fs.readFileSync(\'/home/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' : ^^\n' + + ' 4 | const result = parseEnv(input);\n' + + ' 5 | assert.strictEqual(Object.keys(result).length, 3);\n' + + " 6 | assert.strictEqual(result.baz, 'whatever');\n" + + ' `----\n' + + '\n' + + '\n' + + 'SyntaxError: missing ) after argument list\n' + + ' at makeContextifyScript (node:internal/vm:185:14)\n' + + ' at compileScript (node:internal/process/execution:386:10)\n' + + ' at evalTypeScript (node:internal/process/execution:255:22)\n' + + ' at node:internal/main/eval_string:71:3\n' + + '\n' + + 'Node.js v24.0.0-pre\n' - '' at TestContext.<anonymous> (/home/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js:207:12) at process.processTicksAndRejections (node:internal/process/task_queues:105:5) at async Test.run (node:internal/test_runner/test:981:9) at async Suite.processPendingSubtests (node:internal/test_runner/test:678:7) { generatedMessage: true, code: 'ERR_ASSERTION', actual: '[eval]:3\n' + ' const input = fs.readFileSync(\'/home/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + " x Expected ',', got 'åß'\n" + ' ,-[3:1]\n' + " 1 | const { parseEnv } = require('node:util');\n" + ' 2 | \n' + ' 3 | const input = fs.readFileSync(\'/home/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' : ^^\n' + ' 4 | const result = parseEnv(input);\n' + '...', expected: '', operator: 'strictEqual' } Command: out/Release/node --test-reporter=./test/common/test-error-reporter.js --test-reporter-destination=stdout "/home/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js"

Check failure on line 1 in test/parallel/test-dotenv-edge-cases.js

View workflow job for this annotation

GitHub Actions / test-macOS (macos-13)

--- stdout --- Test failure: 'should handle invalid syntax in .env file' Location: test/parallel/test-dotenv-edge-cases.js:186:3 AssertionError [ERR_ASSERTION]: Expected values to be strictly equal: + actual - expected + '[eval]:3\n' + + ' const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + + " x Expected ',', got 'åß'\n" + + ' ,-[3:1]\n' + + " 1 | const { parseEnv } = require('node:util');\n" + + ' 2 | \n' + + ' 3 | const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' : ^^\n' + + ' 4 | const result = parseEnv(input);\n' + + ' 5 | assert.strictEqual(Object.keys(result).length, 3);\n' + + " 6 | assert.strictEqual(result.baz, 'whatever');\n" + + ' `----\n' + + '\n' + + '\n' + + 'SyntaxError: missing ) after argument list\n' + + ' at makeContextifyScript (node:internal/vm:185:14)\n' + + ' at compileScript (node:internal/process/execution:386:10)\n' + + ' at evalTypeScript (node:internal/process/execution:255:22)\n' + + ' at node:internal/main/eval_string:71:3\n' + + '\n' + + 'Node.js v24.0.0-pre\n' - '' at TestContext.<anonymous> (/Users/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js:207:12) at async Test.run (node:internal/test_runner/test:981:9) at async Suite.processPendingSubtests (node:internal/test_runner/test:678:7) { generatedMessage: true, code: 'ERR_ASSERTION', actual: '[eval]:3\n' + ' const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + " x Expected ',', got 'åß'\n" + ' ,-[3:1]\n' + " 1 | const { parseEnv } = require('node:util');\n" + ' 2 | \n' + ' 3 | const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' : ^^\n' + ' 4 | const result = parseEnv(input);\n' + '...', expected: '', operator: 'strictEqual' } Command: out/Release/node --test-reporter=./test/common/test-error-reporter.js --test-reporter-destination=stdout "/Users/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js"

Check failure on line 1 in test/parallel/test-dotenv-edge-cases.js

View workflow job for this annotation

GitHub Actions / test-macOS (macos-14)

--- stdout --- Test failure: 'should handle invalid syntax in .env file' Location: test/parallel/test-dotenv-edge-cases.js:186:3 AssertionError [ERR_ASSERTION]: Expected values to be strictly equal: + actual - expected + '[eval]:3\n' + + ' const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + + " x Expected ',', got 'åß'\n" + + ' ,-[3:1]\n' + + " 1 | const { parseEnv } = require('node:util');\n" + + ' 2 | \n' + + ' 3 | const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + + ' : ^^\n' + + ' 4 | const result = parseEnv(input);\n' + + ' 5 | assert.strictEqual(Object.keys(result).length, 3);\n' + + " 6 | assert.strictEqual(result.baz, 'whatever');\n" + + ' `----\n' + + '\n' + + '\n' + + 'SyntaxError: missing ) after argument list\n' + + ' at makeContextifyScript (node:internal/vm:185:14)\n' + + ' at compileScript (node:internal/process/execution:386:10)\n' + + ' at evalTypeScript (node:internal/process/execution:255:22)\n' + + ' at node:internal/main/eval_string:71:3\n' + + '\n' + + 'Node.js v24.0.0-pre\n' - '' at TestContext.<anonymous> (/Users/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js:207:12) at async Test.run (node:internal/test_runner/test:981:9) at async Suite.processPendingSubtests (node:internal/test_runner/test:678:7) { generatedMessage: true, code: 'ERR_ASSERTION', actual: '[eval]:3\n' + ' const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n' + " x Expected ',', got 'åß'\n" + ' ,-[3:1]\n' + " 1 | const { parseEnv } = require('node:util');\n" + ' 2 | \n' + ' 3 | const input = fs.readFileSync(\'/Users/runner/work/node/node/dir%20with $unusual"chars?\'åß∂ƒ©∆¬…`/test/fixtures/dotenv/invalid-syntax.env\', \'utf8\');\n' + ' : ^^\n' + ' 4 | const result = parseEnv(input);\n' + '...', expected: '', operator: 'strictEqual' } Command: out/Release/node --test-reporter=./test/common/test-error-reporter.js --test-reporter-destination=stdout "/Users/runner/work/node/node/dir%20with $unusual"chars?'åß∂ƒ©∆¬…`/test/parallel/test-dotenv-edge-cases.js"

const common = require('../common');
const assert = require('node:assert');
Expand Down Expand Up @@ -182,4 +182,29 @@
assert.strictEqual(child.code, 9);
assert.match(child.stderr, /bad option: --env-file-ABCD/);
});

it('should handle invalid syntax in .env file', async () => {
const invalidEnvFilePath = path.resolve(__dirname, '../fixtures/dotenv/invalid-syntax.env');
const code = `
const { parseEnv } = require('node:util');

const input = fs.readFileSync('${invalidEnvFilePath}', 'utf8');
const result = parseEnv(input);
assert.strictEqual(Object.keys(result).length, 3);
assert.strictEqual(result.baz, 'whatever');
assert.strictEqual(result.VALID_AFTER_INVALID, 'test');
assert.strictEqual(result.ANOTHER_VALID, 'value');
assert.strictEqual(result.foo, undefined);
assert.strictEqual(result.bar, undefined);
assert.strictEqual(result.multiple_invalid, undefined);
assert.strictEqual(result.lines_without_equals, undefined);
`.trim();
const child = await common.spawnPromisified(
process.execPath,
[ '--eval', code ],
{ cwd: __dirname },
);
assert.strictEqual(child.stderr, '');
assert.strictEqual(child.code, 0);
});
});
Loading