-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow for multiple --loader flags #18914
Conversation
/cc @nodejs/modules |
Allowing multiple loaders is good thing. |
properties to call the `resolve` and `dynamicInstantiate` hooks of the parent | ||
loader. The default loader has a `resolve` hook and `null` for the value | ||
`dynamicInstantiate`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👆 By default loader do you mean the last loader (far right)? Or something internal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
he's referring to the default internal loader that is used if you don't hook anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the default dynamic instantiate be a function that just throws "no instantiate hook"? Then each caller can assume chaining, with the fallthrough being an error naturally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it doesn't impact the user exposed code and not something the user can observe (is that right?) would that make more sense as a side note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to the user its just a parent
with { resolve: [Function], dynamicInstantiate: null }
, they wouldn't (and probably shouldn't) have any knowledge of where in the chain they are
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically, the user should be able to chain the first loader identically to any other loader, so they should specifically call the dynamicInstantiate
if creating their own to ensure chaining. Thus the first one as throwing is needed to maintain this equivalence (although the loader needn't necessarily need to know that it is throwing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the dynamicInstantiate
to be a throwing function.
can be passed multiple times to compose loaders like | ||
`--loader ./loader-coverage.mjs --loader ./loader-mocking.mjs`. The last loader | ||
will be used for module loading and must explicitly call to the parent loader | ||
in order to provide compose behavior. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👆 aren't all loaders used for module loading? Is there a reason to call out the last one as special. Does that mean order is significant? If it's just about calling the parent
there may be a way to word that better. Something like how acorn plugins allow calling a parent in case there are other plugins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order is significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just remove "will be used for module loading and"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain a bit more on the significance of the last loader (far right)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a chaining perspective, the last loader is responsible for maintaining the chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't any loader not calling their parent effect the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a loader doesn't call their parent, then all preceding loaders get ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek i've removed that phrase.
doc/api/esm.md
Outdated
@@ -106,11 +106,20 @@ fs.readFile('./foo.txt', (err, body) => { | |||
<!-- type=misc --> | |||
|
|||
To customize the default module resolution, loader hooks can optionally be | |||
provided via a `--loader ./loader-name.mjs` argument to Node. | |||
provided via a `--loader ./loader-name.mjs` argument to Node. This argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit: Node -> Node.js
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
doc/api/esm.md
Outdated
|
||
When hooks are used they only apply to ES module loading and not to any | ||
CommonJS modules loaded. | ||
|
||
All loaders are created by invoking the default export of their module as a | ||
function. The parameters given to the the function is a single object with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the the -> the
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Multiple loaders" is super important 🎉
provided via a `--loader ./loader-name.mjs` argument to Node. This argument | ||
can be passed multiple times to compose loaders like | ||
`--loader ./loader-coverage.mjs --loader ./loader-mocking.mjs`. The last loader | ||
must explicitly call to the parent loader in order to provide compose behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that if any loader forgets to call to the previous loader, it will break the chain?
If so, that seems like a pit of failure; could we make this a pit of success by forcing loaders that want to break the chain to explicitly do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps if a hook returns null
we automatically walk up the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually like the current design - it pretty much mimics the behavior of middleware in koa
, which is kinda similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek == null
, probably :-)
@weswigham the difference I see here is that if a middleware breaks the contract, your entire request likely fails, because nothing further proceeds; however, depending on what the loaders do, node might silently do the wrong thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely not just check for undefined. We should require full responsibility to handle all cases and avoid polymorphic return types. We could add something more complicated to model having defaults, but I'm more curious about a code example of this idea of the current design being a pit of failure.
Since the parameters are not sufficient to produce a valid return value, loaders must do something to return a proper value. Is there a real world use case where you are not breaking the chain but are also not fundamentally always calling the parent to do useful things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw I agree that all of the use cases where "being able to explicitly call the parent loader" are valuable, including "not calling the parent loader" - however, I think it's important that the default, when no explicit choice is made, should be "call the parent loader".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ljharb the isn't really a way to explicitly say you are not choosing your return value. We can only make well known return values, which I would want to be the same type as the other return values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ljharb to clarify. I am not really seeing the case where you want to default to the parent, but do not want to take any action using the result of the parent as the common case. Most loaders want to redirect behaviors or mutate what would be loaded. The only way to know what would be loaded is to query the parent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however, I think it's important that the default, when no explicit choice is made, should be "call the parent loader".
I'm not sure that's correct - If a given loader wasn't made with a loader pipeline in mind (ie, never explicitly calls the parent), how am I even sure the loader is capable of yielding output another loader can usefully use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like return null
is also probably more cryptic than return parentResolve()
.
|
||
When hooks are used they only apply to ES module loading and not to any | ||
CommonJS modules loaded. | ||
|
||
All loaders are created by invoking the default export of their module as a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ESM, default
, for CJS, module.exports
? (since a loader could be either)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CJS has the well defined interop steps, so yes module.exports
for CJS / .node / .json / ...
return { | ||
url: new URL(specifier, parentModuleURL).href, | ||
format: 'esm' | ||
async resolve(specifier, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why resolve
? If this is always meant to be an async function, should it just be called then
, so that Promise.resolve
-wrapping it always does what's expected?
Alternatively, should this just return a function directly? Why the wrapping object?
(also, if it's not always meant to be an async function, why async
here? Is the return value always wrapped in Promise.resolve
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its called resolve because it resolves the location of a module request. async so you can do whatever magic you need to do e.g. reading files (package.json for instance) without blocking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - so it doesn't have to be an async function but its return value will always be await
ed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually that brings up a good point. in the loader we do await
but people making hooks might not know to use the same abstraction when calling parent hooks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well it is always async, you do need to await it if you aren't just passing it through (ala return parent.resolve(...)
).
lib/internal/process/modules.js
Outdated
let next = factory({ | ||
__proto__: null, | ||
resolve: Object.setPrototypeOf(async (url, referrer) => { | ||
const ret = await cachedResolve(url, referrer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this node core style rules, or is this function body indentation one level too deep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was bad, fixed
src/node_config.cc
Outdated
auto loaders = v8::Array::New(isolate, config_userland_loaders.size()); | ||
for (size_t i = 0; i < config_userland_loaders.size(); i++) { | ||
auto item = config_userland_loaders[i]; | ||
loaders->Set( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the overload that takes a Local<Context>
and call .FromJust() on the return value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/node_config.cc
Outdated
v8::NewStringType::kNormal).ToLocalChecked()); | ||
} | ||
|
||
loaders->SetIntegrityLevel(context, v8::IntegrityLevel::kFrozen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.FromJust()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
const { getURLFromFilePath } = require('internal/url'); | ||
const Loader = require('internal/loader/Loader'); | ||
const path = require('path'); | ||
const { URL } = require('url'); | ||
|
||
// fires a getter or reads the value off a descriptor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What’s the need to support getters? node could mandate that these things all be data properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ljharb is there precedent of something that fails when passed a getter but not a data property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we at least use Reflect.get instead of this whole function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that i know of in the language. node be as restrictive here as it wants; it could even reject loaders whose resolve functions don’t have the right .length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek nope, because that would separate the check for property existing and the reading in a way that proxies could see. They could change their keys between that, generate a new property descriptor between that, etc. This function is also a bit gross because property descriptors inherit from Object
and people can taint value
/ get
and we don't want to use Reflect.get
because it would grab those as well:
Object.prototype.value = 123;
Object.getOwnPropertyDescriptor({get x(){}}, 'x').value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not unreliable :-) defaults aren't part of the length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusing to the point of not being reliable. Well defined, just not reliable to my eyes.
[((g = 0, h) => g)].map(f => f.length)
// 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s invalid iirc; defaults can’t come before non-defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, TIL. Fair enough.
Older iterations of this used |
I prefer a bare function; those who want a class can |
will classes have to be used for proper context with per-package hooks? I would like all loaders to be the same format no matter how they are used |
@devsnek they would match whatever global hooks are generated using. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of comments, but otherwise this seems great.
url: new URL(specifier, parentModuleURL).href, | ||
format: 'esm' | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps make the example demonstrate chaining, by only applying to packages within the baseURL, while packages outside of the baseURL get the parent resolve or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
doc/api/esm.md
Outdated
// get and set functions provided for pre-allocated export names | ||
exports.customExportName.set('value'); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, perhaps this can be filtered to demonstrate chaining as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
lib/internal/process/modules.js
Outdated
@@ -31,15 +45,65 @@ exports.ESMLoader = undefined; | |||
exports.setup = function() { | |||
setInitializeImportMetaObjectCallback(initializeImportMetaObject); | |||
|
|||
let ESMLoader = new Loader(); | |||
const ESMLoader = new Loader(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this was a let
because a separate loader was used to load the hook loaders than to load the main in order to support that any dependencies of the loaders themselves can support hooking in the main app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a regression, nice catch. should we give each loader its own module map space, or just the loaders vs runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've split this into 2 distinct loaders. I did notice a few odd things right now outside of this PR:
-
it looks like
import()
always uses the bootstrap loader if a custom loader exists :node/lib/internal/process/modules.js
Lines 34 to 51 in 3a19122
let ESMLoader = new Loader(); const loaderPromise = (async () => { const userLoader = process.binding('config').userLoader; if (userLoader) { const hooks = await ESMLoader.import( userLoader, getURLFromFilePath(`${process.cwd()}/`).href); ESMLoader = new Loader(); ESMLoader.hook(hooks); exports.ESMLoader = ESMLoader; } return ESMLoader; })(); loaderResolve(loaderPromise); setImportModuleDynamicallyCallback(async (referrer, specifier) => { const loader = await loaderPromise; return loader.import(specifier, normalizeReferrerURL(referrer)); }); -
it looks like the bootstrap loader and the runtime loader are sharing
require.cache
so the isolation doesn't fully hold for CJS stuff.
format: parentResolved.format | ||
}; | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note we do have a folder called test/fixtures/es-module-loaders
for the loaders themselves which may be worth using here for these new loader cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about using that but we also had other non-test esm files in this folder, can move if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we could refactor towards test/es-module
just containing the direct top-level test files, with everything else in fixtures directories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 That can be done later in another PR
doc/api/errors.md
Outdated
<a id="ERR_LOADER_HOOK_BAD_TYPE"></a> | ||
### ERR_LOADER_HOOK_BAD_TYPE | ||
|
||
An Loader defined an invalid value for a hook. See the documentation for [ECMAScript Modules][] for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: A loader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: also line wrapping needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
This has been sitting for a while. If there is nothing in the next 24 hours I plan on merging tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a nit
execute: ret.execute, | ||
}; | ||
}, null), | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some comments about these two functions? For somebody not familiar with loaders, they will be hard to read.
This needs a rebase. |
What is the status here? |
@BridgeAR blocked because its a new feature and modules team doesn't want to land new features atm |
@bmeck do you think we could get this going again? Seems related to nodejs/modules#82 as well. |
@guybedford I've put moratorium on my PRs for anything outside of |
@bmeck would it help if I rebase and continue to drive this one forward? Feels important to me. |
@guybedford feel free to take it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR looks promising but as a loader author I will need to check specifier
within the loader instead of calling the loader if and only if it's required for resolving specific import. And I feel like it can save a lot of time to call specific loader for specific import instead of just throwing every import to loaders
UPD Issue closed
TL;DR; it is slow to test loaders by regexp and load only if necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR looks promising
What's the status on this? |
@jasnell Still stuck due to Modules group moratorium. |
Still stalled on @nodejs/modules? |
@Trott yup |
@bmeck is this something that might be worked on again anytime soon? Otherwise I suggest we close this for now and you reopen it as soon as there is agreement in the modules working group about it? |
It now refuses to run if 1. User has not configured upstream or branch - no more defaults. 2. User is not on the branch configured 3. User is on detached HEAD Refs: nodejs/node#18914 Fixes: nodejs/node-core-utils#198
It now refuses to run if 1. User has not configured upstream or branch - no more defaults. 2. User is not on the branch configured 3. User is on detached HEAD Refs: nodejs/node#18914 Fixes: nodejs/node-core-utils#198
It now refuses to run if 1. User has not configured upstream or branch - no more defaults. 2. User is not on the branch configured 3. User is on detached HEAD Refs: nodejs/node#18914 Fixes: nodejs/node-core-utils#198
It now refuses to run if 1. User has not configured upstream or branch - no more defaults. 2. User is not on the branch configured 3. User is on detached HEAD Refs: nodejs/node#18914 Fixes: nodejs/node-core-utils#198
It now refuses to run if 1. User has not configured upstream or branch - no more defaults. 2. User is not on the branch configured 3. User is on detached HEAD Refs: nodejs/node#18914 Fixes: nodejs/node-core-utils#198
This allows multiple
--loader
flags to be used at a given time. So if someone is doingnode --loader ./my-loader.mjs
they could add a second--loader
for APM or mocking purposesNODE_OPTIONS='--experimental-modules --loader ./mock-loader.mjs' node --experimental-modules --loader ./my-loader.mjs node --experimental-modules --loader ./mock-loader.mjs --loader ./my-loader.mjs
We should probably do some design work around this composition API. I've tried to prevent loaders from passing arbitrary data between each other to guard against tight coupling. There is also a problem of child loaders being able to disagree with parents on the format of a URL rather than having a design that enforces a single value for the format.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
module