-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat apilinks.json generator #153
base: main
Are you sure you want to change the base?
Conversation
Closes #152 Signed-off-by: flakey5 <[email protected]>
*/ | ||
export function getGitRepository(directory) { | ||
try { | ||
const trackingRemote = execSync(`cd ${directory} && git remote`); |
Check warning
Code scanning / CodeQL
Unsafe shell command constructed from library input Medium
library input
shell command
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI about 1 month ago
To fix the problem, we should avoid using execSync
with unsanitized input. Instead, we can use execFileSync
from the child_process
module, which allows us to pass arguments as an array, avoiding shell interpretation. This change ensures that the input is treated as a literal string and not as a part of the shell command.
- Replace
execSync
withexecFileSync
in thegetGitRepository
andgetGitTag
functions. - Modify the commands to use
execFileSync
with arguments passed as an array.
-
Copy modified line R3 -
Copy modified lines R15-R16 -
Copy modified line R39 -
Copy modified line R41
@@ -2,3 +2,3 @@ | ||
|
||
import { execSync } from 'child_process'; | ||
import { execFileSync } from 'child_process'; | ||
|
||
@@ -14,6 +14,4 @@ | ||
try { | ||
const trackingRemote = execSync(`cd ${directory} && git remote`); | ||
const remoteUrl = execSync( | ||
`cd ${directory} && git remote get-url ${trackingRemote}` | ||
); | ||
const trackingRemote = execFileSync('git', ['-C', directory, 'remote']).toString().trim(); | ||
const remoteUrl = execFileSync('git', ['-C', directory, 'remote', 'get-url', trackingRemote]).toString().trim(); | ||
|
||
@@ -40,7 +38,5 @@ | ||
const hash = | ||
execSync(`cd ${directory} && git log -1 --pretty=%H`) || 'main'; | ||
execFileSync('git', ['-C', directory, 'log', '-1', '--pretty=%H']).toString().trim() || 'main'; | ||
const tag = | ||
execSync(`cd ${directory} && git describe --contains ${hash}`).split( | ||
'\n' | ||
)[0] || hash; | ||
execFileSync('git', ['-C', directory, 'describe', '--contains', hash]).toString().split('\n')[0] || hash; | ||
|
try { | ||
const trackingRemote = execSync(`cd ${directory} && git remote`); | ||
const remoteUrl = execSync( | ||
`cd ${directory} && git remote get-url ${trackingRemote}` |
Check warning
Code scanning / CodeQL
Unsafe shell command constructed from library input Medium
library input
shell command
export function getGitTag(directory) { | ||
try { | ||
const hash = | ||
execSync(`cd ${directory} && git log -1 --pretty=%H`) || 'main'; |
Check warning
Code scanning / CodeQL
Unsafe shell command constructed from library input Medium
library input
shell command
const hash = | ||
execSync(`cd ${directory} && git log -1 --pretty=%H`) || 'main'; | ||
const tag = | ||
execSync(`cd ${directory} && git describe --contains ${hash}`).split( |
Check warning
Code scanning / CodeQL
Unsafe shell command constructed from library input Medium
library input
shell command
@@ -0,0 +1,52 @@ | |||
'use strict'; | |||
|
|||
import { execSync } from 'child_process'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import { execSync } from 'child_process'; | |
import { execSync } from 'node:child_process'; |
@@ -1,10 +1,12 @@ | |||
// @ts-check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// @ts-check |
use actualy didn't use type-checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(?)
@flakey5 do you need review here? I think I forgot to review this PR! |
As far as the approach yes please |
.filter(path => path !== undefined && path.endsWith('.js')) | ||
); | ||
|
||
const parsedJsFiles = await parseJsSources(sourceFiles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this process needs to be blocking? Do we need to parse all these sources beforehand and not on-demand?
const sourceFiles = loadJsFiles( | ||
parsedApiDocs | ||
.map(apiDoc => apiDoc.source_link_local) | ||
.filter(path => path !== undefined && path.endsWith('.js')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.filter(path => path !== undefined && path.endsWith('.js')) | |
.filter(path => path?.endsWith('.js')) |
const { runGenerators } = createGenerator(parsedApiDocs); | ||
const sourceFiles = loadJsFiles( | ||
parsedApiDocs | ||
.map(apiDoc => apiDoc.source_link_local) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.map(apiDoc => apiDoc.source_link_local) | |
.map(({ source_link_local }) => source_link_local) |
*/ | ||
const createGenerator = input => { | ||
const createGenerator = (input, parsedJsFiles) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const createGenerator = (input, parsedJsFiles) => { | |
const createGenerator = (markdownInput, jsInput) => { |
const cachedGenerators = { ast: Promise.resolve(input) }; | ||
const cachedGenerators = { | ||
ast: Promise.resolve(input), | ||
'ast-js': Promise.resolve(parsedJsFiles), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
|
||
findDefinitions(program, programBasename, nameToLineNumberMap, exports); | ||
|
||
if (!baseGithubLink) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is resolved only once. Since it seems that it grabs on the first value, could you do:
const [first, ...programs] = input;
const directory = dirname(first.path);
const repository = getGitRepository(directory);
const tag = getGitTag(directory);
const baseGithubLink = `https://github.com/${repository}/blob/${tag}`;
(Do this outside of the foreach), also this should probably be its own function and unit tested, ie: getGitHubLinkFromProgram
Also, it is GitHub, not Github HAHAHA
@@ -0,0 +1,215 @@ | |||
// @ts-check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this comment?
*/ | ||
function extractExpression(expression, loc, basename, nameToLineNumberMap) { | ||
/** | ||
* @example `a=b`, lhs=`a` and rhs=`b` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make these inline with just //
|
||
exports.identifiers.push(value.name); | ||
|
||
if (/^[A-Z]/.test(value.name[0])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain the Regex and store it in a constant
|
||
switch (statement.type) { | ||
case 'ExpressionStatement': { | ||
const { expression } = statement; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to deconstruct this?
|
||
break; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how common it is to have empty lines between cases. I think it would be valid to have an inline doc
// @see reference to what `VariableDeclaration` is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how expensive these functions are, as it seems that this can quickly get bloated depending on the size of source files.
BTW, could you use https://unifiedjs.com/explore/package/estree-util-visit/ to walk/visit the nodes? This is what I was referring to, instead of plain forEaches and while loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it might be worth to move "estree" moving logic to a dedicated "estree parser" IDK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have a parser for the Markdown files, having one just for Estree makes sense I guess.
}); | ||
}; | ||
|
||
return { loadFiles, loadJsFiles }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename loadFiles to loadMarkdownFiles
return filePaths.map(async filePath => { | ||
const fileContents = await readFile(filePath, 'utf-8'); | ||
|
||
return new VFile({ path: filePath, value: fileContents }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit concerned of the memory allocation that this will cause, some of the source files can be humungous. Have you made some comparisons of heap size differences?
@@ -182,7 +184,55 @@ const createParser = () => { | |||
return resolvedApiDocEntries.flat(); | |||
}; | |||
|
|||
return { parseApiDocs, parseApiDoc }; | |||
/** | |||
* TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why TODO? What's missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about some of these executions here; they can be downright unsafe. Can we use maybe an existing NPM package that allows us to run git apis from JS? Chances we don't even need to run a git command (exec) since technically .git
folder has all the data we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach in general sounds good; Just, some of these utilities are way too complex and big and the git manipulation scripts are possibly dangerous or at least an easy vulnerability hell.
Closes #152
Opening this as a draft currently to get feedback on the approach since it's a bit non-trivial relative to the other generators. The
apilinks.json
file maps things exported by modules to their source locations on Github.Example:
This means we need to parse the module's javascript source in addition to its markdown.
What the current approach does is doing:
acorn
is used for parsing the source filessource_link
metadata in the docsast-js
generatorapi-links
generator is based off of theast-js
resultWith the current approach, the generator is almost there. Some todos remain though: