Add support for JavaScript #59
Fixes #51

Add support for JavaScript code analysis using tree-sitter.

* Add `api/analyzers/javascript/analyzer.py` implementing `JavaScriptAnalyzer` class using tree-sitter for JavaScript.
  - Implement methods for first and second pass analysis.
  - Use tree-sitter to parse JavaScript code.
  - Extract functions and classes from JavaScript code.
  - Connect entities in the graph.
* Update `api/analyzers/source_analyzer.py` to include `JavaScriptAnalyzer` in the analyzers list.
* Add `tree-sitter-javascript` dependency to `pyproject.toml`.
* Add utility functions for JavaScript analysis in `api/analyzers/utils.py`.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/FalkorDB/code-graph-backend/issues/51?shareId=XXXX-XXXX-XXXX-XXXX).
Walkthrough
This pull request introduces comprehensive support for JavaScript source code analysis in the Code-Graph backend.
Sequence Diagram
sequenceDiagram
participant SA as SourceAnalyzer
participant JSA as JavaScriptAnalyzer
participant TS as Tree-sitter Parser
participant G as Code Graph
SA->>JSA: Analyze JavaScript file
JSA->>TS: Parse source code
TS-->>JSA: Return AST
JSA->>JSA: First pass: Extract functions/classes
JSA->>G: Add function/class entities
JSA->>TS: Parse source code again
TS-->>JSA: Return AST
JSA->>JSA: Second pass: Link function calls
JSA->>G: Establish function relationships
Actionable comments posted: 1
🧹 Nitpick comments (6)
api/analyzers/javascript/analyzer.py (3)
22-71: Enhance function declaration processing and handle arrow functions.
Currently, `process_function_declaration` only captures standard function declarations. If you'd like to capture arrow functions or function expressions, consider expanding the query or logic. Also, note that using `find_child_of_type(node, 'identifier')` might skip function declarations without a named identifier (like anonymous functions).
🧰 Tools
🪛 Ruff (0.8.2)
22-22: `Function` may be undefined, or defined from star imports (F405)
35-35: `find_child_of_type` may be undefined, or defined from star imports (F405)
45-45: `find_child_of_type` may be undefined, or defined from star imports (F405)
62-62: `Function` may be undefined, or defined from star imports (F405)
72-102: Extend class declaration handling for inheritance.
This method correctly extracts the class name from the `identifier` child. You may want to handle `extends` clauses (e.g., `class Foo extends Bar`) or keep track of implemented interfaces in the future.
🧰 Tools
🪛 Ruff (0.8.2)
72-72: `Class` may be undefined, or defined from star imports (F405)
85-85: `find_child_of_type` may be undefined, or defined from star imports (F405)
99-99: `Class` may be undefined, or defined from star imports (F405)
157-220: Protect against missing function entities and arrow function calls.
- The second pass currently assumes function declarations are always standard. Arrow functions won't be captured, so the calls might remain unresolved.
- `assert(caller_f is not None)` may crash if the function is somehow not recognized. Consider a safer check, logging a warning, or creating a placeholder entity to avoid halting the entire analysis.
-assert(caller_f is not None)
+if caller_f is None:
+    logger.warning(f"Caller function '{caller_name}' not found. Skipping relationship.")
+    continue
🧰 Tools
🪛 Ruff (0.8.2)
216-216: `Function` may be undefined, or defined from star imports (F405)
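To make the suggested check concrete, here is a minimal sketch of a second-pass loop that skips unresolved callers instead of asserting. `FakeGraph` and `link_calls` are hypothetical stand-ins written for illustration; only `get_function_by_name` is taken from the code under review, and its real return type is assumed here.

```python
import logging

logger = logging.getLogger(__name__)


class FakeGraph:
    """Hypothetical stand-in for the project's Graph; only the lookup is modeled."""

    def __init__(self, functions):
        self._functions = dict(functions)

    def get_function_by_name(self, name):
        # Assumed to return None when the function is unknown.
        return self._functions.get(name)


def link_calls(graph, calls):
    """Return the (caller, callee) pairs whose caller could be resolved."""
    linked = []
    for caller_name, callee_name in calls:
        caller_f = graph.get_function_by_name(caller_name)
        if caller_f is None:
            # Log and move on rather than aborting the whole analysis.
            logger.warning(
                "Caller function %r not found. Skipping relationship.", caller_name
            )
            continue
        linked.append((caller_name, callee_name))
    return linked
```

With this shape, a single unrecognized function (for example an arrow function the first pass never registered) costs one missing edge rather than a crashed run.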
api/analyzers/utils.py (2)
25-38: Augment arrow function or unnamed function handling.
`extract_js_function_name` assumes there's an identifier child. Consider fallback logic for arrow or anonymous functions if needed (e.g., generating a placeholder name).
40-53: Handle anonymous or default-export classes.
Similar to functions, classes can sometimes be declared without a direct identifier (export default class, etc.). Consider a fallback name or a distinct approach for these cases.
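One possible shape for the fallback mentioned in both comments above is sketched below. `StubNode` is a hypothetical stand-in that mimics only the `type`, `text`, `children`, and `start_point` attributes of a tree-sitter node so the example runs without the parser installed; `extract_name_or_placeholder` and the placeholder format are illustrative assumptions, not part of the PR.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StubNode:
    """Hypothetical stand-in for a tree-sitter Node (illustration only)."""

    type: str
    text: bytes = b""
    children: List["StubNode"] = field(default_factory=list)
    start_point: Tuple[int, int] = (0, 0)  # (row, column), zero-based


def extract_name_or_placeholder(node, kind: str) -> str:
    """Return the identifier child's text, or a position-based placeholder."""
    for child in node.children:
        if child.type == "identifier":
            return child.text.decode("utf-8")
    # No identifier child: build a name from the source position so that
    # two anonymous entities in the same file do not collide.
    row, col = node.start_point
    return f"<anonymous {kind}@{row + 1}:{col}>"
```

Returning a placeholder instead of `''` keeps anonymous functions and default-export classes distinguishable in the graph, at the cost of names that change when the code moves.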
api/analyzers/source_analyzer.py (1)
20-21: Consider broader JavaScript-related extensions.
Mapping `.js` to `JavaScriptAnalyzer` is a good start. If you plan to handle `.mjs`, `.cjs`, or `.jsx`, consider extending the analyzers dictionary or making it more flexible.
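A rough sketch of how the analyzers dictionary could cover several extensions with one analyzer instance follows. The `JavaScriptAnalyzer` class here is an empty placeholder, and `JS_EXTENSIONS` and `analyzer_for` are names assumed for illustration; the real structure in `source_analyzer.py` may differ.

```python
from pathlib import Path


class JavaScriptAnalyzer:
    """Empty placeholder standing in for the PR's analyzer class."""


# Extensions named in the review comment; all routed to the same analyzer.
JS_EXTENSIONS = (".js", ".mjs", ".cjs", ".jsx")

_js_analyzer = JavaScriptAnalyzer()
analyzers = {ext: _js_analyzer for ext in JS_EXTENSIONS}


def analyzer_for(path):
    """Look up an analyzer by file suffix, case-insensitively; None if unsupported."""
    return analyzers.get(Path(path).suffix.lower())
```

Sharing one instance avoids constructing a parser per extension, and the case-insensitive lookup is a design choice rather than something the PR specifies.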
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
api/analyzers/javascript/analyzer.py (1 hunks)
api/analyzers/source_analyzer.py (2 hunks)
api/analyzers/utils.py (1 hunks)
pyproject.toml (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/analyzers/javascript/analyzer.py
3-3: `from ..utils import *` used; unable to detect undefined names (F403)
5-5: `from ...entities import *` used; unable to detect undefined names (F403)
22-22: `Function` may be undefined, or defined from star imports (F405)
35-35: `find_child_of_type` may be undefined, or defined from star imports (F405)
45-45: `find_child_of_type` may be undefined, or defined from star imports (F405)
62-62: `Function` may be undefined, or defined from star imports (F405)
72-72: `Class` may be undefined, or defined from star imports (F405)
85-85: `find_child_of_type` may be undefined, or defined from star imports (F405)
99-99: `Class` may be undefined, or defined from star imports (F405)
123-123: `File` may be undefined, or defined from star imports (F405)
216-216: `Function` may be undefined, or defined from star imports (F405)
🔇 Additional comments (5)
api/analyzers/javascript/analyzer.py (3)
18-21: Analyzer constructor looks good.
The constructor properly initializes the Parser with `JS_LANGUAGE`. This is a succinct and clear approach.
103-156: Consider robust UTF-8 handling.
The code tries to decode the file content using UTF-8, which might fail if the file is in another encoding. This fallback is fine in many cases, but you might want to log or handle suspected encoding issues more explicitly if supporting diverse codebases.
🧰 Tools
🪛 Ruff (0.8.2)
123-123: `File` may be undefined, or defined from star imports (F405)
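For the UTF-8 comment above, one way to surface encoding problems explicitly is sketched here. `decode_source` is a hypothetical helper, and falling back to replacement characters is an assumed policy; the project may prefer to skip such files instead.

```python
import logging

logger = logging.getLogger(__name__)


def decode_source(raw: bytes, path: str = "<unknown>") -> str:
    """Decode file bytes as UTF-8, logging and degrading gracefully on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Make the suspected encoding issue visible instead of failing silently.
        logger.warning(
            "File %s is not valid UTF-8 (%s); replacing undecodable bytes.", path, exc
        )
        # Undecodable bytes become U+FFFD so the rest of the file is still usable.
        return raw.decode("utf-8", errors="replace")
```

Replacement keeps the analysis going for files that are mostly ASCII, though identifiers containing the replaced bytes will no longer match their original spelling.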
10-13: Confirm Tree-sitter JavaScript library availability.
The import statements `import tree_sitter_javascript as tsjs` and `Language, Parser` from `tree_sitter` look correct for parsing JavaScript. Just verify that the `pyproject.toml` changes (adding `tree-sitter-javascript`) are indeed installed in your environment.
✅ Verification successful
tree-sitter-javascript dependency is properly configured
The `pyproject.toml` file shows that `tree-sitter-javascript` version `^0.23.2` is correctly listed as a dependency in the project's Poetry configuration. This confirms that the import statement `import tree_sitter_javascript as tsjs` will work as expected when the dependencies are installed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Ensure tree-sitter-javascript is installed in the current environment
pip freeze | grep "tree-sitter-javascript"
Length of output: 136
Script:
#!/bin/bash
# Check if pyproject.toml exists and contains tree-sitter-javascript dependency
fd pyproject.toml --exec cat {}
Length of output: 877
Script:
#!/bin/bash
# Check if requirements.txt exists and contains tree-sitter-javascript dependency
fd requirements.txt --exec cat {}
Length of output: 65574
Script:
#!/bin/bash
# Check if setup.py exists and contains tree-sitter-javascript dependency
fd setup.py --exec cat {}
Length of output: 29
Script:
#!/bin/bash
# Look for any other Python dependency files
fd ".*requirements.*\.txt|poetry\.lock|Pipfile" --exec cat {}
Length of output: 65602
api/analyzers/source_analyzer.py (1)
11-11: JavaScript analyzer import addition confirmed.
Adding `JavaScriptAnalyzer` aligns with the pull request objective to support `.js` files.
pyproject.toml (1)
16-16: Dependency addition looks appropriate.
`tree-sitter-javascript = "^0.23.2"` is correctly specified. Validate that this version meets any advanced language feature requirements you may want to parse (e.g., newer JS syntax).
✅ Verification successful
Version 0.23.2 is appropriate for the JavaScript parsing requirements
Based on the code analysis, the tree-sitter-javascript dependency is used for basic JavaScript AST parsing, focusing on:
- Function declarations and their parameters
- Class declarations
- Function call expressions
- Basic identifier extraction
The specified version (^0.23.2) is actually newer than the latest PyPI version (0.23.1) and supports all the JavaScript syntax features currently being used in the analyzer. The codebase doesn't use any advanced JavaScript features that would require a newer version, as it's primarily focused on fundamental syntax elements like functions, classes, and method calls.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the newly added dependency is at the latest stable version
curl -s https://pypi.org/pypi/tree-sitter-javascript/json | jq '.info.version'
Length of output: 87
Script:
#!/bin/bash
# Check if there are any JavaScript files in the repository to understand the JS features being used
fd -e js -x head -n 50
Length of output: 24
Script:
#!/bin/bash
# Look for any configuration files that might specify JavaScript version/features
fd -g '*babel*' -g '*eslint*' -g '*tsconfig*' -g '*jsconfig*'
Length of output: 273
Script:
#!/bin/bash
# Check if tree-sitter-javascript is already in use and how
rg -l "tree-sitter-javascript"
Length of output: 57
Script:
#!/bin/bash
# Check the content of poetry.lock to see how tree-sitter-javascript is being used
cat poetry.lock | grep -A 10 "tree-sitter-javascript"
Length of output: 678
Script:
#!/bin/bash
# Look for Python files that might be using tree-sitter-javascript
rg -l "from tree_sitter"
Length of output: 147
Script:
#!/bin/bash
# Check how tree-sitter-javascript is being used in the JavaScript analyzer
cat api/analyzers/javascript/analyzer.py
Length of output: 7853
Script:
#!/bin/bash
# Check utils.py to understand the common tree-sitter usage patterns
cat api/analyzers/utils.py
Length of output: 1429
import io
import os
from ..utils import *
🛠️ Refactor suggestion
Avoid `import *` from relatively unknown modules.
Using star imports (`from ..utils import *` and `from ...entities import *`) can cause namespace conflicts and obfuscate which names are actually used. Consider importing only the required objects or using explicit imports to maintain clearer code and prevent accidental overrides.
-from ..utils import *
-from ...entities import *
+from ..utils import find_child_of_type # and any other required imports
+from ...entities import File, Function, Class # or whichever entities are actually used
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
3-3: `from ..utils import *` used; unable to detect undefined names (F403)
def extract_js_function_name(node: Node) -> str:
    """
    Extract the function name from a JavaScript function node.

    Args:
        node (Node): The AST node representing the function.

    Returns:
        str: The name of the function.
    """
    for child in node.children:
        if child.type == 'identifier':
            return child.text.decode('utf-8')
    return ''

def extract_js_class_name(node: Node) -> str:
    """
    Extract the class name from a JavaScript class node.

    Args:
        node (Node): The AST node representing the class.

    Returns:
        str: The name of the class.
    """
    for child in node.children:
        if child.type == 'identifier':
            return child.text.decode('utf-8')
    return ''
Language-specific utilities shouldn't be added to utils.
"""
Perform the first pass processing of a JavaScript source file.

Args:
    path (Path): The path to the JavaScript source file.
    f (io.TextIOWrapper): The file object representing the opened JavaScript source file.
    graph (Graph): The Graph object where entities will be added.

Returns:
    None
"""
The comment should say what the function actually does; "processing JavaScript file" is too general.
Specify which entities are extracted.
try:
    # Parse file
    content = f.read()
    tree = self.parser.parse(content)
except Exception as e:
    logger.error(f"Failed to process file {path}: {e}")
    return
I think this is a bit of a waste: we've already read the file and parsed it on the first pass.
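One way to avoid the second read and parse is to keep the tree from the first pass and reuse it. The sketch below assumes a small per-path cache; `TreeCache` and its method names are hypothetical, and the `parse` callable stands in for something like `self.parser.parse`.

```python
from pathlib import Path
from typing import Callable, Dict


class TreeCache:
    """Hypothetical cache of parsed trees, keyed by file path."""

    def __init__(self, parse: Callable[[bytes], object]):
        self._parse = parse
        self._trees: Dict[Path, object] = {}

    def get(self, path: Path, read: Callable[[], bytes]) -> object:
        """Return the cached tree for path, reading and parsing only on a miss."""
        if path not in self._trees:
            self._trees[path] = self._parse(read())
        return self._trees[path]

    def discard(self, path: Path) -> None:
        """Drop a tree once the second pass is done with it, to bound memory."""
        self._trees.pop(path, None)
```

The first pass would call `get` and populate the cache, the second pass would call `get` again and hit it, and `discard` afterwards keeps large repositories from holding every tree in memory at once.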
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
caller = function_def[0]
caller_name = caller.text.decode('utf-8')
caller_f = graph.get_function_by_name(caller_name)
assert(caller_f is not None)
Using assert in production code is unconventional. Consider handling this case more gracefully.
-assert(caller_f is not None)
+if caller_f is None:
+    logger.error(f'Caller function not found: {caller_name}')
+    continue