[JavaScript runtime] The JavaScript runtime does not correct error node name in parse tree. #4753

Open
kaby76 opened this issue Jan 3, 2025 · 0 comments

kaby76 commented Jan 3, 2025

I found this bug while checking that the parse trees produced for the same input match across the different ANTLR targets (ports). The problem is specific to the JavaScript runtime, at this line:

tokenText = "<missing " + recognizer.literalNames[expectedTokenType] + ">";

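The problem is that expectedTokenType can be out of bounds for the literalNames array. In that case the lookup yields undefined, which is then concatenated into the token text as "<missing undefined>".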
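
To make the failure mode concrete, here is a minimal sketch that reproduces the string outside of ANTLR. The two arrays and the token type value are made-up stand-ins (assumptions for illustration), not the real tables from the generated ANTLRv4 parser; only the concatenation has the same shape as the runtime line quoted above.

```javascript
// Hypothetical stand-ins for a generated parser's name tables.
// A token with no literal form (like SEMI here) has no literalNames entry,
// so the literal table can be shorter than the symbolic table.
const demoLiteralNames = [null, "'grammar'", "':'"];          // indices 0..2
const demoSymbolicNames = [null, "GRAMMAR", "COLON", "SEMI"]; // indices 0..3

const demoExpectedTokenType = 3; // SEMI: symbolic name only

// Same expression shape as the runtime line: index literalNames directly.
// Reading past the end of a JavaScript array returns undefined instead of
// throwing, and string concatenation turns that into the text "undefined".
const demoTokenText = "<missing " + demoLiteralNames[demoExpectedTokenType] + ">";

console.log(demoTokenText);                            // <missing undefined>
console.log(demoSymbolicNames[demoExpectedTokenType]); // SEMI
```

Because nothing throws, the bad name only shows up when the serialized tree is compared against another target's output.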

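For the grammar antlr/antlr4 with input LexerElementLabel.g4 (and likewise three.g4), the JavaScript target produces the wrong parse tree: the error node reads <missing undefined> where <missing SEMI> is expected.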

$ git diff .
diff --git a/antlr/antlr4/examples/LexerElementLabel.g4.tree b/antlr/antlr4/examples/LexerElementLabel.g4.tree
index 24a5362f..1379cabe 100644
--- a/antlr/antlr4/examples/LexerElementLabel.g4.tree
+++ b/antlr/antlr4/examples/LexerElementLabel.g4.tree
@@ -1 +1 @@
-(grammarSpec (grammarDecl (grammarType lexer grammar) (identifier LexerElementLabel) ;) (rules (ruleSpec (lexerRuleSpec Token : (lexerRuleBlock (lexerAltList (lexerAlt lexerElements))) <missing SEMI>)) (ruleSpec (parserRuleSpec var = 'token' ;))) <EOF>)
\ No newline at end of file
+(grammarSpec (grammarDecl (grammarType lexer grammar) (identifier LexerElementLabel) ;) (rules (ruleSpec (lexerRuleSpec Token : (lexerRuleBlock (lexerAltList (lexerAlt lexerElements))) <missing undefined>)) (ruleSpec (parserRuleSpec var = 'token' ;))) <EOF>)
\ No newline at end of file
diff --git a/antlr/antlr4/examples/three.g4.tree b/antlr/antlr4/examples/three.g4.tree
index efea5989..82075168 100644
--- a/antlr/antlr4/examples/three.g4.tree
+++ b/antlr/antlr4/examples/three.g4.tree
@@ -1 +1 @@
-(grammarSpec (grammarDecl grammarType identifier <missing SEMI>) rules <EOF>)
\ No newline at end of file
+(grammarSpec (grammarDecl grammarType identifier <missing undefined>) rules <EOF>)
\ No newline at end of file
01/03-16:21:24 ~/issues/g4-all-trees/antlr/antlr4/examples

The runtime indexes the array recognizer.literalNames[expectedTokenType] directly, and the value of expectedTokenType can be out of bounds for it. The code should instead make a method call, to recognizer.Vocabulary.GetDisplayName(expectedTokenType).
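
A rough sketch of what such a lookup could look like in JavaScript follows. The helper names getDisplayNameSketch and missingTokenTextSketch are hypothetical and not part of the antlr4 JavaScript runtime, and the recognizer object is a made-up stand-in. The fallback order (literal name, then symbolic name, then the numeric token type) is my assumption about the intended behavior, not a patch against the runtime.

```javascript
// Hypothetical helper, NOT an existing antlr4 JavaScript runtime API.
// Falls back from literal name to symbolic name to the numeric token type.
function getDisplayNameSketch(recognizer, tokenType) {
  const literal = (recognizer.literalNames || [])[tokenType];
  if (literal != null) {
    return literal; // e.g. "'grammar'"
  }
  const symbolic = (recognizer.symbolicNames || [])[tokenType];
  if (symbolic != null) {
    return symbolic; // e.g. "SEMI"
  }
  return String(tokenType); // last resort: the raw token type
}

// Hypothetical replacement for the direct literalNames lookup.
function missingTokenTextSketch(recognizer, expectedTokenType) {
  return "<missing " + getDisplayNameSketch(recognizer, expectedTokenType) + ">";
}

// Made-up recognizer stand-in; the real tables come from the generated parser.
const sketchRecognizer = {
  literalNames: [null, "'grammar'", "':'"],
  symbolicNames: [null, "GRAMMAR", "COLON", "SEMI"],
};

console.log(missingTokenTextSketch(sketchRecognizer, 3)); // <missing SEMI>
console.log(missingTokenTextSketch(sketchRecognizer, 1)); // <missing 'grammar'>
```

With a lookup like this, a token type that has no literal entry still gets a readable name, which is what the expected trees in the diff above contain.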

Python3 may have a similar issue.

The workaround is to not save parse trees for input that is known to contain a parse error.

Cpp

Dart

Go

Java

Python3

Swift

@kaby76 kaby76 changed the title [JavaScript runtime] The JavaScript runtime does not use the correct name for getMissing Symbol(). [JavaScript runtime] The JavaScript runtime does not correct error node name in parse tree. Jan 4, 2025