Position information is broken for %empty
#21
Comments
This is because position information for non-terminals is generated from the position information of the corresponding terminal symbols (which correspond to input lexemes). No tokens, no useful position information. Possible solutions (which one will be implemented is open to discussion):
In any case, the position handling of empty productions must be special-cased to behave correctly. As a last remark, the position tracking behaves even worse if the %empty reduction is not the first reduction. The offending code is a reference to |
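To illustrate the mechanism described above, here is a minimal sketch of how a non-terminal's span might be derived from its children. This is not the project's actual code: the `Position` class and the `span_of` helper are hypothetical, the field names only follow the {line,col}{0,1} naming used in this issue, and the all-zero fallback is an assumption chosen to mimic the reported symptom.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Position:
    """Hypothetical span type, not the project's own class."""
    file: Optional[str]
    line0: int
    col0: int
    line1: int
    col1: int


def span_of(children: Sequence[Position]) -> Position:
    """Derive a non-terminal's span from the spans of its right-hand side."""
    if not children:
        # %empty: there is no token to copy a file name or coordinates from,
        # so a naive implementation can only return a placeholder.
        return Position(None, 0, 0, 0, 0)
    first, last = children[0], children[-1]
    # Start where the first child starts, end where the last child ends.
    return Position(first.file, first.line0, first.col0, last.line1, last.col1)


tokens = [Position("text", 1, 0, 1, 3), Position("text", 1, 4, 2, 5)]
print(span_of(tokens))  # span covering both tokens
print(span_of([]))      # placeholder with no file name
```

With at least one child the span is well defined; with none, both the file name and the coordinates have to come from somewhere else, which is why empty productions need a special case.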
That at least explains the results I’m seeing for {line,col}{0,1}. I can think of more options for the %empty position, but I don’t know how hard those would be to implement:
Aside from my last suggestion, In the end, I should probably simply re-write the position of the AST element coming out of the |
Rewriting the position later on is not a completely safe workaround (although it works in this case), because the position of the %empty reduction is used to calculate the position span of the surrounding nodes. As to the other possible solutions: the previous token is not readily available to the parser when it encounters an empty reduction. We can, however, extract the end position from the top stack element; it may already be a non-terminal, but its span will end at the end of the previous token. There also might not be a previous token at all (this of course corresponds to the position 0:0-0:0, but the parser does not know the current file name). The span between the previous and next token will most definitely result in confusing position spans for further derived objects. The problem with |
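The stack-based option mentioned above could look roughly like the following sketch. It is only an illustration under stated assumptions: `Position`, `empty_position`, and the explicit `file` argument are hypothetical names, and passing the file name in from the caller is an assumed workaround for the parser not knowing it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Position:
    """Hypothetical span type, not the project's own class."""
    file: Optional[str]
    line0: int
    col0: int
    line1: int
    col1: int


def empty_position(stack: List[Position], file: Optional[str] = None) -> Position:
    """Zero-width span for an %empty reduction (hypothetical helper)."""
    if stack:
        # The top stack element ends where the previous token ended,
        # even if that element is itself a non-terminal.
        top = stack[-1]
        return Position(top.file, top.line1, top.col1, top.line1, top.col1)
    # No previous symbol at all: fall back to 0:0-0:0. The file name must be
    # supplied by the caller (assumption), since there is nothing to copy it from.
    return Position(file, 0, 0, 0, 0)


stack = [Position("text", 1, 0, 1, 3)]
print(empty_position(stack))            # zero-width span at 1:3
print(empty_position([], file="text"))  # 0:0-0:0 with the file name kept
```

A zero-width span of this kind satisfies line0 == line1 and col0 == col1, and because it sits at the end of the preceding symbol it should not distort the span of a surrounding node when that node's start and end are taken from its first and last children.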
For the following parser definition (syntax):
Together with a script bar.py:
And the following input (text):
We get the following:
As you can see, even the file name is missing from the position information for the %empty production. I understand that it might be difficult to get a coherent range of characters, but the file name should be available correctly.
If possible, col0 == col1 and line0 == line1 would be nice, too, but I don’t know if it makes sense.