Dedication #11
Comments
I'll try my best to get it done, but so far school is really getting in the way. |
Apparently you started again. I kinda lost motivation on mine, although I believe it's very close to completion. If I can give you a tip: you might want to solve those first and create super tight test cases so that you can't regress.
If you also lose motivation, maybe we can team up. Either way feel free to use my code and test cases. Edit: I just saw that you made the https://github.com/alaviss/nim.nvim plugin. I've been using it for a while and it does a beautiful job at highlighting. Thank you :) |
Thanks for the kind words.
Indeed, these are the reason I'm restructuring the external scanner and the grammar as a whole. One thing I've been doing is to not use the official grammar more than I have to. The grammar was, well, hand derived from hand written code, so it's not precise in the first place. What I've been doing is to test snippets of code and cross reference with the |
That's sensible. There are a lot of things missing in the grammar, some rules are described in an odd or inconsistent way, and some are even partially wrong. I've made a collection of the issues I noticed (at the bottom are also some things I asked): https://github.com/aMOPel/tree-sitter-nim/blob/main/grammar_bugs_to_report.md
Also, I don't know how you're constructing your test cases right now, but I put some mappings for vim users in my readme to easily write and iterate on test cases: https://github.com/aMOPel/tree-sitter-nim/tree/main#for-vim-users
And I really recommend harvesting my test corpus. Of course you can't use the ASTs, but you could use the examples, and I've tried to write down a lot of cases. For example, in literals and operators I've tried to cover all the cases, if possible. For many other things I've also included a lot of corner cases, like the classic "is it a tuple or a function call?":
echo() # function call
echo () # tuple
echo(1, 2) # function call
echo (1, 2) # function call
echo (a: 1,b: 2) # tuple
echo(1) # function call
echo (1) # function call
echo (1,) # tuple
echo (a: 1) # tuple, no trailing comma, still tuple
echo((1,2)) # tuple
echo ((1,2)) # tuple
echo(((1,2),),) # double tuple
echo (((1,2),),) # triple tuple
Anyway, thank you for giving the nim tree-sitter another go. I'd love to use it sometime in the future :) |
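For anyone who wants to harvest these corner cases without the ASTs, here is a minimal sketch that turns the snippets above into tree-sitter corpus test skeletons. The function names (`corpus_entry`, `build_corpus`) and the test-name scheme are made up for illustration; the expected tree is deliberately left empty, since it depends on each grammar's node names, so these entries will fail until it is filled in.

```python
# Cases copied from the comment above: (snippet, what the author expects).
CASES = [
    ("echo()", "function call"),
    ("echo ()", "tuple"),
    ("echo(1, 2)", "function call"),
    ("echo (1, 2)", "function call"),
    ("echo (a: 1,b: 2)", "tuple"),
    ("echo(1)", "function call"),
    ("echo (1)", "function call"),
    ("echo (1,)", "tuple"),
    ("echo (a: 1)", "tuple"),
    ("echo((1,2))", "tuple"),
    ("echo ((1,2))", "tuple"),
    ("echo(((1,2),),)", "double tuple"),
    ("echo (((1,2),),)", "triple tuple"),
]


def corpus_entry(name: str, source: str, expected_tree: str = "") -> str:
    """Format one corpus test: header, source, '---' separator, expected tree.

    `expected_tree` is a placeholder; it must be filled with the S-expression
    produced by the grammar under test.
    """
    rule = "=" * 20
    return f"{rule}\n{name}\n{rule}\n\n{source}\n\n---\n\n{expected_tree}\n"


def build_corpus(cases) -> str:
    """Join one entry per case; the test name records the expected kind."""
    return "\n".join(corpus_entry(f"{kind}: {src}", src) for src, kind in cases)


if __name__ == "__main__":
    print(build_corpus(CASES))
```

Keeping the cases as data makes it easy to regenerate the skeletons after renaming nodes, and the expected kind in each test name documents the intent even before the tree is filled in.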
I have nothing like that :P I really just write the test, add
Thanks, I'll consult them for sure. |
Little update: I've started a rework of my grammar as well. Somebody wrote some highlight queries for it and I was surprised to see that it already did like 95% of everything. That gave me some new motivation. Anyway, I wanted to tell you I have a great workflow for updating test cases, since it can be super tedious when done manually: https://github.com/aMOPel/tree-sitter-nim/blob/main/.lvimrc With those mappings you can literally go |
While fighting with command call syntax and expression precedences I stumbled upon this:
echo $foo
# is parsed as
echo($foo)
from here: I didn't see a test case in your tests yet, maybe I missed it.
should be parsed as
and
should be parsed as
I figure this is rather impossible, since you need to use And I don't see how to resolve this with |
Feel free to use mine: tree-sitter-nim/src/scanner.cc Lines 262 to 522 in 646fe39
Command call is a pain to parse, it's only recently that I managed to resolve the conflicts with unary & binary operators. So far my grammar managed this suite (up until block statements): https://github.com/alaviss/tree-sitter-nim/blob/646fe39ba9850c26b7cbc64515af2116ff366f6f/corpus/expressions.txt |
That's the huge effort I was talking about. Props. |
Reading your operator lexer right now. If I understand correctly you also decided to not use the dotlikeop yet, since it needs a compiler flag and that's impossible to check for from tree-sitter. So the dotlikeop code in there is more like a placeholder for when it becomes the default, right?
Edit: Oh, but you are parsing dotlikeops differently in grammar.js. Consider:
proc `.?`(a: int, b: int): int = a+b
let
  a = 5
  b = 6
echo a.?b.float
with So I believe it's more sensible for now to not treat dotlikeops differently. |
I think your |
I parse dot-like ops by default since there's no reason not to. It's a pretty harmless part of the parser that I can toggle off if it becomes an issue. As far as the grammar goes I try to parse a superset of Nim to simplify the process (hence the support for unicode ops without any experimental flags). |
The thing with dotlikeops is that it's not a superset; it's different behaviour depending on the compiler flag. |
I looked further into this and it looks like dot-like ops are slated for removal (per the 2.0 changelog). I misremembered the feature as being support for dot ops altogether, sorry. Thanks for bringing this to my attention, I'll remove it from the grammar. |
This feature is gonna be removed, and it was not the correct way to handle dot ops in the first place. See #11 (comment)
Another thing I noticed: |
It doesn't seem to support accent_quoted from this snippet I tested on playground:
import macros
dumpTree:
  `foo`"str\"
|
Hm, you're right.
proc `>>`(a: string) = echo a
`>>`"hi"
compiles and prints
proc `>>`(a: string) = echo a
`>>`"h\ni"
prints
while
proc x(a: string) = echo a
x"h\ni"
prints
So the case with ticks should actually be parsed as a command call. |
I stumbled upon this issue after trying to get proper Nim support with Keep up the great work! |
Just a small update: the rewrite branch now has a pretty complete Nim grammar, but I'm now looking at yet another rewrite to reduce the parser size (which is 65MiB and takes 1 minute to build, not to mention that some changes can explode the number of states beyond what tree-sitter can support) and to fix command-call-induced precedence quirks. The better news is that the 2nd rewrite (not uploaded yet) is maintaining a more bearable 1.5MiB in size, and I'm making a lot of headway on handling command call expressions and (hopefully) avoiding invoking the GLR parser. |
That's good to hear! Any ideas why the size difference between the two implementations is this big? |
The gap is closing, and closing too fast. Right now the next rewrite is already at 38MiB! The good news is that I cracked command expression, so at least we will have something correct, even if it's insanely big... From what I can tell, the size delta boils down to conflicts between expressions with Here's an unfinished parser state count:
Since the explosion is exponential, we will quickly approach tree-sitter's state limit of 2^16, at which point it will produce a broken parser. |
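To put a rough number on "quickly", here is a back-of-the-envelope sketch. The starting state count and the growth factor are made-up illustration values, not measurements from this grammar; only the 2^16 limit comes from the comment above.

```python
STATE_LIMIT = 2 ** 16  # state limit mentioned above (65536)


def steps_until_limit(initial_states: int, growth_factor: float) -> int:
    """Count how many multiplications by growth_factor reach STATE_LIMIT.

    Both arguments are hypothetical: they model a state count that is
    multiplied by a constant factor each time another conflicting rule
    is added to the grammar.
    """
    if growth_factor <= 1:
        raise ValueError("growth_factor must be > 1 for exponential growth")
    steps = 0
    states = initial_states
    while states < STATE_LIMIT:
        states *= growth_factor
        steps += 1
    return steps


if __name__ == "__main__":
    # Assumed example: 1000 states that double with every added rule.
    print(steps_until_limit(1000, 2))
```

With a doubling factor, even a modest starting point leaves only a handful of additions before the limit, which is why a few innocent-looking grammar changes can tip the parser over.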
I think I finally pinned the problem down: command expressions and conditionals are exploding the parser size.
So tree-sitter uses an
Nim's grammar has a lot of right-associative nodes:
And to make the problem worse, in my "superset" grammar, some stuff like command expressions can nest
So the bad news is that my approach is a dead end. Trying to make a "nice" and "flat" grammar won't cut it. The good news is that one source of combinatorial explosion (command expressions) has some pretty restrictive rules in the Nim grammar that I should be able to leverage, so that's progress. |
Interesting, can you elaborate more on this? |
I'm proud to say that the
The grammar is written to cover pretty much all of Nim, so please test it if you're interested. When you find bugs, feel free to open issues for them; just remember to say that you're using the
A lot of testing and cleanup still has to be done before this can replace
Right now there's a small blocker on generalized strings due to:
I have learnt a whole lot about how tree-sitter splits/merges its states and managed to optimize the grammar by a fair bit to cut down on those, but there is more to be done in the future. I'll save the write-up about this part for a later date. For now I'll take a small break from this project. |
Hi there,
I know it's kinda odd to ask about your plans, it's just that I happen to also have worked a bunch on a tree-sitter-nim repo and was wondering how dedicated you are to finishing yours. I haven't had the time to really continue since mid august and frankly the last bits are gonna be difficult (I made a bunch of issues where you could see what's missing in mine).
You seem to have made some good progress, and you also seem to have been able to keep the grammar.js a lot smaller and possibly cleaner than mine so far. Are you dedicated to finishing it? Because then I would not try to push mine to the end.