The future of syntax highlighting #401

Open
muirdm opened this issue Apr 2, 2022 · 6 comments

@muirdm
Collaborator

muirdm commented Apr 2, 2022

As I hack on go-mode's fontification to support generics, I am also looking at other options.

To summarize how go-mode currently fontifies, it uses a combination of regexes and structural understanding of paren/bracket pairs. We make use of more advanced font-lock facilities such as function matchers and anchored matchers. These are relatively slow, but jit font lock mode does a really good job of only fontifying parts of the file that change, so performance is fine.

Tree-sitter is hot, but unfortunately it also has only syntactic information, so it does not help in ambiguous cases such as foo[bar](). It may also be more sensitive to syntax errors than go-mode's approach, but on the other hand it is almost certainly faster. I think we should consider adopting tree-sitter if/when it is part of core Emacs, but before that the benefits don't seem that great. (It may have other benefits beyond syntax highlighting, though.)
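
To make the foo[bar]() ambiguity concrete, here is a minimal Go sketch (Go 1.18 or later). The names genericCase and indexCase are hypothetical and exist only for illustration; the point is that the same token sequence means two different things depending on what foo and bar resolve to, which a purely syntactic tool cannot see.

```go
package main

import "fmt"

// At package level, bar is a type and foo is a generic function.
type bar int

func foo[T any]() string { return "instantiated" }

// genericCase: foo[bar]() instantiates foo with the type bar, then calls it.
func genericCase() string {
	return foo[bar]()
}

// indexCase: local declarations shadow the package-level names, so the
// identical expression now indexes a slice of functions and calls the result.
func indexCase() string {
	foo := []func() string{func() string { return "indexed" }}
	const bar = 0
	return foo[bar]()
}

func main() {
	fmt.Println(genericCase())
	fmt.Println(indexCase())
}
```

In the first case bar would ideally be highlighted as a type, in the second as a constant, and only name resolution can tell them apart.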

The other option is LSP's semantic tokens. gopls does support semantic tokens and I was able to get it working with lsp-mode after fixing some things in gopls. In general it works, and of course it has full type information so all our ambiguity problems are solved. The fontification via lsp-mode is asynchronous, so it doesn't cause any lag, although that means it doesn't pop in immediately as you type. You can configure the idle delay before it fontifies. Setting it to 100ms gives a pretty good experience (although if you are working in a package that takes longer than 100ms to type check, the semantic tokens will be slower as well). One of the main problems is syntax errors. Without proper AST and type info, semantic tokens fall apart. To address this, I think we would need better support for partial fontification where gopls and/or lsp-mode know to keep fontification around for parts of the file that can't currently be type checked. This may be easier said than done.
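
For context on what the editor actually receives, the LSP specification encodes semantic tokens as a flat integer array, five values per token, with positions relative to the previous token. The following is a rough sketch of a decoder, not code from gopls or lsp-mode; the Token type and decode function are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Token is one semantic token with an absolute position (hypothetical type).
type Token struct {
	Line, Start, Length uint32
	Type, Modifiers     uint32
}

// decode expands the relative encoding used by LSP semantic tokens:
// each token is (deltaLine, deltaStart, length, tokenType, tokenModifiers).
// deltaStart is relative to the previous token only when both tokens are
// on the same line; otherwise it is an absolute column.
func decode(data []uint32) ([]Token, error) {
	if len(data)%5 != 0 {
		return nil, fmt.Errorf("length %d is not a multiple of 5", len(data))
	}
	tokens := make([]Token, 0, len(data)/5)
	var line, start uint32
	for i := 0; i < len(data); i += 5 {
		deltaLine, deltaStart := data[i], data[i+1]
		if deltaLine > 0 {
			line += deltaLine
			start = deltaStart
		} else {
			start += deltaStart
		}
		tokens = append(tokens, Token{
			Line: line, Start: start, Length: data[i+2],
			Type: data[i+3], Modifiers: data[i+4],
		})
	}
	return tokens, nil
}

func main() {
	// Three tokens: two on line 2, one on line 5.
	data := []uint32{2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0}
	tokens, err := decode(data)
	if err != nil {
		panic(err)
	}
	for _, t := range tokens {
		fmt.Printf("%+v\n", t)
	}
}
```

Because every position depends on the tokens before it, a response that covers only part of a file is awkward to merge with stale highlighting, which is one reason partial fontification may be easier said than done.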

My general idea is to continue to maintain basic fontification in go-mode, but support optionally augmenting it with gopls semantic tokens to fill in the holes. Once that is working well, we can consider completely offloading syntax highlighting to gopls. Thoughts?

@dominikh
Owner

dominikh commented Apr 3, 2022

My opinions on the matter:

At its core, it's syntax highlighting. We've spent too much time and code on trying to guess type information. This never worked 100%, and generics make it worse. I'd be more than happy to limit go-mode's native support to actual syntax.

I'd also like to switch to tree-sitter once that's been in core for a couple releases. The core of our syntax highlighting is based on regular expressions, which is of course the wrong tool for Go. We have some custom parsing routines, which add complexity to our code and still aren't perfect. I was also planning on switching to tree-sitter for most custom movement commands. We'll of course have to see how tree-sitter handles invalid code, but from quick testing I've done, it seems to perform fine.

I'm fine with relying on LSP for adding type information to our highlighting. However, I wouldn't be fine with relying on LSP for all syntax highlighting. Having a noticeable delay between typing and any kind of syntax highlighting is IMO a no-go. Having a delay for type information, OTOH, is fine.

To summarize, my end-goal would be:

  • tree-sitter for syntax highlighting
  • LSP for semantic highlighting

In the interim, I would be fine with:

  • simplifying our syntax highlighting to remove the bits that pertain to type information
  • using LSP for semantic highlighting

@muirdm
Collaborator Author

muirdm commented Apr 3, 2022

Thanks for that. It sounds like our views mostly align. I suggest we do at least 4baab54 from my branch since that fixes existing fontification.

In the meantime I will mail some gopls fixes and work on getting its semantic token support out of experimental.

@dominikh
Owner

dominikh commented Apr 4, 2022

Note, I know next to nothing about semantic tokens in LSP, but the comment in golang/go#45313 (comment) suggests that it doesn't need type information, which is either wrong, or means semantic tokens are less powerful than we need them to be?

@pjweinb

pjweinb commented Apr 4, 2022

The existing implementation of semantic tokens in gopls uses type information when that is available. One might be able to get by using only the AST, but that would mean either rewriting the code or providing less information, which seems to defeat the point.
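
As a small illustration of what the AST alone provides, this sketch uses the standard go/parser and go/ast packages to inspect foo[bar](). The shape helper is a hypothetical name, and this is not how gopls is implemented; it only shows that the parser reports an index expression wrapped in a call, with nothing saying whether bar is a type or a value.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// shape describes an expression using only syntactic information.
// For a call, it also reports the node type of the callee.
func shape(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	call, ok := expr.(*ast.CallExpr)
	if !ok {
		return fmt.Sprintf("%T", expr), nil
	}
	return fmt.Sprintf("%T(%T)", call, call.Fun), nil
}

func main() {
	s, err := shape("foo[bar]()")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Distinguishing a generic instantiation from an index operation would require resolving the identifiers, for example with go/types, which is the type information the comment above refers to.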

@dominikh
Owner

dominikh commented Apr 4, 2022

Thank you for the clarification.

@the42

the42 commented Nov 24, 2022

As tree-sitter has been merged into Emacs 29, that's the way forward.
