Upgrade completions, definition, hover #166

Open. milesziemer wants to merge 6 commits into main.

@milesziemer milesziemer commented Sep 24, 2024

This commit is a rewrite of how language features (i.e. completions, definition, hover) are implemented. It improves the accuracy and expands the functionality of each feature significantly. Improvements include:

- Completions
    - Trait values
    - Builtin control keys and metadata
    - Namespaces, based on other namespaces in the project
    - Keywords
    - Member names (like inside resources, maps)
    - Member values (like inside the list of operation errors, resource property targets, etc.)
    - Elided members
    - Some trait values have special completions, like `examples` has completions for the target operation's input/output parameters
- Definition
    - Trait values
    - Elided members
    - Shape ids referenced within trait values
- Hover
    - Trait values
    - Elided members
    - Builtin metadata

There's a lot going on here, but there are a few key pieces of this commit that work together:

At the core of these improvements is the addition of a custom parser for the IDL that provides the needed syntactic information to implement these features. See the javadoc on the Syntax class for more details on how the parser works, and why it was written that way. At a high level though, the parser produces a flat list of Syntax.Statement, and that list is searched through to find things, such as the statement the cursor is currently in. It is also used to search 'around' a statement, like to find the shape a trait is being applied to.
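To make the flat-list idea concrete, here is a minimal sketch of the two lookups described above: finding the statement under the cursor, and searching 'around' a trait to find the shape it applies to. `Stmt`, `StatementLookup`, and the string `kind` tags are hypothetical stand-ins, not the actual `Syntax.Statement` API, and the sketch assumes statements are sorted by start offset and do not overlap.

```java
import java.util.List;

// Hypothetical stand-in for Syntax.Statement: a kind tag plus a [start, end) character span.
record Stmt(String kind, int start, int end) {}

final class StatementLookup {
    private StatementLookup() {}

    // Binary search for the statement whose [start, end) span contains offset.
    // Returns its index in the list, or -1 if the offset falls between statements.
    static int statementIndexAt(List<Stmt> statements, int offset) {
        int low = 0;
        int high = statements.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            Stmt s = statements.get(mid);
            if (offset < s.start()) {
                high = mid - 1;
            } else if (offset >= s.end()) {
                low = mid + 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Searches "around" a trait application: skips any following traits and
    // returns the index of the shape they apply to, or -1 if there is none.
    static int shapeTargetedByTrait(List<Stmt> statements, int traitIndex) {
        for (int i = traitIndex + 1; i < statements.size(); i++) {
            String kind = statements.get(i).kind();
            if (kind.equals("shape")) {
                return i;
            }
            if (!kind.equals("trait")) {
                break; // anything else ends the run of trait applications
            }
        }
        return -1;
    }
}
```

Because the list is flat, both lookups are simple index arithmetic rather than tree traversal, which seems to be the main appeal of this representation.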

Another key piece of these changes is NodeCursor and NodeSearch. There are a few places in the syntax of a Smithy file where you may have a node value whose structure is (or can be) described by a Smithy model, for example trait values. NodeCursor is basically two things: (1) a path from the start of a Node to a position within that Node, and (2) an index into that path. NodeSearch is used to search a model along the path of a NodeCursor, from a starting shape. For example, when the cursor is within a trait value, the NodeCursor is the path from the root of the trait value to the cursor position, and NodeSearch searches the model, starting at the trait's definition, along that path, to find the shape that corresponds to the cursor's location. That shape can then be used, e.g., to provide completions.
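As a rough illustration of the path-plus-index idea, the sketch below walks a toy model along a path of object keys. `ShapeDef`, `PathCursor`, and `PathSearch` are hypothetical simplifications: the real NodeCursor and NodeSearch work against the Smithy `Model` API and handle more than object keys, which this sketch does not attempt.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified shape: a name plus members mapped to the names of their target shapes.
record ShapeDef(String name, Map<String, String> memberTargets) {}

// Hypothetical cursor: a path of object keys from the root of a node value,
// plus an index that tracks how far along the path a search has advanced.
final class PathCursor {
    private final List<String> path;
    private int index = 0;

    PathCursor(List<String> path) {
        this.path = List.copyOf(path);
    }

    boolean hasNext() {
        return index < path.size();
    }

    String next() {
        return path.get(index++);
    }
}

final class PathSearch {
    private PathSearch() {}

    // Follows the cursor's path through the model, starting at startShape.
    // Returns the shape at the end of the path, or empty if the path leaves the model.
    static Optional<ShapeDef> search(Map<String, ShapeDef> model, String startShape, PathCursor cursor) {
        ShapeDef current = model.get(startShape);
        while (current != null && cursor.hasNext()) {
            String target = current.memberTargets().get(cursor.next());
            current = target == null ? null : model.get(target);
        }
        return Optional.ofNullable(current);
    }
}
```

With a cursor at `["config"]` inside a trait whose definition has a `config` member, the search lands on the member's target shape, and that shape's member names are the natural completion candidates.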

Finally, there's the Builtins class, and the corresponding Smithy model it uses. I originally had a completely different abstraction for describing the structure of metadata, different shape types' members, and even smithy-build.json. But it was basically just a 'structured graph', like a Smithy model. So I decided to just use a Smithy model itself, since I already had the abstractions for traversing it (like I had to for trait values). The Builtins model contains shapes that define the structure of certain Smithy constructs. For example, I use it to model the shape of builtin metadata, like suppressions. I also use it to model the shape of shapes, that is, what members shapes have, and what their targets are. Some shapes in this model are considered 'builtins' (in the builtins.smithy files). Builtins are shapes that require some custom processing, or have some special meaning, like AnyNamespace, which is used for describing a namespace that can be used in suppression metadata (https://smithy.io/2.0/spec/model-validation.html#suppression-metadata). The builtin model is pretty 'meta', and I don't love it, but it removes a significant amount of duplicated logic. For example, if we want to give documentation for some metadata, it is as easy as adding it to the builtins model. We can also use it to add support for smithy-build.json completions, hover, and even validation, later. It would be nice if these definitions lived elsewhere, so other tooling could consume them, like the Smithy docs for example, and I have some other ideas on how we can use it, but they're out of scope here.

Testing for this commit comes mostly from the completions, definitions, and hover tests, which indirectly test lower-level components like the parser (there are still some parser tests, though).

Edit 12/17: See commit message for details.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@milesziemer milesziemer requested a review from a team as a code owner September 24, 2024 18:11
@milesziemer milesziemer requested review from gosar and hpmellema and removed request for gosar September 24, 2024 18:11
import software.amazon.smithy.lsp.syntax.Syntax;
import software.amazon.smithy.lsp.syntax.SyntaxSearch;

sealed interface IdlPosition {
I hate this class. I spent some time looking for a way to abstract away the direct usage of documentIndex and statementIndex, both here and in SyntaxSearch, but didn't come up with anything great. Open to suggestions.

@milesziemer milesziemer force-pushed the language-feature-refactor branch from 8f9e2ef to c65cbe3 Compare November 27, 2024 17:16
This commit keeps the functionality added to the language features in
the previous commits, but does some broad refactoring of those changes
to clean up the APIs, get rid of some footguns, and reduce the chance of
some concurrency/parallelism issues.

The main changes are:
- Syntax.Ident/Syntax.Node.Str produced by the new parser now copy the
  actual string value. Previously, they only stored the start/end
  positions, and required you to copy the value out of the Document
  on-demand. This reduced the memory footprint of parsing, but I was
  concerned about the Document being changed at the same time another
  thread is trying to copy a value out of it. Copying eagerly avoids
  this. Plus, we can avoid most of the memory issues by doing partial
  reparsing (more on that later).
- Project now stores an index of files -> shapes defined in that file,
  instead of storing the shapes on the SmithyFile. This index is only
  needed to help determine which shapes need to be removed when
  rebuilding the model, so it doesn't make sense for SmithyFile to know
  about it. This also ties into the next change...
- Multiple changes to SmithyFile. SmithyFile now has a subclass,
  IdlFile, which stores its parse result. With the addition of the
  parser, and the changes to make
  DocumentVersion/DocumentImports/DocumentNamespace be computed from the
  parse result, SmithyFile can't represent both IDL and AST files.
  Arguably, it never really did because AST files don't have
  namespaces/imports. Either way, IdlFile now provides access to the
  parse result, which contains DocumentNamespace/Version/Imports, as
  well as the parsed statements. I also added synchronization to handle
  access to the parse result, since it will be mutated on every change.
  I don't really like how this works, but I'm going to address that in a
  future update (which I will describe below).
- Added StatementView, which wraps a list of parsed statements and a
  specific index in that list, providing methods to look "around" that
  index. This replaces the error-prone and unreadable SyntaxSearch,
  which required you to pass around int indices everywhere.
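For readers unfamiliar with the pattern, the following is a minimal sketch of what a view over a statement list plus an index can look like. `ParsedStatement` and `StatementViewSketch` are hypothetical names with made-up methods; the actual StatementView in this PR may expose a different API.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a parsed statement: a kind tag plus its source text.
record ParsedStatement(String kind, String text) {}

// Pairs the statement list with one index so callers never juggle raw ints.
final class StatementViewSketch {
    private final List<ParsedStatement> statements;
    private final int index;

    StatementViewSketch(List<ParsedStatement> statements, int index) {
        this.statements = List.copyOf(statements);
        if (index < 0 || index >= this.statements.size()) {
            throw new IndexOutOfBoundsException(index);
        }
        this.index = index;
    }

    // The statement this view is centered on.
    ParsedStatement current() {
        return statements.get(index);
    }

    // Nearest statement of the given kind before the current one, if any.
    Optional<ParsedStatement> nearestBefore(String kind) {
        for (int i = index - 1; i >= 0; i--) {
            if (statements.get(i).kind().equals(kind)) {
                return Optional.of(statements.get(i));
            }
        }
        return Optional.empty();
    }

    // Nearest statement of the given kind after the current one, if any.
    Optional<ParsedStatement> nearestAfter(String kind) {
        for (int i = index + 1; i < statements.size(); i++) {
            if (statements.get(i).kind().equals(kind)) {
                return Optional.of(statements.get(i));
            }
        }
        return Optional.empty();
    }
}
```

The index is validated once in the constructor, so the "look around" methods cannot be called with a stale or out-of-range position, which is the footgun the raw-int approach invites.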

Some more minor changes to note:
- Moved diagnostics computation into SmithyDiagnostics. It already
  belonged there probably, but especially with the addition of IdlFile I
  just had to do it.
- Moved document symbols into a 'handler' like definition, etc.
- Added `uri` and `isDetached` properties to ProjectAndFile, for
  convenience.

There are still some rough edges with this code, but I plan on making a
follow-up PR to address them, so this one doesn't become even larger.
Specifically, I want to only parse opened/managed files. This could let
us get rid of the whole ProjectFile thing, or at least not require going
through a project to find a file (it would be stored directly on
ServerState). This also makes the synchronization story much simpler,
improves initialization time, and should make it easier to eventually
load projects async.
@milesziemer milesziemer force-pushed the language-feature-refactor branch from 722865a to 046a74b Compare December 18, 2024 20:16
In some cases, when a completion is meant to replace existing text, the
range it was supposed to replace would leave an extra character at the
end. This was because the range's end position was not exclusive.
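
To illustrate the off-by-one, here is a small sketch of a half-open replacement range, the convention the Language Server Protocol uses for `Range` (the end position is exclusive). `ReplaceRangeSketch` and its methods are hypothetical and work on character offsets within a single line; they are not the code changed in this commit.

```java
final class ReplaceRangeSketch {
    private ReplaceRangeSketch() {}

    // Half-open range [start, end): end is the offset just past the last replaced character.
    record Range(int start, int end) {}

    // Finds the identifier-like token surrounding the cursor offset.
    static Range tokenRangeAt(String line, int cursor) {
        int start = cursor;
        while (start > 0 && isIdChar(line.charAt(start - 1))) {
            start--;
        }
        int end = cursor;
        while (end < line.length() && isIdChar(line.charAt(end))) {
            end++;
        }
        return new Range(start, end);
    }

    // Replaces the characters in [range.start, range.end) with newText.
    static String applyEdit(String line, Range range, String newText) {
        return line.substring(0, range.start()) + newText + line.substring(range.end());
    }

    private static boolean isIdChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '#' || c == '.';
    }
}
```

For the line `target: Strin` with the cursor inside `Strin`, the token range is `[8, 13)`, and applying the completion `String` yields `target: String`. If the end instead pointed at the last character of the token (offset 12), the same edit would produce `target: Stringn`, which matches the extra trailing character described above.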