Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Use itertools to clean up SymbolState::merge #15702

Merged
merged 7 commits into from
Jan 24, 2025

Conversation

dcreager
Copy link
Member

@dcreager dcreager commented Jan 23, 2025

merge_join_by handles the "merge two sorted iterators" bit, and zip handles iterating through the bindings/definitions along with their associated constraints.

@dcreager dcreager added internal An internal refactor or improvement red-knot Multi-file analysis & type inference labels Jan 23, 2025
@carljm
Copy link
Contributor

carljm commented Jan 23, 2025

I think you need @MichaReiser review on this PR :) Both because he knows Rust a lot better than I do, and because I'm aware that he has an aversion to itertools, because it has iterators that silently allocate, and arguably this is bad because allocations should be more visible than that.

@carljm
Copy link
Contributor

carljm commented Jan 23, 2025

Thanks for this!

This definitely looks simpler and easier to understand; having the "merge-join-by" logic extracted as a utility is nice to separate it from the red-knot-specific logic. I'm also not sure how much to weight that here, since this is core infra that shouldn't have to change too often; eking out the best performance we can might be higher priority.

On that note, CodSpeed suggests that this is a 2% regression on "cold" check (which is where I'd expect semantic indexing cost to be relevant, as opposed to "incremental" check where Salsa validation costs usually dominate): https://codspeed.io/astral-sh/ruff/branches/dcreager%2Fmerge-cleanup

I'm not sure where that regression would be coming from, exactly; CodSpeed flame graph suggests it is all coming from a salsa ingredient get_or_create method, which doesn't make a lot of sense, so maybe it's just noise? It does seem like there would be some loss of efficiency here in that we can no longer reuse one of the declarations bitsets (via union-in-place), but instead we start from zero and add all declarations? That maybe isn't likely to make much difference?

Like I said above, I'd like to hear what @MichaReiser thinks.

@carljm
Copy link
Contributor

carljm commented Jan 23, 2025

Might be worth running that cold benchmark a few times locally with and without this change, just to see if we can get a sense of how stable vs noisy that regression is?

@dcreager
Copy link
Member Author

Might be worth running that cold benchmark a few times locally with and without this change, just to see if we can get a sense of how stable vs noisy that regression is?

Can do

we can no longer reuse one of the declarations bitsets (via union-in-place)

I can also try a version that adds back the union in-place. The union itself was OR-ing block by block, instead of bit by bit, and also had a reserve step to make sure there was at most 1 (re)allocation. Interestingly, we were using union for SymbolDeclarations, but not for SymbolBindings!

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. merge_join_by doesn't allocate internally. So I'm fine with this itertools usage. I do recommend using the regular zip over izip because we don't need its functionality and zip is better known (I had to read izip's documentation)

Comment on lines -333 to -337
// Iterate through the definitions from `a` and `b`, always processing the lower definition
// ID first, and pushing each definition onto the merged `SymbolState` with its
// constraints. If a definition is found in both `a` and `b`, push it with the intersection
// of the constraints from the two paths; a constraint that applies from only one possible
// path is irrelevant.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to still preserve some of those comments. E.g .this comment could be a great comment above merge_join_by

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines -345 to -352
// SAFETY: we only ever create SymbolState using [`SymbolState::undefined`], which adds
// one "unbound" definition with corresponding narrowing and visibility constraints, or
// using [`SymbolState::record_binding`] or [`SymbolState::record_declaration`], which
// similarly add one definition with corresponding constraints. [`SymbolState::merge`]
// always pushes one definition and one constraint bitset and one visibility constraint
// together (just below), so the number of definitions and the number of constraints can
// never get out of sync.
// get out of sync.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this safety documentation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworded it a bit and put it up at the definition of the fields in the struct

@dcreager dcreager merged commit 716b246 into main Jan 24, 2025
21 checks passed
@dcreager dcreager deleted the dcreager/merge-cleanup branch January 24, 2025 15:21
@carljm
Copy link
Contributor

carljm commented Jan 24, 2025

The latest version here still showed as a 1% regression in cold benchmark on codspeed. Were you able to reproduce that regression locally?

@dcreager
Copy link
Member Author

The latest version here still showed as a 1% regression in cold benchmark on codspeed. Were you able to reproduce that regression locally?

Yes — it was a 2% regression according to cargo bench before I went back to using union to combine the bitsets. Now it's 1%, as reported by both cargo bench and hyperfine when running against black:

$ ./go
Benchmark 1: ./red_knot_main --project /home/dcreager/git/code-nav/black --venv-path .venv --extra-search-path src
  Time (mean ± σ):     209.2 ms ±  15.7 ms    [User: 873.3 ms, System: 252.1 ms]
  Range (min … max):   185.3 ms … 240.4 ms    15 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: ./red_knot_feature --project /home/dcreager/git/code-nav/black --venv-path .venv --extra-search-path src
  Time (mean ± σ):     211.6 ms ±  15.0 ms    [User: 899.7 ms, System: 254.9 ms]
  Range (min … max):   185.9 ms … 243.5 ms    15 runs

  Warning: Ignoring non-zero exit code.

Summary
  ./red_knot_main --project /home/dcreager/git/code-nav/black --venv-path .venv --extra-search-path src ran
    1.01 ± 0.10 times faster than ./red_knot_feature --project /home/dcreager/git/code-nav/black --venv-path .venv --extra-search-path src

I felt like that was small enough to be worth it, relative to the code being easier to understand. I can revert if folks feel strongly otherwise.

@carljm
Copy link
Contributor

carljm commented Jan 24, 2025

I do think the new version is nicer to read, but personally I'm not sure that it is worth 1% overall regression in cold-check performance, if that regression is real and stable. I think that's a significant regression to accept in exchange for a less concrete benefit. I don't think this SymbolState merging code will require frequent changes, and it is a very hot code path, so I think it is OK to accept more complex code here if it performs better.

Maybe there would be some way to claw back the regression in the new version, if we can identify the source of it? But I also don't know how much time we want to devote to that investigation.

dcreager added a commit that referenced this pull request Jan 24, 2025
* main:
  Add `check` command (#15692)
  [red-knot] Use itertools to clean up `SymbolState::merge` (#15702)
  [red-knot] Add `--ignore`, `--warn`, and `--error` CLI arguments (#15689)
  Use `uv init --lib` in tutorial (#15718)
  [red-knot] Use `Unknown | T_inferred` for undeclared public symbols (#15674)
  [`ruff`] Parenthesize fix when argument spans multiple lines for `unnecessary-round` (`RUF057`) (#15703)
  [red-knot] Rename `TestDbBuilder::typeshed` to `.custom_typeshed` (#15712)
  Honor banned top level imports by TID253 in PLC0415.  (#15628)
  Apply `AIR302`-context check only in `@task` function (#15711)
  [`airflow`] Update `AIR302` to check for deprecated context keys (#15144)
  Remove test rules from JSON schema (#15627)
  Add two missing commits to changelog (#15701)
  Fix grep for version number in docker build (#15699)
  Bump version to 0.9.3 (#15698)
  Preserve raw string prefix and escapes (#15694)
  [`flake8-pytest-style`] Rewrite references to `.exception` (`PT027`) (#15680)
Comment on lines +297 to +302
let a = (a.live_bindings.iter())
.zip(a.constraints)
.zip(a.visibility_constraints);
let b = (b.live_bindings.iter())
.zip(b.constraints)
.zip(b.visibility_constraints);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I wasn't aware that we're zipping three iterators here. It could make sense to use izip here ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the extra set of parentheses around the first iterator, I don't mind how this looks, and tbh that was the only reason I reached for izip in the first place! (I never like it when rustfmt takes A.zip(B).zip(C) and breaks up the A across multiple lines...)

Copy link
Member Author

@dcreager dcreager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there would be some way to claw back the regression in the new version, if we can identify the source of it? But I also don't know how much time we want to devote to that investigation.

I am cautiously optimistic about #15731

Comment on lines +297 to +302
let a = (a.live_bindings.iter())
.zip(a.constraints)
.zip(a.visibility_constraints);
let b = (b.live_bindings.iter())
.zip(b.constraints)
.zip(b.visibility_constraints);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the extra set of parentheses around the first iterator, I don't mind how this looks, and tbh that was the only reason I reached for izip in the first place! (I never like it when rustfmt takes A.zip(B).zip(C) and breaks up the A across multiple lines...)

dcreager added a commit that referenced this pull request Jan 24, 2025
This is a follow-up to #15702 that hopefully claws back the 1%
performance regression. Assuming it works, the trick is to iterate over
the constraints vectors via mut reference (aka a single pointer), so
that we're not copying `BitSet`s into and out of the zip tuples as we
iterate. We use `std::mem::take` as a poor-man's move constructor only
at the very end, when we're ready to emplace it into the result. (C++
idioms intended! :smile:)

With local testing via hyperfine, I'm seeing this be 1-3% faster than
`main` most of the time — though a small number of runs (1 in 10,
maybe?) are a wash or have `main` faster. Codspeed reports a 2%
gain.
@sharkdp
Copy link
Contributor

sharkdp commented Jan 24, 2025

I'm late to the party, but wanted to say that I talked about this function with Micha in our 1:1 some weeks ago, and we both agreed that it should be refactored. We then proceeded by doing nothing. So thanks for taking this up @dcreager!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal An internal refactor or improvement red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants