v2.0: replay: do not start leader for a block we already have shreds for (backport of #2416) #2484
Conversation
…2416)

* replay: do not start leader for a block we already have shreds for
* pr feedback: comment, move existing check to blockstore fn
* move blockstore read after tick height check
* pr feedback: reuse blockstore fn in next_leader_slot

(cherry picked from commit 15dbe7f)

# Conflicts:
#	poh/src/poh_recorder.rs
Cherry-pick of 15dbe7f has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
@@ -569,6 +569,15 @@ impl PohRecorder {
            self.leader_first_tick_height_including_grace_ticks
        {
            if self.reached_leader_tick(my_pubkey, leader_first_tick_height_including_grace_ticks) {
                if self.blockstore.has_existing_shreds_for_slot(next_poh_slot) {
This is the only functional change.
This fixes a long-standing bug where, in certain rare scenarios, we will attempt to produce a block that we have already produced.
This does not transmit a duplicate block, as a failsafe in broadcast_stage will prevent us from sending the shreds out; however, it results in wasted effort in banking_stage as the block is reproduced.
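For context, here is a minimal sketch of how the new guard might sit inside `PohRecorder::reached_leader_slot`; the surrounding control flow and the `PohLeaderStatus` return values are assumptions inferred from the diff above, not the exact upstream code:

```rust
// Sketch only: names and structure around the new check are assumed.
if let Some(leader_first_tick_height_including_grace_ticks) =
    self.leader_first_tick_height_including_grace_ticks
{
    if self.reached_leader_tick(my_pubkey, leader_first_tick_height_including_grace_ticks) {
        // New guard: if the blockstore already holds shreds for the upcoming
        // leader slot, this block was produced once before (e.g. prior to a
        // PoH reset), so refuse to start the leader again.
        if self.blockstore.has_existing_shreds_for_slot(next_poh_slot) {
            return PohLeaderStatus::NotReached;
        }
        // Otherwise fall through and report that the leader slot is reached.
    }
}
```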
LGTM
@@ -3964,6 +3964,13 @@ impl Blockstore {
        Ok(duplicate_slots_iterator.map(|(slot, _)| slot))
    }

    pub fn has_existing_shreds_for_slot(&self, slot: Slot) -> bool {
        match self.meta(slot).unwrap() {
why unwrap here?
I think this is just replicating the current unwrap behavior. It looks like this fails if we:
- hit a rocksdb error
- fail deserialization

Neither of these should happen in "normal" operation. `expect` would be better.
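For reference, a minimal sketch of what the helper could look like with the suggested `expect`; the function body is truncated in the diff above, so the match arms here are an assumption (treating any received shred as "existing"):

```rust
pub fn has_existing_shreds_for_slot(&self, slot: Slot) -> bool {
    // `expect` makes the failure mode explicit if we hit a RocksDB error or
    // fail to deserialize the SlotMeta; neither should happen in normal operation.
    match self
        .meta(slot)
        .expect("blockstore should be able to read SlotMeta")
    {
        Some(meta) => meta.received > 0,
        None => false,
    }
}
```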
Problem
See solana-labs#32679
In certain scenarios where the first leader block is not produced but the second (or later) leader block is produced, we can end up reproducing this block after resetting to a previous block.
Summary of Changes
When poh_recorder checks for the leader slot, additionally check blockstore to see if shreds have already been inserted.

This is an automatic backport of pull request #2416 done by [Mergify](https://mergify.com).
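As a rough usage example, a sketch of a unit test exercising the new helper; it assumes the `solana-ledger` test helpers (`get_tmp_ledger_path_auto_delete!`) and that the helper returns `false` for a slot with no shreds:

```rust
use solana_ledger::{blockstore::Blockstore, get_tmp_ledger_path_auto_delete};

#[test]
fn has_existing_shreds_for_slot_is_false_for_empty_slot() {
    // Open a fresh blockstore in a temporary ledger directory.
    let ledger_path = get_tmp_ledger_path_auto_delete!();
    let blockstore = Blockstore::open(ledger_path.path()).unwrap();

    // No shreds have been inserted yet, so the leader check should not fire.
    assert!(!blockstore.has_existing_shreds_for_slot(0));
    // After inserting shreds for slot 0, this would be expected to flip to true.
}
```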