v2.0: replay: do not start leader for a block we already have shreds for (backport of #2416) #2484

mergify · 2024-08-08T02:47:52Z

Problem

In certain scenarios where the first leader block is not produced but the second (or a later) leader block is, we can end up reproducing that block after resetting to a previous block.

Summary of Changes

When poh_recorder checks for a leader slot, it additionally checks the blockstore to see whether shreds have already been inserted for that slot.
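The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual poh_recorder code: MockBlockstore, SlotMeta, its received field, LeaderDecision, and decide_leader are all hypothetical stand-ins. Only the method name has_existing_shreds_for_slot and the ordering (cheap tick check first, blockstore read second) are taken from this PR.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Hypothetical stand-in for the per-slot metadata kept by the blockstore.
struct SlotMeta {
    /// Number of shreds received for the slot (assumed field for this sketch).
    received: u64,
}

/// Hypothetical in-memory stand-in for the real Blockstore.
#[derive(Default)]
struct MockBlockstore {
    metas: HashMap<Slot, SlotMeta>,
}

impl MockBlockstore {
    /// Record that `count` shreds were inserted for `slot`.
    fn insert_shreds(&mut self, slot: Slot, count: u64) {
        self.metas
            .entry(slot)
            .or_insert(SlotMeta { received: 0 })
            .received += count;
    }

    /// True if any shreds have already been inserted for `slot`.
    fn has_existing_shreds_for_slot(&self, slot: Slot) -> bool {
        self.metas.get(&slot).map_or(false, |meta| meta.received > 0)
    }
}

/// Hypothetical outcome type; the real recorder uses its own status enum.
#[derive(Debug, PartialEq)]
enum LeaderDecision {
    NotReached,
    AlreadyHaveShreds,
    StartLeader(Slot),
}

/// Decide whether to start producing a block for `next_poh_slot`.
fn decide_leader(
    blockstore: &MockBlockstore,
    reached_leader_tick: bool,
    next_poh_slot: Slot,
) -> LeaderDecision {
    // Cheap check first: have we reached our leader tick at all?
    if !reached_leader_tick {
        return LeaderDecision::NotReached;
    }
    // Only then read the blockstore: never rebuild a block we already have shreds for.
    if blockstore.has_existing_shreds_for_slot(next_poh_slot) {
        return LeaderDecision::AlreadyHaveShreds;
    }
    LeaderDecision::StartLeader(next_poh_slot)
}
```

With shreds recorded for slot 11, decide_leader(&blockstore, true, 11) yields AlreadyHaveShreds, while a slot with no shreds still yields StartLeader.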

This is an automatic backport of pull request #2416 done by [Mergify](https://mergify.com).

…2416)

* replay: do not start leader for a block we already have shreds for
* pr feedback: comment, move existing check to blockstore fn
* move blockstore read after tick height check
* pr feedback: resuse blockstore fn in next_leader_slot

(cherry picked from commit 15dbe7f)

# Conflicts:
#	poh/src/poh_recorder.rs

mergify · 2024-08-08T02:47:54Z

Cherry-pick of 15dbe7f has failed:

On branch mergify/bp/v2.0/pr-2416
Your branch is up to date with 'origin/v2.0'.

You are currently cherry-picking commit 15dbe7fb0f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   core/src/replay_stage.rs
	modified:   ledger/src/blockstore.rs
	modified:   ledger/src/leader_schedule_cache.rs

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   poh/src/poh_recorder.rs

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

AshwinSekar · 2024-08-08T02:59:00Z

poh/src/poh_recorder.rs

@@ -569,6 +569,15 @@ impl PohRecorder {
            self.leader_first_tick_height_including_grace_ticks
        {
            if self.reached_leader_tick(my_pubkey, leader_first_tick_height_including_grace_ticks) {
+                if self.blockstore.has_existing_shreds_for_slot(next_poh_slot) {


This is the only functional change.
This fixes a long-standing bug where, in certain rare scenarios, we attempt to produce a block that we have already produced.

This does not transmit a duplicate block, because a failsafe in broadcast_stage prevents us from sending the shreds out. However, it wastes effort in banking_stage as the block is reproduced.

bw-solana

LGTM

sakridge · 2024-08-13T16:43:55Z

ledger/src/blockstore.rs

@@ -3964,6 +3964,13 @@ impl Blockstore {
        Ok(duplicate_slots_iterator.map(|(slot, _)| slot))
    }

+    pub fn has_existing_shreds_for_slot(&self, slot: Slot) -> bool {
+        match self.meta(slot).unwrap() {


why unwrap here?

bw-solana

I think this is just replicating the current unwrap behavior. It looks like this fails if we:

* hit a rocksdb error
* fail deserialization

Neither of these should happen in "normal" operation. Using expect would be better.
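To illustrate the suggestion, here is a hedged sketch of the same function using expect, so that a panic carries a message naming the failed operation. It runs against a hypothetical in-memory mock: MockBlockstore, MetaError, injected_error, and the received field are assumptions made for this example, and the match arms are a guess at the parts of the body that the diff above does not show.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Hypothetical stand-in for the blockstore's per-slot metadata.
#[derive(Clone, Copy)]
struct SlotMeta {
    /// Number of shreds received for the slot (assumed field for this sketch).
    received: u64,
}

/// Hypothetical error type mirroring the two failure modes named in the review.
#[derive(Clone, Copy, Debug)]
enum MetaError {
    Database,
    Deserialization,
}

/// Hypothetical in-memory stand-in for the real Blockstore.
#[derive(Default)]
struct MockBlockstore {
    metas: HashMap<Slot, SlotMeta>,
    /// When set, every meta() read fails with this error.
    injected_error: Option<MetaError>,
}

impl MockBlockstore {
    /// Look up the metadata for `slot`; fails only on an injected storage error.
    fn meta(&self, slot: Slot) -> Result<Option<SlotMeta>, MetaError> {
        match self.injected_error {
            Some(err) => Err(err),
            None => Ok(self.metas.get(&slot).copied()),
        }
    }

    /// Same panic-on-error behavior as unwrap, but the panic message
    /// states which operation failed.
    fn has_existing_shreds_for_slot(&self, slot: Slot) -> bool {
        match self
            .meta(slot)
            .expect("blockstore slot meta lookup should not fail")
        {
            Some(meta) => meta.received > 0,
            None => false,
        }
    }
}
```

Behavior on the success path is unchanged. The only difference from unwrap is that a rocksdb or deserialization failure panics with the given message followed by the error's Debug output.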

mergify bot requested a review from a team as a code owner August 8, 2024 02:47

mergify bot added the conflicts label Aug 8, 2024

mergify bot assigned AshwinSekar Aug 8, 2024

fix conflicts

a61028a

AshwinSekar requested review from carllin and bw-solana August 8, 2024 02:55

AshwinSekar reviewed Aug 8, 2024

bw-solana approved these changes Aug 9, 2024

carllin approved these changes Aug 13, 2024

sakridge reviewed Aug 13, 2024

sakridge approved these changes Aug 15, 2024

AshwinSekar merged commit fd3b4f3 into v2.0 Aug 15, 2024
39 checks passed

AshwinSekar deleted the mergify/bp/v2.0/pr-2416 branch August 15, 2024 17:40
