Skip to content
This repository has been archived by the owner on Jan 11, 2024. It is now read-only.

FM-365: Commit snapshot #443

Merged
merged 6 commits into from
Nov 27, 2023
Merged

FM-365: Commit snapshot #443

merged 6 commits into from
Nov 27, 2023

Conversation

aakoshh
Copy link
Contributor

@aakoshh aakoshh commented Nov 22, 2023

Closes consensus-shipyard/ipc#176

  • When the last chunk is saved we return a SnapshotItem
  • The SnapshotItem::import is used to load the data into a blockstore and validate that the imported metadata matches.
  • If successful the App state is updated to the manifest.

Doesn't contain:

  • Tests, deferred to End-to-end test for snapshots ipc#178
  • The clearing of the /tmp directory - we can still investigate the snapshot contents after we're done
  • The offering of the imported snapshot for others - we'll offer the next snapshot we make, but not this, although we could if we copied the contents from /tmp to the persistent place
  • Anything clever like importing the CAR file into a temporary namespace in RocksDB, or trying to not import the artificial metadata field. This is a security vulnerability because we allow potentially bogus CIDs to enter the state store, which could lead to consensus failure if some smart contract computation depends on whether something is present or not.

@aakoshh aakoshh marked this pull request as ready for review November 22, 2023 20:02
// but for now let's keep it just in case we need to investigate
// some problem.

// We could also move the files into our own snapshot directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the question that I had in the previous PR, if we were storing the snapshot alreaedy to offer it to others, or we were just going to wait for the next snapshot to start providing snapshots to the network.

I think your current approach of keeping it in tmp/ for now and maybe address it in a follow-up makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest it's mostly laziness, but I also think we should first try to resume syncing with the chain before we re-transmit this snapshot. I'm not exactly sure whether it's possible to do too much validation on the snapshot contents. I validate the checksum, but I am not sure whether a particular snapshot chosen by CometBFT necessarily has more than one validator behind it, or it might be just a one-off, which would allow somebody to smuggle in illegal CIDs into the state store, and then later cause consensus failure by deploying a contract that looks it up, finding it one the store of some, but not other validators.

IIRC this would only affect Wasm actors, not EVM, and by the time it's relevant the FVM might offer protection against it. But it reminds me that I should try to remove even the metadata record from the blockstore after the import.


let snapshot = result.context("failed to import the snapshot into the blockstore")?;

// 4. See if we actually imported what we thought we would.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if the validation fail and we've already imported the snapshot? I guess you would need to import a valid snapshot, and the invalid blocks would still be stored in the blockstore, right? I don't think is critical at this point, but worth tracking it somewhere, as several import failures could make the blockstore size increase significantly. Can't we do the validation before importing the CAR?

Copy link
Contributor Author

@aakoshh aakoshh Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, the unwanted blocks would be left in the blockstore. Because they have been imported into the state store directly, the only way to remove them would be to implement garbage collection.

I actually included such thoughts into this ticket as loading the data into a staging area inside RocksDB (a separate namespace/collection) and only if successful copying it over to their final destination, or dropping the collection if the import failed. However, I don't think I have the necessary API exposed at the moment and didn't want to complicate things while we don't even have the happy path working.

Can't we do the validation before importing the CAR?

To be honest I was only following what Will's Snapshot lets me do easily. I see it uses load_car_unchecked which doesn't even verify CIDs; I'll change that. By the looks of it it's generally not easy to validate a CAR file: in our case the only CID in the header is the one of the metadata record, which only points at the CID of some data records, which is a tuple that is an internal implementation detail of the Snapshot. It is actually the last thing it puts in the file I think, but to get there you'd have to enumerate everything and then parse it correctly - and this is just the V1 snapshot.

So I landed on the "trust, but verify" approach.

I checked what kind of validations I noted down in https://github.com/consensus-shipyard/fendermint/issues/364 but they are also post-import. I'll make a note to include the on-the-fly validation there though.

@aakoshh aakoshh force-pushed the fm-362-take-snapshot-chunks branch from 8693151 to 1e2e5a7 Compare November 27, 2023 09:57
Base automatically changed from fm-362-take-snapshot-chunks to main November 27, 2023 09:57
@aakoshh aakoshh force-pushed the fm-365-commit-snapshot branch from ce7ffa8 to ca27da2 Compare November 27, 2023 09:59
@aakoshh aakoshh merged commit a1eb20a into main Nov 27, 2023
1 of 5 checks passed
@aakoshh aakoshh deleted the fm-365-commit-snapshot branch November 27, 2023 11:19
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Commit snapshots to the blockstore
2 participants