From 7ebb2fe3d31567e30d8ff838343cd04b9c1c910a Mon Sep 17 00:00:00 2001
From: Liu-Cheng Xu
Date: Wed, 14 Aug 2024 10:32:17 +0800
Subject: [PATCH] Add compress-state-response-message-in-state-sync.md

---
 ...ss-state-response-message-in-state-sync.md | 70 +++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 text/0112-compress-state-response-message-in-state-sync.md

diff --git a/text/0112-compress-state-response-message-in-state-sync.md b/text/0112-compress-state-response-message-in-state-sync.md
new file mode 100644
index 000000000..b7bfbf3ee
--- /dev/null
+++ b/text/0112-compress-state-response-message-in-state-sync.md
@@ -0,0 +1,70 @@
+# RFC-0112: Compress the State Response Message in State Sync
+
+|                 |                                                                                             |
+| --------------- | ------------------------------------------------------------------------------------------- |
+| **Start Date**  | 14 August 2024                                                                              |
+| **Description** | Compress the state response message to reduce the data transferred during state syncing    |
+| **Authors**     | Liu-Cheng Xu                                                                                |
+
+## Summary
+
+This RFC proposes compressing the state response message during the state syncing process to reduce the amount of data transferred.
+
+## Motivation
+
+State syncing can require downloading several gigabytes of data, particularly for blockchains with large state sizes, such as Astar, which
+has a state size exceeding 5 GiB (https://github.com/AstarNetwork/Astar/issues/1110). This presents a significant
+challenge for nodes with slower network connections. Additionally, the current state sync implementation lacks a persistence feature (https://github.com/paritytech/polkadot-sdk/issues/4),
+meaning any network disruption forces the node to re-download the entire state, making the process even more difficult.
+
+## Stakeholders
+
+This RFC benefits all projects utilizing the Substrate framework, specifically by improving the efficiency of state syncing.
+
+- Node Operators.
+- Substrate Users.
+
+## Explanation
+
+The largest portion of the state response message consists of either `CompactProof` or `Vec`, depending on whether a proof is requested ([source](https://github.com/paritytech/polkadot-sdk/blob/0cd577ba1c4995500eb3ed10330d93402177a53b/substrate/client/network/sync/src/state_request_handler.rs#L216-L241)):
+
+- `CompactProof`: When a proof is requested, compression yields a lower ratio but remains beneficial, as shown in the warp sync tests in the Performance section below.
+- `Vec`: Without a proof, this is theoretically compressible because the entries are generated by iterating the
+storage sequentially starting from an empty storage key, which means many entries in the message share the same storage prefix, making it ideal
+for compression.
+
+## Drawbacks
+
+None identified.
+
+## Testing, Security, and Privacy
+
+The [code changes](https://github.com/liuchengxu/polkadot-sdk/commit/2556fefacd2e817111d838af5f46d3dfa495852d) required for this RFC are straightforward: compress the state response on the sender side and decompress it on the receiver side. Existing sync tests should ensure functionality remains intact.
+
+## Performance, Ergonomics, and Compatibility
+
+### Performance
+
+This RFC optimizes network bandwidth usage during state syncing, particularly for blockchains with gigabyte-sized states, while introducing negligible CPU overhead for compression and decompression. For example, compressing the state response during a recent Polkadot warp sync (around height #22076653) reduced the data transferred from 530,310,121 bytes to 352,583,455 bytes, a 33.5% reduction that saves approximately 169 MiB of data.
+
+Performance data is based on [this patch](https://github.com/liuchengxu/polkadot-sdk/commit/da93360c9a59c29409061789c598d8f4e55d7856), with logs available [here](https://github.com/liuchengxu/polkadot-sdk/commit/9d98cefd5fac0a001d5910f7870ead05ab99eeba).
+
+### Ergonomics
+
+None.
+
+### Compatibility
+
+No compatibility issues identified.
+
+## Prior Art and References
+
+None.
+
+## Unresolved Questions
+
+None.
+
+## Future Directions and Related Material
+
+None.
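
A note on the Explanation section, outside the patch itself: the claim that sequentially iterated entries compress well because they share a storage prefix can be checked with a rough sketch. Everything below is an assumption made for illustration only: the 32-byte shared prefix, the hashed key suffixes and values, the length-prefixed `encode` layout, and the use of Python's `zlib` as a stand-in compressor. The RFC text here does not specify the wire format or the compression algorithm, so this is not the implementation from the linked commits.

```python
import hashlib
import zlib


def digest(tag: bytes, i: int, size: int) -> bytes:
    """Deterministic pseudo-random bytes, standing in for hashed key parts and values."""
    return hashlib.blake2b(tag + i.to_bytes(8, "little"), digest_size=size).digest()


def make_entries(count: int, shared_prefix: bool) -> list:
    """Build hypothetical (key, value) pairs with 48-byte keys, sorted by key.

    With shared_prefix=True every key starts with the same 32 bytes, mimicking
    entries of one storage item returned by a sequential iteration.
    """
    prefix = digest(b"prefix", 0, 32)
    entries = []
    for i in range(count):
        key = prefix + digest(b"key", i, 16) if shared_prefix else digest(b"key", i, 48)
        entries.append((key, digest(b"value", i, 16)))
    return sorted(entries)


def encode(entries) -> bytes:
    """Length-prefixed concatenation; a stand-in for the real response encoding."""
    out = bytearray()
    for key, value in entries:
        out += len(key).to_bytes(2, "little") + key
        out += len(value).to_bytes(2, "little") + value
    return bytes(out)


def compression_ratio(entries) -> float:
    """Return compressed size / original size after checking the round trip."""
    payload = encode(entries)
    compressed = zlib.compress(payload)
    assert zlib.decompress(compressed) == payload
    return len(compressed) / len(payload)


if __name__ == "__main__":
    print("shared prefix:   ", compression_ratio(make_entries(2000, True)))
    print("no shared prefix:", compression_ratio(make_entries(2000, False)))
```

In this synthetic setup the shared-prefix payload should compress noticeably better than the one without a common prefix. Real state entries and the actual compressor will give different numbers, such as those reported in the Performance section.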