-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multi: Rework sync model to use hdr annoucements. #2555
multi: Rework sync model to use hdr annoucements. #2555
Conversation
38758c3
to
fafbfc7
Compare
fafbfc7
to
ee68773
Compare
I added an additional commit since the original PR was opened to change the verification progress reported by the RPC server for the new model as well. |
0334b4c
to
efbc116
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(still doing some functional testing)
Got the following output while syncing on mainnet:
2021-01-18 16:03:03.837 [INF] SYNC: New valid peer [x] (outbound) (/dcrwire:0.4.0/dcrd:1.6.0(pre)/)
2021-01-18 16:03:12.919 [INF] SYNC: Processed 258000 headers in the last 10.15s (progress 94.02%)
2021-01-18 16:03:13.521 [INF] SYNC: Syncing headers to block height 521071 from peer [y] (outbound)
2021-01-18 16:03:19.309 [INF] SYNC: Processed 133074 headers in the last 6.38s (progress 100.00%)
2021-01-18 16:03:19.309 [INF] SYNC: Headers synced -- downloading blocks
2021-01-18 16:03:19.711 [INF] SYNC: Processed 2000 headers in the last 400ms (progress 100.00%)
2021-01-18 16:03:20.191 [INF] SYNC: Processed 2000 headers in the last 470ms (progress 100.00%)
2021-01-18 16:03:20.293 [INF] SYNC: Processed 2000 headers in the last 100ms (progress 100.00%)
2021-01-18 16:03:20.840 [INF] SYNC: Processed 2000 headers in the last 540ms (progress 100.00%)
2021-01-18 16:03:20.866 [INF] SYNC: Processed 2000 headers in the last 20ms (progress 100.00%)
2021-01-18 16:03:21.314 [INF] SYNC: Processed 2000 headers in the last 440ms (progress 100.00%)
2021-01-18 16:03:21.441 [INF] SYNC: Processed 2000 headers in the last 120ms (progress 100.00%)
2021-01-18 16:03:21.787 [INF] SYNC: Processed 2000 headers in the last 340ms (progress 100.00%)
2021-01-18 16:03:22.015 [INF] SYNC: Processed 2000 headers in the last 220ms (progress 100.00%)
2021-01-18 16:03:22.268 [INF] SYNC: Processed 2000 headers in the last 250ms (progress 100.00%)
2021-01-18 16:03:22.591 [INF] SYNC: Processed 2000 headers in the last 320ms (progress 100.00%)
2021-01-18 16:03:22.747 [INF] SYNC: Processed 2000 headers in the last 150ms (progress 100.00%)
2021-01-18 16:03:23.164 [INF] SYNC: Processed 2000 headers in the last 410ms (progress 100.00%)
f249585
to
ed9397a
Compare
That is fixed with the latest. |
Woot :P While downloading headers:
And then while downloading blocks:
|
ed9397a
to
194420b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All issues resolved, tested reorgs on the standard simnet setup and syncing in mainnet.
Progressing nicely :P
194420b
to
2e55226
Compare
I updated the initial chain synced log message to also account for the case when the chain is already fully synced after the initial header sync process so that it prints immediately as expected versus waiting until a new block shows up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing/testing but I did notice one strange behavior. It's intermittent but about 50% of the time the header progress message just keeps getting logged non-stop after the initial header sync is completed (with a fresh db on mainnet):
EDIT: Looks like this was already mentioned above as resolved, but it appears to still be occurring intermittently.
2021-01-18 20:57:49.085 [INF] CMGR: 16 addresses found from seeder mainnet-seed.dcrdata.org
2021-01-18 20:57:49.107 [INF] CMGR: 16 addresses found from seeder mainnet-seed-1.decred.org
2021-01-18 20:57:49.771 [INF] CMGR: 16 addresses found from seeder mainnet-seed-2.decred.org
2021-01-18 20:57:50.065 [INF] CMGR: 16 addresses found from seeder mainnet-seed.planetdecred.org
2021-01-18 20:57:50.478 [INF] CMGR: 16 addresses found from seeder mainnet-seed.jamieholdstock.com
2021-01-18 20:57:53.902 [INF] SYNC: New valid peer 107.152.35.163:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.5.1/)
2021-01-18 20:57:53.902 [INF] SYNC: Syncing headers to block height 521176 from peer 107.152.35.163:9108 (outbound)
2021-01-18 20:57:54.128 [INF] SYNC: New valid peer 51.195.222.207:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.6.0/)
2021-01-18 20:57:54.128 [INF] SYNC: New valid peer 18.198.44.215:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.5.2/)
2021-01-18 20:57:54.152 [INF] SYNC: New valid peer 206.81.16.204:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.7.0(pre)/)
2021-01-18 20:57:54.335 [INF] SYNC: New valid peer 104.199.150.126:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.5.2/)
2021-01-18 20:57:54.335 [INF] SYNC: New valid peer 139.99.135.146:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.6.0/)
2021-01-18 20:57:55.150 [INF] SYNC: New valid peer [2a0b:b7c4:1:10c1::1]:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.6.0(rc4)/)
2021-01-18 20:57:55.317 [INF] SYNC: New valid peer [2001:19f0:7001:3511:5400:1ff:feb3:3d18]:9108 (outbound) (/dcrwire:0.4.0/dcrd:1.5.1/)
2021-01-18 20:57:58.976 [INF] SYNC: Processed 52000 headers in the last 10.12s (progress 9.99%)
2021-01-18 20:58:09.059 [INF] SYNC: Processed 114000 headers in the last 10.08s (progress 30.34%)
2021-01-18 20:58:19.672 [INF] SYNC: Processed 428000 headers in the last 10.61s (progress 45.68%)
2021-01-18 20:58:29.748 [INF] SYNC: Processed 540000 headers in the last 10.07s (progress 70.57%)
2021-01-18 20:58:39.785 [INF] SYNC: Processed 302000 headers in the last 10.03s (progress 87.86%)
2021-01-18 20:58:46.313 [INF] SYNC: Processed 149177 headers in the last 6.52s (progress 100.00%)
2021-01-18 20:58:46.313 [INF] SYNC: Initial headers sync complete -- downloading blocks (best header hash 000000000000000006767d42fc60cf1dba3e3ca83075a51514ff25de7d2dbe3b, height 521177)
2021-01-18 20:58:46.323 [INF] SYNC: Processed 2000 headers in the last 10ms (progress 100.00%)
2021-01-18 20:58:46.347 [INF] SYNC: Processed 2000 headers in the last 20ms (progress 100.00%)
2021-01-18 20:58:46.468 [INF] SYNC: Processed 2000 headers in the last 120ms (progress 100.00%)
2021-01-18 20:58:46.518 [INF] SYNC: Processed 2000 headers in the last 50ms (progress 100.00%)
2021-01-18 20:58:46.604 [INF] SYNC: Processed 2000 headers in the last 80ms (progress 100.00%)
2021-01-18 20:58:46.713 [INF] SYNC: Processed 2000 headers in the last 100ms (progress 100.00%)
2021-01-18 20:58:46.767 [INF] SYNC: Processed 2000 headers in the last 50ms (progress 100.00%)
2021-01-18 20:58:46.921 [INF] SYNC: Processed 2000 headers in the last 150ms (progress 100.00%)
2021-01-18 20:58:46.967 [INF] SYNC: Processed 2000 headers in the last 40ms (progress 100.00%)
2021-01-18 20:58:47.175 [INF] SYNC: Processed 2000 headers in the last 200ms (progress 100.00%)
2021-01-18 20:58:47.187 [INF] SYNC: Processed 2000 headers in the last 10ms (progress 100.00%)
2021-01-18 20:58:47.228 [INF] SYNC: Processed 2000 headers in the last 40ms (progress 100.00%)
2021-01-18 20:58:47.238 [INF] SYNC: Processed 2000 headers in the last 10ms (progress 100.00%)
2021-01-18 20:58:47.427 [INF] SYNC: Processed 2000 headers in the last 180ms (progress 100.00%)
...
2e55226
to
23fa9f0
Compare
Upon closer inspection I see this is something else than what I thought it was. The culprit was a new block showing up during the initial headers sync. I've pushed a proper fix for it. |
23fa9f0
to
e46ff1f
Compare
e46ff1f
to
0d6a9d4
Compare
I tested a bunch more and can confirm that this is resolved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I did a bunch more testing, and everything has been working as expected. I just had a few typos called out inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
0d6a9d4
to
96a6c20
Compare
In addition to addressing all of the review items, I added the interval since the previous block to the "New block ...." log print stats once steady state is reached in the latest update.
|
96a6c20
to
b17a77e
Compare
b17a77e
to
a8bf6e3
Compare
This modifies the progress logger to make it a bit more generic so that it can support logging header sync progress in the future too and modernizes the code while here.
This removes the flag from the SubmitBlock method of the sync manager since those flags are not really intended for general block submission which is why it is only ever called with blockchain.BFNone. Also, the intention is to modify the way the sync manager deals with block submission in the future which this change help simplify.
This removes the orphan return flag from the SubmitBlock and ProcessBlock netsync methods in favor of allowing the caller to detect it from the error. This is a step towards the goal of removing handling of orphan blocks altogether in favor of using block headers for sync.
This removes all orphan block handling from the syncing process and is part of the overall effort to move to full headers-first syncing as opposed to the current partial header-first checkpoint-based process. It should be noted that this change is made in such a way that the syncing process continues to work properly, however, it is only intended to be an intermediate step, and, as such, it does result in suboptimal behavior once it passes the final checkpoint. That will no longer be the case one the entire sync process conversion is complete.
a8bf6e3
to
752e723
Compare
This overhauls the primary code that deals with synchronizing the blockchain with other peers on the network to use a model based on header announcements instead of inventory vectors. Currently, all blocks are discovered via a combination of inventory vector announcements and the getblocks protocol message, both of which only include the hash of the blocks in question. This is not ideal because it means blocks are blindly downloaded based on those announcements without knowing if they're actually likely to be something that is useful since there is no further information such as what blocks they connect to. It also means that extra logic is needed to deal with orphan blocks (those whose parents are not yet known) such as caching them, determining known orphan roots, and expiration. In short, the current method generally ends up wasting bandwidth and also makes it much more difficult to detect certain classes of malicious behavior. The recently added capability of blockchain to process headers independently from blocks while the block data is added out of order later opened the door for a much better model that addresses all of the aforementioned issues as well as paves the way for many other related enhancements. The new model discovers and downloads all of the headers prior to any block data via the getheaders protocol message and then uses those headers to determine exactly which blocks need to be downloaded to reach the tip of the best chain. Notably, this means orphan blocks are now a thing of the past as blocks that do not connect are no longer downloaded under any circumstance. Further, the new model also makes use of sendheaders so that all block announcements are made via the associated block header. This in turn is used to better determine if an announced block is likely to be useful prior to downloading it. It should be noted that the changes herein are intentionally limited to those necessary to use the new sync model based on headers in an incremental fashion to help simplify review and make it easier to assert correctness. There are many more improvements that this model paves the way to support planned for future commits. For example, syncing from multiple peers in parallel and improved DoS protection. The following is a high-level overview of the key features: - All headers are downloaded and examined prior to downloading any blocks - The initial header sync process: - Detects and recovers from stalled/malicious peers - The concept of orphan blocks no longer exists - This means blocks which are not already known to connect are not downloaded - All block announcements are handled via header announcements - Detects and prevents malicious behavior related to orphan headers - The chain sync process: - Starts once the headers are downloaded and entails both downloading blocks as well as verifying them - Uses the headers to determine the best blocks to download - Pipelines the requests to increase throughput - Actively avoids downloading duplicate blocks - The sync height is dynamically updated as new headers are discovered
This modifies the progress logger to expose a new method for logging the progress of header syncing and adds tests to ensure proper functionality.
This modifies netsync to provide progress logging for the initial headers sync including a progress percentage. A new method is added that determines the progress of the headers sync based on an algorithm that calculates the expected number of headers there should be in the time interval since the current best known header and the current time given the target block time for the network. This is the preferred approach because, unlike the sync height reported by remote peers, it is difficult to game since it is based on the target proof of work, but it assumes consistent mining, which is not the case on all networks, so it is limited to the main and test networks where that applies. For the other networks, it falls back to the sync height reported by the remote peers. The best known header is now logged at initial load time in blockchain and the netsync code is updated to make use of the new headers sync progress logger to tally the number of processed and periodically log the progress.
This modifies blockchain, progresslog, and netsync to provide a progress percentage when logging the main chain verification process. A new method is added to blockchain that determines the verification progress of the current best chain based on how far along it towards the best known header and blockchain is also modified to log the verification progress in the initial chain state message when the blockchain is first loaded. In addition, the progess logger is modified to log the verification progress instead of the timestamp of the last block and the netsync code is updated to make use of the new verification progress func per above. Finally, the netsync code now determines when the chain is considered current and logs new blocks as they are connected along with their hash and additional details. This provides a cleaner distinction between the initial verification and catchup phase and steady state operation.
This removes the response dealine for getheaders since there is no guaranteed response if the remote peer does not have any headers for the locator. Also, while here, modify the debug logging to only log a deadline was added when it actually was.
This modifies the RPC server getblockchaininfo to base the verification progress calculation on the best header now that all headers are determined prior to downloading and verifying blocks.
752e723
to
cbe730a
Compare
This overhauls the primary code that deals with synchronizing the blockchain with other peers on the network to use a model based on header announcements instead of inventory vectors.
Currently, all blocks are discovered via a combination of inventory vector announcements and the
getblocks
protocol message, both of which only include the hash of the blocks in question.This is not ideal because it means blocks are blindly downloaded based on those announcements without knowing if they're actually likely to be something that is useful since there is no further information such as what blocks they connect to. It also means that extra logic is needed to deal with orphan blocks (those whose parents are not yet known) such as caching them, determining known orphan roots, and expiration.
In short, the current method generally ends up wasting bandwidth and also makes it much more difficult to detect certain classes of malicious behavior.
The recently added capability of blockchain to process headers independently from blocks while the block data is added out of order later opened the door for a much better model that addresses all of the aforementioned issues as well as paves the way for many other related enhancements.
The new model discovers and downloads all of the headers prior to any block data via the
getheaders
protocol message and then uses those headers to determine exactly which blocks need to be downloaded to reach the tip of the best chain. Notably, this means orphan blocks are now a thing of the past as blocks that do not connect are no longer downloaded under any circumstance.Further, the new model also makes use of
sendheaders
so that all block announcements are made via the associated block header. This in turn is used to better determine if an announced block is likely to be useful prior to downloading it.It should be noted that the changes herein are intentionally limited to those necessary to use the new sync model based on headers in an incremental fashion to help simplify review and make it easier to assert correctness. There are many more improvements that this model paves the way to support planned for future commits. For example, syncing from multiple peers in parallel and improved DoS protection.
All of the changes are split up into a series of logic commits that describe each step to help ease the review process.
High-level overview of the key features:
This is work towards #1145.
Fixes #2252.