A pruning event deletes all tree nodes invalidated (orphaned) at an arbitrary prune version less than the tree's current version. Given the earliest version of the tree on disk, m, and a prune version, n, v2 calculates the set difference all_nodes(m -> n) - orphaned_at(n) and, importantly, writes the set to disk in a new shard. For a B+Tree, inserting is markedly faster than deleting. After completion the old shard(s) are dropped, and the complete version of the tree at version n is in shard n.
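The set-difference step can be sketched as follows. This is a minimal illustration, not the real iavl code: the `prune` function, the shard dicts, and the `(version, sequence)` keys are all hypothetical stand-ins.

```python
# Hypothetical sketch of the v2 prune step. Shards are modeled as dicts of
# nodes keyed by (version, sequence); none of these names are the real API.

def prune(shards, orphans, m, n):
    """Copy every node in shards m..n that is not orphaned at version n
    into a fresh shard keyed by n, then drop the old shards."""
    surviving = {
        key: node
        for v in range(m, n + 1)
        for key, node in shards[v].items()
        if key not in orphans  # the set difference: all_nodes(m -> n) - orphaned_at(n)
    }
    for v in range(m, n + 1):
        del shards[v]  # dropping whole shards is cheap compared to per-node deletes
    shards[n] = surviving
    return shards

# Toy data: two shards, with root@1 orphaned when version 2 replaced it.
shards = {1: {(1, 0): "root@1", (1, 1): "leaf-a"},
          2: {(2, 0): "root@2", (2, 1): "leaf-b"}}
shards = prune(shards, orphans={(1, 0)}, m=1, n=2)
print(sorted(shards[2]))  # → [(1, 1), (2, 0), (2, 1)]
```

Note that the survivors are only ever inserted into the new shard; nothing is deleted node-by-node, which is the point of the design.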
There are several opportunities for performance improvements in the current implementation.
Rate Limiting
The current rate-limiting mechanism is a global pruning limit. This works, but we can probably do better.
Right now pruning begins at prune_height + keep_versions even though a new shard is created at prune_height + 1. Pruning could instead start at prune_height and be rate limited (slowed way down) to reduce I/O pressure. This is probably as simple as using smaller batches with a configurable pause between them to hold a target insert rate.
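A throttle like that could be as simple as the sketch below. The function name and the `batch_size` / `target_rows_per_sec` knobs are hypothetical, not existing iavl configuration:

```python
import time

def rate_limited_insert(rows, insert_batch, batch_size=1000, target_rows_per_sec=10_000):
    """Insert rows in small batches, sleeping between batches so the
    sustained insert rate stays near target_rows_per_sec."""
    pause = batch_size / target_rows_per_sec  # seconds per batch at the target rate
    for i in range(0, len(rows), batch_size):
        insert_batch(rows[i:i + batch_size])
        time.sleep(pause)  # back off so pruning yields I/O to foreground writes
```

Smaller batches plus a pause trades total prune duration for lower, smoother I/O pressure, which matters because pruning runs concurrently with normal tree writes.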
SQLite JOIN
The current set difference calculation is CPU and memory intensive. It naively loads all orphans into a map and filters branch and leaf nodes against that map. This design keeps leaf and branch orphan writes fast: there is no index on the orphan tables. If there were, a SQLite JOIN statement could be used to populate the new shard, which would probably be vastly more efficient.
Pruning could build the orphan indexes on (version, sequence) and then execute the JOIN statements. Of course this still costs CPU and I/O, so it's not clear where the inflection point (in tree size) is for it to be more efficient, but it probably is in some cases.
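The index-then-anti-join idea looks roughly like this. The schema here (`tree`, `orphan`, `new_shard` tables and their columns) is invented for illustration and does not match iavl's actual tables:

```python
import sqlite3

# Hypothetical schema standing in for iavl's node and orphan tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tree (version INTEGER, sequence INTEGER, bytes BLOB);
    CREATE TABLE orphan (version INTEGER, sequence INTEGER);
    CREATE TABLE new_shard (version INTEGER, sequence INTEGER, bytes BLOB);
""")
con.executemany("INSERT INTO tree VALUES (?, ?, ?)",
                [(1, 0, b"root@1"), (1, 1, b"leaf-a"), (2, 0, b"root@2")])
con.execute("INSERT INTO orphan VALUES (1, 0)")  # root@1 replaced at version 2

# Build the index only at prune time, so normal orphan writes stay index-free.
con.execute("CREATE INDEX orphan_idx ON orphan (version, sequence)")

# Anti-join: copy every tree node with no matching orphan row into the new shard.
con.execute("""
    INSERT INTO new_shard
    SELECT t.version, t.sequence, t.bytes
    FROM tree t
    LEFT JOIN orphan o ON o.version = t.version AND o.sequence = t.sequence
    WHERE o.version IS NULL
""")
print(con.execute("SELECT version, sequence FROM new_shard ORDER BY 1, 2").fetchall())
# → [(1, 1), (2, 0)]
```

Building the index lazily preserves the cheap orphan-write path during normal operation, while letting SQLite do the set difference without materializing all orphans in application memory.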