Include base -> overlay key-values migration logic #199

Merged
149 changes: 139 additions & 10 deletions core/state_processor.go
@@ -18,8 +18,10 @@ package core

import (
"bytes"
"encoding/binary"
"fmt"
"math/big"
"time"

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/consensus"
@@ -33,6 +35,9 @@ import (
"github.com/ethereum/go-ethereum/log"
"github.com/ethereum/go-ethereum/params"
"github.com/ethereum/go-ethereum/trie"
tutils "github.com/ethereum/go-ethereum/trie/utils"
"github.com/gballet/go-verkle"
"github.com/holiman/uint256"
)

// StateProcessor is a basic Processor, which takes care of transitioning
@@ -96,6 +101,7 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg
// N values from the MPT into the verkle tree.
if fdb, ok := statedb.Database().(*state.ForkingDB); ok {
if fdb.InTransition() {
now := time.Now()
// XXX overkill, just save the parent root in the forking db
tt := statedb.GetTrie().(*trie.TransitionTrie)
mpt := tt.Base()
@@ -109,17 +115,22 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg
return nil, nil, 0, err
}

// move N=500 accounts into the verkle tree, starting with the
const maxMovedCount = 500
Collaborator Author

Only this offtopic nit.

// mkv helps collect up to maxMovedCount key-values to be migrated to the VKT.
// It has internal caches to do efficient MPT->VKT key calculations, which will be discarded after
// this function.
// The map must be initialized up front: writing to a nil map panics.
mkv := &keyValueMigrator{vktLeafData: make(map[string]*verkle.BatchNewLeafNodeData)}
Comment on lines +119 to +122
Collaborator Author

keyValueMigrator is a component that encapsulates capturing the relevant key-values encountered during the walk. I did this to keep the walk clean and let the component do the work. I'll comment on it later.

// move up to maxMovedCount accounts into the verkle tree, starting with the
// slots from the previous account.
count := 0
for ; stIt.Next() && count < 500; count++ {
addr := rawdb.ReadPreimage(statedb.Database().DiskDB(), accIt.Hash())
for ; stIt.Next() && count < maxMovedCount; count++ {
slotnr := rawdb.ReadPreimage(statedb.Database().DiskDB(), stIt.Hash())

// @jsign: do your magic here adding the slot `slotnr`
mkv.addStorageSlot(addr, slotnr, stIt.Slot())
}

// if less than 500 slots were moved, move to the next account
for count < 500 {
// if fewer than maxMovedCount slots were moved, move to the next account
for count < maxMovedCount {
if accIt.Next() {
acc, err := snapshot.FullAccount(accIt.Account())
if err != nil {
@@ -128,21 +139,21 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg
}
addr := rawdb.ReadPreimage(statedb.Database().DiskDB(), accIt.Hash())

// @jsign: do your magic here adding the account at `addr
mkv.addAccount(addr, acc)

// Store the account code if present
if !bytes.Equal(acc.CodeHash, emptyCode) {
code := rawdb.ReadCode(statedb.Database().DiskDB(), common.BytesToHash(acc.CodeHash))
chunks := trie.ChunkifyCode(code)

// @jsign: do your magic here with the code chunks
mkv.addAccountCode(addr, uint64(len(code)), chunks)
}

if !bytes.Equal(acc.Root, emptyRoot[:]) {
for ; stIt.Next() && count < 500; count++ {
for ; stIt.Next() && count < maxMovedCount; count++ {
slotnr := rawdb.ReadPreimage(statedb.Database().DiskDB(), stIt.Hash())

// @jsign do your magic here with extra slots
mkv.addStorageSlot(addr, slotnr, stIt.Slot())
}
Comment on lines -131 to 157
Collaborator Author

TL;DR: we simply "collect" the key values and let mkv capture them.

}
}
@@ -157,6 +168,13 @@ func (p *StateProcessor) Process(block *types.Block, statedb *state.StateDB, cfg
fdb.LastAccHash = accIt.Hash()
fdb.LastSlotHash = stIt.Hash()
}
log.Info("Collected and prepared key values from base tree", "count", count, "duration", time.Since(now))

now = time.Now()
if err := mkv.migrateCollectedKeyValues(tt.Overlay()); err != nil {
return nil, nil, 0, fmt.Errorf("could not migrate key values: %w", err)
}
Comment on lines +174 to +176
Collaborator Author

Once collection finishes, the component uses the new go-verkle APIs to batch-create leaf nodes and insert them into the tree. More about this below.
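
To illustrate the collection step described in this comment, here is a minimal, self-contained sketch of grouping 32-byte tree keys by stem before a batch insert. It is not the PR's code: `leafData`, `groupByStem`, and `stemSize` are hypothetical stand-ins (the latter assumed to mirror `verkle.StemSize`), and no go-verkle call is made.

```go
package main

import "fmt"

// stemSize is assumed here to mirror verkle.StemSize (31 bytes).
const stemSize = 31

// leafData is a hypothetical stand-in for verkle.BatchNewLeafNodeData.
type leafData struct {
	Stem   []byte
	Values map[byte][]byte
}

// groupByStem buckets 32-byte tree keys by their first stemSize bytes.
// The last byte of each key selects the slot inside the bucket.
func groupByStem(keys [][32]byte, values [][]byte) map[string]*leafData {
	groups := make(map[string]*leafData)
	for i, key := range keys {
		stem := string(key[:stemSize])
		ld, ok := groups[stem]
		if !ok {
			ld = &leafData{Stem: []byte(stem), Values: make(map[byte][]byte)}
			groups[stem] = ld
		}
		ld.Values[key[stemSize]] = values[i]
	}
	return groups
}

func main() {
	var k1, k2, k3 [32]byte
	k2[31] = 1    // same stem as k1, different suffix
	k3[0] = 0xff  // different stem
	groups := groupByStem([][32]byte{k1, k2, k3}, [][]byte{{1}, {2}, {3}})
	fmt.Println("leaf groups:", len(groups))
}
```

Grouping first means each stem produces exactly one leaf node, so the expensive commitment work can be done once per leaf rather than once per key.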

log.Info("Inserted key values in overlay tree", "count", count, "duration", time.Since(now))
}
}

@@ -232,3 +250,114 @@ func ApplyTransaction(config *params.ChainConfig, bc ChainContext, author *commo
vmenv := vm.NewEVM(blockContext, vm.TxContext{}, statedb, config, cfg)
return applyTransaction(msg, config, author, gp, statedb, header.Number, header.Hash(), tx, usedGas, vmenv)
}

// keyValueMigrator is a helper struct that collects key-values from the base tree.
// The walk is done in account order, so **we assume** the APIs hold this invariant. This is
// useful to be smart about caching banderwagon.Points to make VKT key calculations faster.
type keyValueMigrator struct {
currAddr []byte
currAddrPoint *verkle.Point

vktLeafData map[string]*verkle.BatchNewLeafNodeData
}
Comment on lines +254 to +262
Collaborator Author

There shouldn't be many surprises here.
The TL;DR is that since we walk the tree by address, we keep the current address point cached and only update it when a new account is detected. This makes sense considering how the caller uses this component.

We keep collecting key-values in vktLeafData, which we use at the end to batch-create leaf nodes and insert them into the tree.

The method implementations shouldn't be surprising. They do something similar to what we did in the full tree conversion. Note that I'm sharing this as soon as I finished writing the code, so something might be missing; please double-check.

I tried to use as many defined constants as possible to avoid magic indexes, sizes, etc.

(As mentioned in the PR description, there are some potential changes to make this faster. But I think this is reasonable and clear enough as a starting point; we can see later whether we need to push the limits further to justify a more complex version.)
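
The single-entry cache described in this comment can be sketched in isolation as follows. This is an illustrative sketch only: `pointCache` is a hypothetical type, and `expensivePoint` is a stand-in (a SHA-256 hash) for the real address-point evaluation, which is not reproduced here.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// pointCache remembers the result for the most recently seen address.
// It only pays off when lookups for the same address arrive back to back,
// which is the case when the caller walks the state in account order.
type pointCache struct {
	lastAddr  []byte
	lastPoint [32]byte
	evals     int // how many times the expensive function actually ran
}

// expensivePoint is a hypothetical stand-in for the address-point evaluation.
func expensivePoint(addr []byte) [32]byte { return sha256.Sum256(addr) }

func (c *pointCache) get(addr []byte) [32]byte {
	if c.lastAddr != nil && bytes.Equal(addr, c.lastAddr) {
		return c.lastPoint
	}
	// Copy the address so later mutation by the caller cannot corrupt the cache.
	c.lastAddr = append([]byte(nil), addr...)
	c.lastPoint = expensivePoint(addr)
	c.evals++
	return c.lastPoint
}

func main() {
	c := &pointCache{}
	a, b := []byte{0x01}, []byte{0x02}
	for _, addr := range [][]byte{a, a, a, b, b} {
		c.get(addr)
	}
	fmt.Println("evaluations:", c.evals)
}
```

A one-entry cache is enough here because the walk never returns to an earlier account; a general-purpose map would add memory without improving the hit rate.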


func (kvm *keyValueMigrator) addStorageSlot(addr []byte, slotNumber []byte, slotValue []byte) {
addrPoint := kvm.getAddrPoint(addr)

vktKey := tutils.GetTreeKeyStorageSlotWithEvaluatedAddress(addrPoint, slotNumber)
leafNodeData := kvm.getOrInitLeafNodeData(vktKey)

leafNodeData.Values[vktKey[verkle.StemSize]] = slotValue
}

func (kvm *keyValueMigrator) addAccount(addr []byte, acc snapshot.Account) {
addrPoint := kvm.getAddrPoint(addr)

vktKey := tutils.GetTreeKeyVersionWithEvaluatedAddress(addrPoint)
leafNodeData := kvm.getOrInitLeafNodeData(vktKey)

var version [verkle.LeafValueSize]byte
leafNodeData.Values[tutils.VersionLeafKey] = version[:]

var balance [verkle.LeafValueSize]byte
for i, b := range acc.Balance.Bytes() {
balance[len(acc.Balance.Bytes())-1-i] = b
}
leafNodeData.Values[tutils.BalanceLeafKey] = balance[:]

var nonce [verkle.LeafValueSize]byte
binary.LittleEndian.PutUint64(nonce[:8], acc.Nonce)
leafNodeData.Values[tutils.NonceLeafKey] = nonce[:]

leafNodeData.Values[tutils.CodeKeccakLeafKey] = acc.CodeHash[:]

// Code size is ignored here. If this isn't an EOA, the tree-walk will call
// addAccountCode with this information.
}

func (kvm *keyValueMigrator) addAccountCode(addr []byte, codeSize uint64, chunks []byte) {
addrPoint := kvm.getAddrPoint(addr)

vktKey := tutils.GetTreeKeyVersionWithEvaluatedAddress(addrPoint)
leafNodeData := kvm.getOrInitLeafNodeData(vktKey)

// Save the code size.
var codeSizeBytes [verkle.LeafValueSize]byte
binary.LittleEndian.PutUint64(codeSizeBytes[:8], codeSize)
leafNodeData.Values[tutils.CodeSizeLeafKey] = codeSizeBytes[:]

// The first 128 chunks are stored in the account header leaf.
for i := 0; i < 128 && i < len(chunks)/32; i++ {
leafNodeData.Values[byte(128+i)] = chunks[32*i : 32*(i+1)]
}

// Any further chunks have their own leaf nodes.
for i := 128; i < len(chunks)/32; {
vktKey := tutils.GetTreeKeyCodeChunkWithEvaluatedAddress(addrPoint, uint256.NewInt(uint64(i)))
leafNodeData := kvm.getOrInitLeafNodeData(vktKey)

j := i
for ; (j-i) < 256 && j < len(chunks)/32; j++ {
leafNodeData.Values[byte((j-128)%256)] = chunks[32*j : 32*(j+1)]
}
i = j
}
}

func (kvm *keyValueMigrator) getAddrPoint(addr []byte) *verkle.Point {
if bytes.Equal(addr, kvm.currAddr) {
return kvm.currAddrPoint
}
kvm.currAddr = addr
kvm.currAddrPoint = tutils.EvaluateAddressPoint(addr)
return kvm.currAddrPoint
}

func (kvm *keyValueMigrator) getOrInitLeafNodeData(key []byte) *verkle.BatchNewLeafNodeData {
// Callers pass the full tree key; group by its stem so that all values
// sharing a stem end up in the same leaf node.
stem := key[:verkle.StemSize]
stemStr := string(stem)
if _, ok := kvm.vktLeafData[stemStr]; !ok {
kvm.vktLeafData[stemStr] = &verkle.BatchNewLeafNodeData{
Stem: stem,
Values: make(map[byte][]byte),
}
}
return kvm.vktLeafData[stemStr]
}

func (kvm *keyValueMigrator) migrateCollectedKeyValues(tree *trie.VerkleTrie) error {
// Transform the map into a slice.
nodeValues := make([]verkle.BatchNewLeafNodeData, 0, len(kvm.vktLeafData))
for _, vld := range kvm.vktLeafData {
nodeValues = append(nodeValues, *vld)
}

// Create all leaves in batch mode so we can optimize cryptography operations.
newLeaves := verkle.BatchNewLeafNode(nodeValues)

// Insert into the tree.
if err := tree.InsertMigratedLeaves(newLeaves); err != nil {
return fmt.Errorf("failed to insert migrated leaves: %w", err)
}

return nil
}
Comment on lines +347 to +363
Collaborator Author

This is what we called in L173. Basically, we use the new APIs to efficiently create leaves and insert into the tree.
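
As a side note on `addAccountCode` above, the way code chunks are spread across leaves can be sketched as a small index calculation. This is a sketch under assumptions: `codeOffset` and `nodeWidth` are hypothetical names for the 128 and 256 constants used in the loops, and the returned "group" number is only illustrative, since the real stem is derived by the tree-key function rather than by this arithmetic.

```go
package main

import "fmt"

const (
	codeOffset = 128 // assumed: first slot of the header leaf used for code chunks
	nodeWidth  = 256 // assumed: number of value slots per leaf node
)

// chunkLocation returns which leaf group a code chunk lands in (0 is the
// account header leaf) and the slot index within that leaf.
func chunkLocation(chunk uint64) (group uint64, slot byte) {
	pos := codeOffset + chunk
	return pos / nodeWidth, byte(pos % nodeWidth)
}

func main() {
	for _, c := range []uint64{0, 127, 128, 383, 384} {
		g, s := chunkLocation(c)
		fmt.Printf("chunk %d -> group %d, slot %d\n", c, g, s)
	}
}
```

Note that `(j-128)%256` in the diff and `(128+j)%256` in this sketch give the same slot, because the two expressions differ by exactly 256.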

2 changes: 1 addition & 1 deletion go.mod
@@ -23,7 +23,7 @@ require (
github.com/fjl/gencodec v0.0.0-20220412091415-8bb9e558978c
github.com/fjl/memsize v0.0.0-20190710130421-bcb5799ab5e5
github.com/gballet/go-libpcsclite v0.0.0-20190607065134-2772fd86a8ff
github.com/gballet/go-verkle v0.0.0-20230414192453-2838510d5ee0
github.com/gballet/go-verkle v0.0.0-20230424151626-de802a6b19f8
github.com/go-stack/stack v1.8.0
github.com/golang-jwt/jwt/v4 v4.3.0
github.com/golang/protobuf v1.5.2
16 changes: 2 additions & 14 deletions go.sum
@@ -86,8 +86,6 @@ github.com/consensys/gnark-crypto v0.4.1-0.20210426202927-39ac3d4b3f1f/go.mod h1
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.2 h1:p1EgwI/C7NhT0JmVkwCD2ZBK8j4aeHQX2pMHHBfMQ6w=
github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/crate-crypto/go-ipa v0.0.0-20230315201338-1643fdc2ead8 h1:2EBbIwPDRqlCD2K34Eojyy0x9d3RhOuHAZfbQm508X8=
github.com/crate-crypto/go-ipa v0.0.0-20230315201338-1643fdc2ead8/go.mod h1:gzbVz57IDJgQ9rLQwfSk696JGWof8ftznEL9GoAv3NI=
github.com/crate-crypto/go-ipa v0.0.0-20230410135559-ce4a96995014 h1:bbyTlFQ12wkFA6aVL+9HrBZwVl85AN0VS/Bwam7o93U=
github.com/crate-crypto/go-ipa v0.0.0-20230410135559-ce4a96995014/go.mod h1:gzbVz57IDJgQ9rLQwfSk696JGWof8ftznEL9GoAv3NI=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
@@ -137,16 +135,8 @@ github.com/garslo/gogen v0.0.0-20170306192744-1d203ffc1f61 h1:IZqZOB2fydHte3kUgx
github.com/garslo/gogen v0.0.0-20170306192744-1d203ffc1f61/go.mod h1:Q0X6pkwTILDlzrGEckF6HKjXe48EgsY/l7K7vhY4MW8=
github.com/gballet/go-libpcsclite v0.0.0-20190607065134-2772fd86a8ff h1:tY80oXqGNY4FhTFhk+o9oFHGINQ/+vhlm8HFzi6znCI=
github.com/gballet/go-libpcsclite v0.0.0-20190607065134-2772fd86a8ff/go.mod h1:x7DCsMOv1taUwEWCzT4cmDeAkigA5/QCwUodaVOe8Ww=
github.com/gballet/go-verkle v0.0.0-20230317174103-141354da6b11 h1:x4hiQFgr1SlqR4IoAZiXLFZK4L7KbibqkORqa1fwKp8=
github.com/gballet/go-verkle v0.0.0-20230317174103-141354da6b11/go.mod h1:IyOnn1kujMWaT+wet/6Ix1BtvYwateOBy9puuWH/8sw=
github.com/gballet/go-verkle v0.0.0-20230412090410-4015adc3d072 h1:gKcktHMBKLdtCSZnaG8tv9bFG80p1tp7MjU1Uvl9nag=
github.com/gballet/go-verkle v0.0.0-20230412090410-4015adc3d072/go.mod h1:P3bwGrLhsUNIsUDlq2yzMPvO1c/15oiB3JS85P+hNfw=
github.com/gballet/go-verkle v0.0.0-20230413104310-bd8d6d33de95 h1:s8p8L/dQVmr/mgMjGIsGnnpvJMYCdfv4GHadLd/ALug=
github.com/gballet/go-verkle v0.0.0-20230413104310-bd8d6d33de95/go.mod h1:P3bwGrLhsUNIsUDlq2yzMPvO1c/15oiB3JS85P+hNfw=
github.com/gballet/go-verkle v0.0.0-20230413135631-4bea2763ed0f h1:gP4uR2/1qx6hsIzbRI28JWcsVuP7xyjyj6SpLnoFobc=
github.com/gballet/go-verkle v0.0.0-20230413135631-4bea2763ed0f/go.mod h1:P3bwGrLhsUNIsUDlq2yzMPvO1c/15oiB3JS85P+hNfw=
github.com/gballet/go-verkle v0.0.0-20230414192453-2838510d5ee0 h1:ENyj6hcn+dtO8iJ1GTzM/gkhdrAFqMi65Yf99cppdPA=
github.com/gballet/go-verkle v0.0.0-20230414192453-2838510d5ee0/go.mod h1:P3bwGrLhsUNIsUDlq2yzMPvO1c/15oiB3JS85P+hNfw=
github.com/gballet/go-verkle v0.0.0-20230424151626-de802a6b19f8 h1:UHRmPcIjYxqcS070yG9OVbr9aPfC/ToIBwRakFxQC9Y=
github.com/gballet/go-verkle v0.0.0-20230424151626-de802a6b19f8/go.mod h1:P3bwGrLhsUNIsUDlq2yzMPvO1c/15oiB3JS85P+hNfw=
github.com/getkin/kin-openapi v0.53.0/go.mod h1:7Yn5whZr5kJi6t+kShccXS8ae1APpYTW6yheSwk8Yi4=
github.com/getkin/kin-openapi v0.61.0/go.mod h1:7Yn5whZr5kJi6t+kShccXS8ae1APpYTW6yheSwk8Yi4=
github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04=
@@ -564,8 +554,6 @@ golang.org/x/sys v0.0.0-20211019181941-9d821ace8654/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20211020174200-9d6173849985/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220919091848-fb04ddd9f9c8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0 h1:MVltZSvRTcU2ljQOhs94SXPftV6DCNnZViHeQps87pQ=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.7.0 h1:3jlCCIQZPdOYu1h8BkNvLz8Kgwtae2cagcG/VamtZRU=
golang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201117132131-f5c789dd3221/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw=
5 changes: 5 additions & 0 deletions trie/transition.go
@@ -38,6 +38,11 @@ func (t *TransitionTrie) Base() *SecureTrie {
return t.base
}

// TODO(gballet/jsign): consider removing this API.
func (t *TransitionTrie) Overlay() *VerkleTrie {
return t.overlay
}
Comment on lines +42 to +44
Collaborator Author

Needed for core/state_processor.go L173. Double check!

Owner

Is it needed? We could also just implement all the required functions on top of TransitionTrie. I'm not sure which option is best yet; I'm just not a big fan of exposing too much of the internal structure.

Owner

I guess it's simpler to do it this way for now. Could you please add a TODO in a comment so that we remember to check this when we merge the final PR?

Collaborator Author

The "problem" with making it an API in TransitionTrie is that the methods are only Verkle-related and not conceptually generic. Still, it might be a good idea. It's also true that "opening" the box indirectly gives access to all the APIs.

I'll add the TODO to remember about this, yes.

Collaborator Author

Done.


// GetKey returns the sha3 preimage of a hashed key that was previously used
// to store a value.
//
6 changes: 6 additions & 0 deletions trie/verkle.go
@@ -50,6 +50,12 @@ func NewVerkleTrie(root verkle.VerkleNode, db *Database, pointCache *utils.Point
}
}

func (trie *VerkleTrie) InsertMigratedLeaves(leaves []verkle.LeafNode) error {
return trie.root.(*verkle.InternalNode).InsertMigratedLeaves(leaves, func(hash []byte) ([]byte, error) {
return trie.db.diskdb.Get(hash)
})
}
Comment on lines +53 to +57
Collaborator Author

Just a trampoline to use the new API, avoiding magic casts and layer skipping.

This should be fine for now. Whenever we reconsider ethereum/go-verkle#314, this method may go away if that approach can be as fast as the new API.


var errInvalidProof = errors.New("invalid proof")

// GetKey returns the sha3 preimage of a hashed key that was previously used