COW in LeafNodes #314

Draft
wants to merge 1 commit into
base: master
7 changes: 5 additions & 2 deletions proof_test.go
@@ -34,6 +34,8 @@ import (
)

func TestProofVerifyTwoLeaves(t *testing.T) {
cfg := GetConfig()

root := New()
root.Insert(zeroKeyTest, zeroKeyTest, nil)
root.Insert(oneKeyTest, zeroKeyTest, nil)
@@ -42,7 +44,6 @@ func TestProofVerifyTwoLeaves(t *testing.T) {

proof, cis, zis, yis, _ := MakeVerkleMultiProof(root, [][]byte{ffx32KeyTest}, map[string][]byte{string(ffx32KeyTest): zeroKeyTest})

cfg := GetConfig()
if !VerifyVerkleProof(proof, cis, zis, yis, cfg) {
t.Fatalf("could not verify verkle proof: %s", ToDot(root))
}
@@ -176,6 +177,7 @@ func TestProofOfAbsenceLeafVerify(t *testing.T) {
t.Fatal("could not verify verkle proof")
}
}

func TestProofOfAbsenceLeafVerifyOtherSuffix(t *testing.T) {
root := New()
root.Insert(zeroKeyTest, zeroKeyTest, nil)
@@ -212,6 +214,7 @@ func TestProofOfAbsenceStemVerify(t *testing.T) {
}

func BenchmarkProofCalculation(b *testing.B) {
_ = GetConfig()
keys := make([][]byte, 100000)
root := New()
for i := 0; i < 100000; i++ {
@@ -220,6 +223,7 @@ func BenchmarkProofCalculation(b *testing.B) {
keys[i] = key
root.Insert(key, zeroKeyTest, nil)
}
root.Commit()

b.ResetTimer()
b.ReportAllocs()
@@ -370,7 +374,6 @@ func TestProofDeserialize(t *testing.T) {
}

func TestProofDeserializeErrors(t *testing.T) {

deserialized, err := DeserializeProof([]byte{0}, nil)
if err == nil {
t.Fatal("deserializing invalid proof didn't cause an error")
3 changes: 2 additions & 1 deletion stateless_test.go
@@ -48,7 +48,7 @@ func TestStatelessChildren(t *testing.T) {
t.Fatal("invalid list length")
}

var emptycount = 0
emptycount := 0
for _, v := range list {
if _, ok := v.(Empty); ok {
emptycount++
@@ -377,6 +377,7 @@ func TestStatelessDeserializeDepth2(t *testing.T) {
}

func TestStatelessGetProofItems(t *testing.T) {
_ = GetConfig()
insertedKeys := [][]byte{zeroKeyTest, oneKeyTest, ffx32KeyTest}
provenKeys := [][]byte{zeroKeyTest, fourtyKeyTest}

148 changes: 98 additions & 50 deletions tree.go
@@ -29,6 +29,7 @@ import (
"bytes"
"errors"
"fmt"
"sync"

"github.com/crate-crypto/go-ipa/banderwagon"
)
@@ -187,6 +188,7 @@ type (

commitment *Point
c1, c2 *Point
cow map[byte][]byte
Collaborator Author

OK, so this is the "main" change of this second version of the PR: we apply the "COW" idea in LeafNodes.

If we have never calculated (*LeafNode).commitment, then (*LeafNode).Commit() does the same as before: it takes the whole slice of values and calculates the commitment.

But if (*LeafNode).commitment != nil, then every time we touch that leaf node we record the previous value in (*LeafNode).cow and update (*LeafNode).values with the new value. When the user calls Commit(...), the code notices that there is a previous (*LeafNode).commitment != nil and a (*LeafNode).cow with tracked changes, and does the diff-updating.

In summary: previously, Delete(...) and Insert*(...) did the diff-updating immediately. Now they only record the change in (*LeafNode).cow, and the real diff-updating happens when Commit() is called.

So if the client updates the same leaf value multiple times, we avoid diff-updates that would be overwritten anyway. We do the work once, when Commit() is called.

This also centralizes the *LeafNode commitment-updating logic in a single place, Commit(). We no longer have commitment calculations spread across NewLeafNode(..), Insert*(...), and Delete(...).

This replaces the if n.commitment == nil { n.Commit() } pattern from the previous version of this PR, which we wanted to get rid of.
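
To make the flow concrete, here is a minimal, self-contained sketch of the same idea on a toy type. Everything in it (toyLeaf, weight, the integer "commitment") is made up for illustration and is not the code in this PR; the weighted sum only stands in for a linear commitment.

```go
package main

import "fmt"

// toyLeaf is a hypothetical stand-in for a leaf node. Its "commitment" is a
// plain weighted sum, which is linear like a vector commitment but has no
// cryptographic meaning.
type toyLeaf struct {
	values     [4]int64
	cow        map[int]int64 // index -> value at the time of the last Commit
	commitment *int64        // nil until the first Commit
	diffOps    int           // number of diff updates performed
}

// weight plays the role of the i-th basis point.
func weight(i int) int64 { return int64(i + 1) }

// Set stores a new value. It does no commitment work; it only remembers the
// committed value the first time an index is touched after a Commit.
func (l *toyLeaf) Set(i int, v int64) {
	if l.commitment != nil {
		if l.cow == nil {
			l.cow = make(map[int]int64)
		}
		if _, seen := l.cow[i]; !seen {
			l.cow[i] = l.values[i]
		}
	}
	l.values[i] = v
}

// Commit does a full computation the first time and a diff update afterwards.
func (l *toyLeaf) Commit() int64 {
	if l.commitment == nil {
		var c int64
		for i, v := range l.values {
			c += v * weight(i)
		}
		l.commitment = &c
	} else {
		for i, old := range l.cow {
			if old != l.values[i] {
				*l.commitment += (l.values[i] - old) * weight(i)
				l.diffOps++
			}
		}
		l.cow = nil
	}
	return *l.commitment
}

func main() {
	var leaf toyLeaf
	leaf.Set(0, 5)
	fmt.Println(leaf.Commit()) // full computation: 5*1 = 5

	// Three writes to the same slot cost a single diff update at Commit time.
	leaf.Set(1, 10)
	leaf.Set(1, 20)
	leaf.Set(1, 30)
	fmt.Println(leaf.Commit(), leaf.diffOps) // 65 1
}
```

Note that between the last Set and the next Commit the stored commitment is stale, which is the price of deferring the work.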

Collaborator Author

The comment above is the complete picture of how COW works for leaf nodes. In the comments below I'll dive into some extra details about where things happen.


depth byte
}
@@ -219,23 +221,6 @@ func NewLeafNode(stem []byte, values [][]byte) *LeafNode {
c2: Generator(),
}

// Initialize the commitment with the extension tree
Member

If you remove the commitment calculation, shouldn't the cow field be initialized with values at this point?

Collaborator Author

No, because commitment is nil in this case. Commit() treats that as a signal that there is no previous commitment to diff-update against, so it does the usual full computation over the whole vector.

That means we don't need to initialize cow here. cow is initialized only when there is a previous commitment and some value of the leaf changes. In that case we track the old value in cow, since we will do a diff-update in the next Commit().

// marker and the stem.
cfg := GetConfig()
count := 0
var poly, c1poly, c2poly [256]Fr
poly[0].SetUint64(1)
StemFromBytes(&poly[1], leaf.stem)

count = fillSuffixTreePoly(c1poly[:], values[:128])
leaf.c1 = cfg.CommitToPoly(c1poly[:], 256-count)
toFr(&poly[2], leaf.c1)
count = fillSuffixTreePoly(c2poly[:], values[128:])
leaf.c2 = cfg.CommitToPoly(c2poly[:], 256-count)
toFr(&poly[3], leaf.c2)

leaf.commitment = cfg.CommitToPoly(poly[:], 252)

return leaf
}

@@ -844,7 +829,7 @@ func (n *LeafNode) updateC(index byte, c *Point, oldc *Fr) {
n.commitment.Add(n.commitment, &diff)
}

func (n *LeafNode) updateCn(index byte, value []byte, c *Point) {
func (n *LeafNode) updateCn(index byte, oldValue []byte, c *Point) {
Collaborator Author

As part of the leaf COW change, updateCn now works slightly differently. Instead of receiving the new value and reading the old one from n.values, it does the opposite.

It receives the old value (now from cow) and uses the new value, which is already stored in n.values. So the logic is the same; the only thing that changes is where the old and new values come from.
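
For reference, the diff update relies on the commitment being linear in the committed values. A sketch in generic notation follows. The symbols $G_j$, $p_j$, $\ell$ and $h$ are illustrative and do not appear in the code. It assumes, as the [2]Fr arrays and the poly[2*(index%128)] indexing suggest, that each value is split into two field elements that presumably occupy consecutive slots:

```latex
C_n = \sum_{j=0}^{255} p_j \, G_j
\quad\Longrightarrow\quad
C_n' = C_n + (\ell' - \ell)\, G_{2k} + (h' - h)\, G_{2k+1},
\qquad k = \mathrm{index} \bmod 128
```

Here $(\ell, h)$ would be derived from the old value (oldValue, taken from cow) and $(\ell', h')$ from the new value (n.values[index]). Swapping the two arguments would flip the sign of the difference, which is why the parameter was renamed rather than just re-sourced.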

var (
old, newH [2]Fr
diff Point
@@ -857,8 +842,8 @@ func (n *LeafNode) updateCn(index byte, value []byte, c *Point) {
// do not include it. The result should be the same,
// but the computation time should be faster as one doesn't need to
// compute 1 - 1 mod N.
leafToComms(old[:], n.values[index])
leafToComms(newH[:], value)
leafToComms(old[:], oldValue)
leafToComms(newH[:], n.values[index])

newH[0].Sub(&newH[0], &old[0])
poly[2*(index%128)] = newH[0]
@@ -873,42 +858,36 @@ func (n *LeafNode) updateCn(index byte, value []byte, c *Point) {
}

func (n *LeafNode) updateLeaf(index byte, value []byte) {
c, oldc := n.getOldCn(index)

n.updateCn(index, value, c)
// If we haven't calculated a commitment for this node, we don't need to create the cow map since all the
// previous values are empty. If we already have a calculated commitment, then we track new values in
// cow so we can do diff-updating in the next Commit().
if n.commitment != nil {
// If cow was never setup, then initialize the map.
if n.cow == nil {
n.cow = make(map[byte][]byte)
}

n.updateC(index, c, oldc)
// If we are touching a value at an index for the first time,
// we save the original value for future use to update commitments.
if _, ok := n.cow[index]; !ok {
if n.values[index] == nil {
n.cow[index] = nil
} else {
n.cow[index] = make([]byte, 32)
copy(n.cow[index], n.values[index])
}
}
}

n.values[index] = value
}
Comment on lines 860 to 883
Collaborator Author

OK, here's one half of the meat of leaf COWing.

We remove the current diff-updating work, so nothing heavy is done when VKT APIs are called (e.g. insert, delete). We only do work in Commit() (explained below).

What we do here now is what was explained in the TL;DR before:

  • If n.commitment == nil, we don't create the COW map, since there are no meaningful "previous values" to track. It wouldn't add anything, and we avoid allocating a map full of nil previous entries.
  • If n.commitment != nil, we have already done a Commit(..) on this leaf node. So we create the COW map, record the previous values there, and store the new ones in values. This is the information Commit(...) needs to do the diff-updating.
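
One detail of the tracking step worth spelling out is that a nil entry in the map is not the same as a missing entry. A small sketch, where trackOld is a hypothetical helper written only for this illustration and not a function in the PR:

```go
package main

import "fmt"

// trackOld is a hypothetical helper that mirrors the first-touch rule: it
// records the current value of a slot only if the slot has no entry yet.
func trackOld(cow map[byte][]byte, index byte, current []byte) {
	if _, ok := cow[index]; ok {
		return // already tracked; keep the value from the last Commit
	}
	if current == nil {
		cow[index] = nil // the slot was empty, but it is now marked as touched
		return
	}
	// Copy so that later in-place changes to current cannot alter the record.
	saved := make([]byte, len(current))
	copy(saved, current)
	cow[index] = saved
}

func main() {
	cow := make(map[byte][]byte)

	trackOld(cow, 3, nil)          // slot 3 was empty at the last Commit
	trackOld(cow, 3, []byte{0xaa}) // second write: must not overwrite the record

	old, touched := cow[3]
	fmt.Println(old == nil, touched) // true true

	_, touched = cow[4]
	fmt.Println(touched) // false
}
```

The comma-ok lookup is what distinguishes "touched, and previously empty" from "never touched", so the first recorded value survives any number of later writes.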


func (n *LeafNode) updateMultipleLeaves(values [][]byte) {
var c1, c2 *Point
var old1, old2 *Fr
for i, v := range values {
if len(v) != 0 && !bytes.Equal(v, n.values[i]) {
if i < 128 {
if c1 == nil {
c1, old1 = n.getOldCn(byte(i))
}
n.updateCn(byte(i), v, c1)
} else {
if c2 == nil {
c2, old2 = n.getOldCn(byte(i))
}
n.updateCn(byte(i), v, c2)
}

n.values[i] = v
for i := range values {
if values[i] != nil {
n.updateLeaf(byte(i), values[i])
}
}

if c1 != nil {
n.updateC(0, c1, old1)
}
if c2 != nil {
n.updateC(128, c2, old2)
}
}
Comment on lines -886 to 891
Collaborator Author

updateMultipleLeaves(...) got simplified, so this logic disappeared.
Updating multiple leaves now simply calls updateLeaf(...) (explained above) for each value, i.e. it only tracks value changes.


func (n *LeafNode) InsertOrdered(key []byte, value []byte, _ NodeFlushFn) error {
@@ -957,8 +936,77 @@ func (n *LeafNode) Commitment() *Point {
return n.commitment
}

func (n *LeafNode) Commit() *Point {
return n.commitment
var frPool = sync.Pool{
New: func() any {
ret := make([]Fr, NodeWidth)
return &ret
},
}

func (leaf *LeafNode) Commit() *Point {
Collaborator Author

OK, so here we have the new version of (*LeafNode).Commit().
It does what was explained in the TL;DR before, but I'll repeat some of it in the comments below.

// If we've never calculated a commitment for this leaf node, we calculate the commitment
// in a single shot considering all the values.
if leaf.commitment == nil {
Comment on lines +947 to +949
Collaborator Author

As explained in the code comment I added, if this leaf node was never Commit(..)ed before, we use the known logic and commit to all of leaf.values in one polynomial commitment.

In theory we could start from an "empty" commitment and apply diff-updates on top, but that's quite slow. If we have a bunch of values that are all new (since we have never committed before), we do the whole calculation in a single polynomial commitment. The code in this "if" block is the same as in the previous version of this PR, so there is nothing new to review here.

Member

Could that not be simplified by setting the commitment to zero and then applying the CoW logic below to it? This should be equivalent, unless I missed something?

Collaborator Author

@jsign jsign Jan 13, 2023

Yes, I went with that approach initially, since it would avoid an "if" case here. Unfortunately, it was slower.

If you have a calculated commitment and 10 values to update, you have to do 10 diff commitments, which is slower than doing a single 10-value polynomial commitment.

In fact, by the same logic, if cow shows that you changed all 256 values, we probably shouldn't be diff-updating either. It might be useful to find out at what point diff-updating becomes slower than redoing the full calculation.
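
If such a cut-off were found, the decision could look roughly like the sketch below. This is purely hypothetical and not part of this PR: chooseStrategy, the strategy constants and maxDiffs are made-up names, and the threshold value would have to come from benchmarks.

```go
package main

import "fmt"

type strategy int

const (
	fullRecompute strategy = iota
	diffUpdate
	noWork
)

// chooseStrategy is a hypothetical heuristic, not part of this PR. maxDiffs
// is an assumed cut-off above which a full recomputation is taken to be
// cheaper than applying the diffs one by one.
func chooseStrategy(hasCommitment bool, changed, maxDiffs int) strategy {
	switch {
	case !hasCommitment:
		return fullRecompute // nothing to diff against
	case changed == 0:
		return noWork
	case changed > maxDiffs:
		return fullRecompute
	default:
		return diffUpdate
	}
}

func main() {
	const maxDiffs = 64 // made-up value, for illustration only

	fmt.Println(chooseStrategy(false, 10, maxDiffs) == fullRecompute) // true
	fmt.Println(chooseStrategy(true, 3, maxDiffs) == diffUpdate)      // true
	fmt.Println(chooseStrategy(true, 256, maxDiffs) == fullRecompute) // true
}
```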

// Initialize the commitment with the extension tree
// marker and the stem.
count := 0
c1polyp := frPool.Get().(*[]Fr)
c1poly := *c1polyp
defer func() {
for i := 0; i < 256; i++ {
c1poly[i] = Fr{}
}
frPool.Put(c1polyp)
}()

count = fillSuffixTreePoly(c1poly, leaf.values[:128])
leaf.c1 = cfg.CommitToPoly(c1poly, 256-count)

for i := 0; i < 256; i++ {
c1poly[i] = Fr{}
}
count = fillSuffixTreePoly(c1poly, leaf.values[128:])
leaf.c2 = cfg.CommitToPoly(c1poly, 256-count)

for i := 0; i < 256; i++ {
c1poly[i] = Fr{}
}
c1poly[0].SetUint64(1)
StemFromBytes(&c1poly[1], leaf.stem)

toFrMultiple([]*Fr{&c1poly[2], &c1poly[3]}, []*Point{leaf.c1, leaf.c2})
leaf.commitment = cfg.CommitToPoly(c1poly, 252)

} else if len(leaf.cow) != 0 {
// If we already have a calculated commitment and some leaf values were touched, we do a diff update.
var c1, c2 *Point
var old1, old2 *Fr
for i, oldValue := range leaf.cow {
if !bytes.Equal(oldValue, leaf.values[i]) {
if i < 128 {
if c1 == nil {
c1, old1 = leaf.getOldCn(i)
}
leaf.updateCn(i, oldValue, c1)
} else {
if c2 == nil {
c2, old2 = leaf.getOldCn(i)
}
leaf.updateCn(i, oldValue, c2)
}
}
}

if c1 != nil {
leaf.updateC(0, c1, old1)
}
if c2 != nil {
leaf.updateC(128, c2, old2)
}
leaf.cow = nil
Comment on lines +980 to +1006
Collaborator Author

Now we have the new case. Here we know that leaf.commitment != nil (so we have already done Commit(...) on this leaf node), and there are entries in leaf.cow, which means that values were changed.

What we do here is similar to what we did before for diff-updating. The only "meaningful" change is that we pass oldValue to updateCn(...) instead of the new value (as explained in a previous comment). This is only because leaf.values now holds the new values and leaf.cow the old ones.

In L1143 (leaf.cow = nil) we clean up the leaf.cow map to reclaim some memory, and allow it to be created again if values in this leaf change again.
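
The shape of that loop can be sketched in isolation: entries are split by half (indexes below 128 versus the rest), entries whose value ended up unchanged are skipped, and each half is flagged at most once. changedHalves below is a hypothetical helper for illustration, not a function in the PR. Go does not specify map iteration order, which is harmless here because the updates are additive.

```go
package main

import (
	"bytes"
	"fmt"
)

// changedHalves is a hypothetical helper. It reports which halves of a
// 256-slot leaf contain at least one slot whose value really changed, where
// cow maps a slot index to its previously committed value.
func changedHalves(cow map[byte][]byte, values [][]byte) (first, second bool) {
	for i, old := range cow {
		if bytes.Equal(old, values[i]) {
			continue // written back to the committed value: nothing to update
		}
		if i < 128 {
			first = true
		} else {
			second = true
		}
	}
	return first, second
}

func main() {
	values := make([][]byte, 256)
	values[1] = []byte{2}
	values[2] = []byte{7}
	values[200] = []byte{9}

	cow := map[byte][]byte{
		1:   {1}, // changed from 1 to 2
		2:   {7}, // touched, but restored to the committed value
		200: {9}, // touched, but restored to the committed value
	}
	fmt.Println(changedHalves(cow, values)) // true false
}
```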

}

return leaf.commitment
}

// fillSuffixTreePoly takes one of the two suffix tree and
14 changes: 10 additions & 4 deletions tree_test.go
@@ -33,11 +33,17 @@ import (
"errors"
"fmt"
mRand "math/rand"
"os"
"sort"
"testing"
"time"
)

func TestMain(m *testing.M) {
_ = GetConfig()
os.Exit(m.Run())
}
Comment on lines +42 to +45
Collaborator Author

Instead of calling GetConfig() in each test, we can make it part of the startup process before the tests run, so we don't have to repeat ourselves.

Member

I don't think that is necessary: once the first test is called, it will initialize the cfg variable for all subsequent tests. Not sure what you are trying to achieve here?

Collaborator Author

To avoid adding GetConfig() to each test.

You can't rely on doing that only in the first test of the file: if you later run go test ./... -run=TestSomeParticularOne, it will panic if that test isn't the first one in the file. So you would have to call GetConfig() in every test, which we can avoid by doing it in TestMain().

Does that make sense to you?


// a 32 byte value, as expected in the tree structure
var testValue = []byte("0123456789abcdef0123456789abcdef")

@@ -106,7 +112,6 @@ func TestInsertTwoLeavesLastLevel(t *testing.T) {
if !bytes.Equal(leaf.values[0], testValue) {
t.Fatalf("did not find correct value in trie %x != %x", testValue, leaf.values[0])
}

}

func TestGetTwoLeaves(t *testing.T) {
@@ -413,6 +418,7 @@ func TestDeleteUnequalPath(t *testing.T) {
t.Fatalf("didn't catch the deletion of non-existing key, err =%v", err)
}
}

func TestDeleteResolve(t *testing.T) {
key1, _ := hex.DecodeString("0105000000000000000000000000000000000000000000000000000000000000")
key2, _ := hex.DecodeString("0107000000000000000000000000000000000000000000000000000000000000")
@@ -719,7 +725,7 @@ func isLeafEqual(a, b *LeafNode) bool {

func TestGetResolveFromHash(t *testing.T) {
var count uint
var dummyError = errors.New("dummy")
dummyError := errors.New("dummy")
var serialized []byte
getter := func([]byte) ([]byte, error) {
count++
@@ -813,7 +819,7 @@ func TestInsertIntoHashedNode(t *testing.T) {
t.Fatalf("error detecting a decoding error after resolution: %v", err)
}

var randomResolverError = errors.New("'clef' was mispronounced")
randomResolverError := errors.New("'clef' was mispronounced")
// Check that the proper error is raised if the resolver returns an error
erroringResolver := func(h []byte) ([]byte, error) {
return nil, randomResolverError
@@ -879,9 +885,9 @@ func TestLeafToCommsLessThan16(*testing.T) {
}

func TestGetProofItemsNoPoaIfStemPresent(t *testing.T) {

root := New()
root.Insert(ffx32KeyTest, zeroKeyTest, nil)
root.Commit()
Collaborator Author

It's now quite important to always call Commit() after making changes, since we're trying to be "as lazy as possible" and avoid premature work when values keep changing before we're ready to commit.


// insert two keys that differ from the inserted stem
// by one byte.