perf(gnolang): use slice not map for Attributes.data per usage performance #3437
base: master
Noticed while profiling stdlibs/bytes that a ton of memory was being used in maps. That follows from the conventional CS 101 wisdom that maps, with O(1) lookups, insertions and deletions, beat the O(n) performance of slices. When n is small, though, the memory bloat is not worth it and we can use slices instead, as evidenced by the profiles below, which show a perceptible 30% reduction in RAM.

* Before:

```shell
Showing nodes accounting for 92.90MB, 83.87% of 110.76MB total
Dropped 51 nodes (cum <= 0.55MB)
Showing top 10 nodes out of 123
      flat  flat%   sum%        cum   cum%
   47.37MB 42.77% 42.77%    47.37MB 42.77%  internal/runtime/maps.newarray
   10.50MB  9.48% 52.25%    10.50MB  9.48%  internal/runtime/maps.NewEmptyMap
       8MB  7.22% 59.47%        8MB  7.22%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*StaticBlock).InitStaticBlock
    7.51MB  6.78% 66.25%    13.03MB 11.76%  github.com/gnolang/gno/gnovm/pkg/gnolang.Go2Gno
    6.02MB  5.43% 71.68%    10.73MB  9.68%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*defaultStore).SetObject
       4MB  3.61% 75.29%        4MB  3.61%  github.com/gnolang/gno/gnovm/pkg/gnolang.NewBlock
    3.47MB  3.13% 78.43%     3.47MB  3.13%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*Allocator).NewDataArray
    2.52MB  2.27% 80.70%     3.52MB  3.18%  github.com/gnolang/gno/gnovm/pkg/gnolang.toKeyValueExprs
       2MB  1.81% 82.51%        2MB  1.81%  runtime.allocm
    1.51MB  1.36% 83.87%     1.51MB  1.36%  runtime/pprof.(*profMap).lookup
```

```shell
Showing nodes accounting for 47.37MB, 42.77% of 110.76MB total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                           47.37MB   100% |   internal/runtime/maps.newGroups
   47.37MB 42.77% 42.77%    47.37MB 42.77%                | internal/runtime/maps.newarray
----------------------------------------------------------+-------------
                                           32.01MB 78.05% |   github.com/gnolang/gno/gnovm/pkg/gnolang.preprocess1.func1
                                               7MB 17.07% |   github.com/gnolang/gno/gnovm/pkg/gnolang.evalConst (inline)
                                            1.50MB  3.66% |   github.com/gnolang/gno/gnovm/pkg/gnolang.constType (inline)
                                            0.50MB  1.22% |   github.com/gnolang/gno/gnovm/pkg/gnolang.tryPredefine.func1
         0     0% 42.77%    41.01MB 37.03%                | github.com/gnolang/gno/gnovm/pkg/gnolang.(*Attributes).SetAttribute
                                           41.01MB   100% |   runtime.mapassign_faststr
----------------------------------------------------------+-------------
                                            4.50MB   100% |   github.com/gnolang/gno/gnovm/pkg/test.(*TestOptions).runTestFiles
         0     0% 42.77%     4.50MB  4.06%                | github.com/gnolang/gno/gnovm/pkg/gnolang.(*Machine).RunFiles
                                            4.50MB   100% |   github.com/gnolang/gno/gnovm/pkg/gnolang.(*Machine).runFileDecls
```

* After:

```shell
Showing nodes accounting for 61.99MB, 73.12% of 84.78MB total
Showing top 10 nodes out of 196
      flat  flat%   sum%        cum   cum%
   19.50MB 23.00% 23.00%    19.50MB 23.00%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*Attributes).SetAttribute
   12.52MB 14.76% 37.77%    18.02MB 21.26%  github.com/gnolang/gno/gnovm/pkg/gnolang.Go2Gno
    7.58MB  8.94% 46.70%     9.15MB 10.79%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*defaultStore).SetObject
       5MB  5.90% 52.60%        5MB  5.90%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*StaticBlock).InitStaticBlock
    3.47MB  4.09% 56.69%     3.47MB  4.09%  github.com/gnolang/gno/gnovm/pkg/gnolang.(*Allocator).NewDataArray
       3MB  3.54% 60.24%        3MB  3.54%  github.com/gnolang/gno/gnovm/pkg/gnolang.NewBlock
       3MB  3.54% 63.77%        3MB  3.54%  github.com/gnolang/gno/gnovm/pkg/gnolang.Nx (inline)
    2.77MB  3.26% 67.04%     2.77MB  3.26%  bytes.growSlice
    2.65MB  3.12% 70.16%     2.65MB  3.12%  internal/runtime/maps.newarray
    2.50MB  2.95% 73.12%     2.50MB  2.95%  runtime.allocm
```

Fixes gnolang#3436
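For readers unfamiliar with the trade-off described above, here is a minimal sketch of a slice-backed attribute store with linear lookup. It is an illustration only, not the code in this PR: the `attrKey` type, the `attrKV` layout, and the method bodies are all assumptions made for the example.

```go
package main

import "fmt"

// attrKey is a hypothetical key type used only in this sketch.
type attrKey string

// attrKV holds one key/value pair (assumed layout, not taken from the PR).
type attrKV struct {
	key   attrKey
	value any
}

// Attributes stores its pairs in a slice rather than a map. With only a
// few entries per node, a linear scan is cheap and no map buckets are
// allocated.
type Attributes struct {
	data []attrKV
}

// GetAttribute scans the slice and returns nil when the key is absent.
func (a *Attributes) GetAttribute(key attrKey) any {
	for i := range a.data {
		if a.data[i].key == key {
			return a.data[i].value
		}
	}
	return nil
}

// SetAttribute overwrites an existing key in place or appends a new pair.
func (a *Attributes) SetAttribute(key attrKey, value any) {
	for i := range a.data {
		if a.data[i].key == key {
			a.data[i].value = value
			return
		}
	}
	a.data = append(a.data, attrKV{key: key, value: value})
}

func main() {
	var a Attributes
	a.SetAttribute("color", "red")
	a.SetAttribute("size", 3)
	a.SetAttribute("color", "blue") // overwrites the first entry
	fmt.Println(a.GetAttribute("color"), a.GetAttribute("size"), len(a.data))
	// prints: blue 3 2
}
```

Both operations are O(n), so this layout only pays off while the number of attributes per node stays small, which is the premise of the change.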
I've taken a quick look at the benchmarks on `BenchmarkBenchdata`, and there seems to be a minor performance drop:
goos: linux
goarch: amd64
pkg: github.com/gnolang/gno/gnovm/pkg/gnolang
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
                                  │ master.txt  │              3437.txt               │
                                  │   sec/op    │   sec/op      vs base               │
Benchdata/fib.gno_param:4-16        7.922µ ± 3%   8.155µ ±  5%       ~ (p=0.132 n=6)
Benchdata/fib.gno_param:8-16        56.74µ ± 4%   58.82µ ±  5%       ~ (p=0.180 n=6)
Benchdata/fib.gno_param:16-16       2.709m ± 5%   2.819m ±  4%  +4.06% (p=0.026 n=6)
Benchdata/loop.gno-16               70.94n ± 2%   70.28n ± 69%       ~ (p=0.515 n=6)
Benchdata/matrix.gno_param:3-16     170.5µ ± 6%   175.4µ ±  7%       ~ (p=0.589 n=6)
Benchdata/matrix.gno_param:4-16     449.2µ ± 2%   445.6µ ±  2%       ~ (p=0.310 n=6)
Benchdata/matrix.gno_param:5-16     1.613m ± 3%   1.663m ±  4%       ~ (p=0.093 n=6)
Benchdata/matrix.gno_param:6-16     8.292m ± 3%   8.400m ±  5%       ~ (p=0.394 n=6)
geomean                             131.3µ        134.0µ        +2.01%
(The benchmarks were run very quickly, just to get a rough idea.)
The optimization is good and desirable on the memory side, but I'd like it to come at no CPU cost.
Some ideas for optimization:
- Maybe we can make the slice an `[]attrKV`, so that iteration in `getAttribute` doesn't have to do a pointer indirection for each key.
- ATTR_PREPROCESSED and ATTR_PREDEFINED can only be `true` if set; maybe these could be just an unexported `byte` value with a bitmap.
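The bitmap idea in the second bullet could look roughly like the sketch below. Every name here (`attrFlags`, `flagPreprocessed`, `flagPredefined`, `flaggedAttrs`) is hypothetical and chosen for illustration; the sketch only assumes, as the comment says, that these two attributes are either unset or `true`.

```go
package main

import "fmt"

// attrFlags packs boolean attributes into one byte (hypothetical type).
type attrFlags byte

const (
	// flagPreprocessed stands in for ATTR_PREPROCESSED in this sketch.
	flagPreprocessed attrFlags = 1 << iota
	// flagPredefined stands in for ATTR_PREDEFINED in this sketch.
	flagPredefined
)

// flaggedAttrs keeps the boolean attributes in an unexported byte, so
// setting or testing them needs no map or slice entry at all.
type flaggedAttrs struct {
	flags attrFlags
}

// set turns the given flag on; these flags are never cleared once set.
func (a *flaggedAttrs) set(f attrFlags) { a.flags |= f }

// has reports whether the given flag is on.
func (a *flaggedAttrs) has(f attrFlags) bool { return a.flags&f != 0 }

func main() {
	var a flaggedAttrs
	a.set(flagPreprocessed)
	fmt.Println(a.has(flagPreprocessed), a.has(flagPredefined))
	// prints: true false
}
```

A byte leaves room for eight such flags; attributes that carry a real value would still need the key/value storage discussed in the first bullet.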
TL;DR: Despite the reduction in memory consumption, preliminary direct benchmarks show a small increase in CPU time, on the order of microseconds.
Deep dive
Noticed while profiling stdlibs/bytes that a ton of memory was being used in maps. That follows from the conventional CS 101 wisdom that maps, with O(1) lookups, insertions and deletions, beat the O(n) performance of slices. When n is small, though, the memory bloat is not worth it and we can use slices instead; the before and after heap profiles quoted in the commit message above show a perceptible 30% reduction in RAM.
Fixes #3436