-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocate less in Fortuna #188
Conversation
Instead, use a temporary buffer. Contradicts mirage#186
Thank you @hannesm ! I can confirm that with this PR I no more have 10's of 2B allocation per second 🚀 So I'm in favor to merge it as #186 is not merged yet. |
Let's merge this, and get data race freedom in a later release.. |
CHANGES: * mirage-crypto, mirage-crypto-rng{,lwt,mirage}: support CL.EXE compiler (mirage/mirage-crypto#137 @jonahbeckford) - mirage-crypto-pk not yet due to gmp dependency, mirage-crypto-ec doesn't pass testsuite * mirage-crypto-ec: use simpler square root for ed25519 - saving 3 multiplications and 2 squarings, details https://mailarchive.ietf.org/arch/msg/cfrg/qlKpMBqxXZYmDpXXIx6LO3Oznv4/ (mirage/mirage-crypto#196 @hannesm) * mirage-crypto-ec: use sliding window method with pre-computed calues of multiples of the generator point for NIST curves, speedup around 4x for P-256 sign (mirage/mirage-crypto#191 @Firobe, review @palainp @hannesm) * mirage-crypto-ec: documentation: warn about power timing analysis on `k` in Dsa.sign (mirage/mirage-crypto#195 @hannesm, as proposed by @edwintorok) * mirage-crypto-ec: replace internal Cstruct.t by string (speedup up to 2.5x) (mirage/mirage-crypto#146 @dinosaure @hannesm @reynir, review @Firobe @palainp @hannesm @reynir) * bench/speed: add EC (ECDSA & EdDSA generate/sign/verify, ECDH secret/share) operations (mirage/mirage-crypto#192 @hannesm) * mirage-crypto-rng: use rdtime instead of rdcycle on RISC-V (rdcycle is privileged since Linux kernel 6.6) (mirage/mirage-crypto#194 @AdrianBunk, review by @edwintorok) * mirage-crypto-rng: support Loongarch (mirage/mirage-crypto#190 @fangyaling, review @loongson-zn) * mirage-crypto-rng: support NetBSD (mirage/mirage-crypto#189 @drchrispinnock) * mirage-crypto-rng: allocate less in Fortuna when feeding (mirage/mirage-crypto#188 @hannesm, reported by @palainp) * mirage-crypto-ec: avoid mirage-crypto-pk and asn1-combinators test dependency (instead, craft our own asn.1 decoder -- mirage/mirage-crypto#200 @hannesm) ### Performance differences between v0.11.2 and v0.11.3 and OpenSSL The overall result is promising: P-256 sign operation improved 9.4 times, but is still a 4.9 times slower than OpenSSL. Numbers in operations per second (apart from speedup, which is a factor v0.11.3 / v0.11.2), gathered on a Intel i7-5600U CPU 2.60GHz using FreeBSD 14.0, OCaml 4.14.1, and OpenSSL 3.0.12. #### P224 | op | v0.11.2 | v0.11.3 | speedup | OpenSSL | |--------|---------|---------|---------|---------| | gen | 1160 | 20609 | 17.8 | | | sign | 931 | 8169 | 8.8 | 21319 | | verify | 328 | 1606 | 4.9 | 10719 | | dh-sec | 1011 | 12595 | 12.5 | | | dh-kex | 992 | 2021 | 2.0 | 16691 | #### P256 | op | v0.11.2 | v0.11.3 | speedup | OpenSSL | |--------|---------|---------|---------|---------| | gen | 990 | 19365 | 19.6 | | | sign | 792 | 7436 | 9.4 | 36182 | | verify | 303 | 1488 | 4.9 | 13383 | | dh-sec | 875 | 11508 | 13.2 | | | dh-kex | 895 | 1861 | 2.1 | 17742 | #### P384 | op | v0.11.2 | v0.11.3 | speedup | OpenSSL | |--------|---------|---------|---------|---------| | gen | 474 | 6703 | 14.1 | | | sign | 349 | 3061 | 8.8 | 900 | | verify | 147 | 544 | 3.7 | 1062 | | dh-sec | 378 | 4405 | 11.7 | | | dh-kex | 433 | 673 | 1.6 | 973 | #### P521 | op | v0.11.2 | v0.11.3 | speedup | OpenSSL | |--------|---------|---------|---------|---------| | gen | 185 | 1996 | 10.8 | | | sign | 137 | 438 | 3.2 | 2737 | | verify | 66 | 211 | 3.2 | 1354 | | dh-sec | 180 | 1535 | 8.5 | | | dh-kex | 201 | 268 | 1.3 | 2207 | #### 25519 | op | v0.11.2 | v0.11.3 | speedup | OpenSSL | |--------|---------|---------|---------|---------| | gen | 23271 | 22345 | 1.0 | | | sign | 11228 | 10985 | 1.0 | 21794 | | verify | 8149 | 8029 | 1.0 | 7729 | | dh-sec | 14075 | 13968 | 1.0 | | | dh-kex | 13487 | 14079 | 1.0 | 24824 |
@palainp observed lots of 2 byte allocations (with malloc, via bigarray). The root cause is the
add
function in our Fortuna module.This PR uses instead a global pre-allocated buffer. This is likely fine for OCaml 4, but will be troublesome (a data race) when OCaml 5 (see #186) is considered, or any other multi-threading preemption setup.
It is not 100% clear how to move forward. One part is to document "mirage-crypto is not SMP safe", and if someone wants to use it in such a system, they should guard functions with mutexes.
What does the way ahead mean? We have observed that using "cstruct.t" may have been not the wisest choice from early on, and settling on bytes/string would be more wise (in terms of GC friendlyness). We also observe that a "domain-local variable" (i.e.
__thread
in C) is missing in OCaml AFAICT. Keeping performance in mind, maybe we should merge this as an intermediate stop-gap solution, then move on revising the API to use string/bytes (at least internally), and only then figure out a solution for the multicore stuff.In related news, there's ongoing work for thread-local storage in OCaml 5 (ocaml/ocaml#12724 ocaml/ocaml#12889 ocaml/ocaml#12677) -- looks like 5.3 may have such a thing. Considering my time and energy, maybe mark mirage-crypto* as not available for OCaml 5 (until there's a thread local storage story and we fix the code) is the way to go? WDYT?