
Compare with reed-solomon-16 #40

Open
ordian opened this issue Jan 17, 2024 · 1 comment

Comments

@ordian
Member

ordian commented Jan 17, 2024

I just came across https://github.com/malaire/reed-solomon-16 which was written 3 years ago.
It doesn't have any unsafe or SIMD code and seems faster on a laptop than reed-solomon-novelpoly. Furthermore, there's an open PR with an avx2 backend, which claims to make it >10x faster. It was published in https://crates.io/crates/reed-solomon-simd.

cc @alindima

@burdges
Contributor

burdges commented Jan 17, 2024

Yes, the supplied `cargo run --release --example quick-comparison` benchmark says encoding runs twice as fast. It's only 20% or 30% faster for decoding, but that's less important for us, given the systematic chunks optimization.
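For context on why decoding speed matters less here: with a systematic code, the first k shards are the payload verbatim, so when they all arrive, recovery is plain concatenation and the FFT-based decoder never runs. A hedged sketch of that fast path (names are illustrative, not the actual erasure-coding API):

```rust
/// Illustrative only: shards 0..k hold the original data verbatim in a
/// systematic code. If every data shard is present, "decoding" is just
/// concatenation; a missing data shard would fall back to real recovery
/// (not shown here).
fn fast_path_reassemble(shards: &[Option<Vec<u8>>], k: usize) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for shard in shards.get(..k)? {
        // Any missing data shard means we must take the slow decode path.
        out.extend_from_slice(shard.as_deref()?);
    }
    Some(out)
}

fn main() {
    // Two data shards present, one parity shard lost: fast path succeeds.
    let shards = vec![Some(b"hel".to_vec()), Some(b"lo!".to_vec()), None];
    assert_eq!(fast_path_reassemble(&shards, 2), Some(b"hello!".to_vec()));

    // A data shard missing: fast path bails out.
    let shards2 = vec![None, Some(b"lo!".to_vec()), Some(vec![0u8; 3])];
    assert_eq!(fast_path_reassemble(&shards2, 2), None);
}
```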

Linux laptop with i7 11th Gen @ 2.80GHz:

```
4096:4096 (1 kiB)
> reed-solomon-16                  0         48841        104440
> reed-solomon-novelpoly           0        106629        163039

32768:32768 (1 kiB)
> reed-solomon-16                  0        498662       1054801
> reed-solomon-novelpoly           0       1143171       1498802
```

Apple M2:

```
4096:4096 (1 kiB)
> reed-solomon-16                  0         36535         80496
> reed-solomon-novelpoly           0         57448        110243

32768:32768 (1 kiB)
> reed-solomon-16                  0        376412        816519
> reed-solomon-novelpoly           0        614643       1031925
```

It does single log-exp table multiplications like we do, so the table sizes should be the same. We should not have any advantage in cache pressure under diverse load.
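The log-exp multiplication mentioned above can be sketched as follows. This is illustrated in GF(2^8) with the common primitive polynomial 0x11D for brevity; both crates work in GF(2^16), where each table has 65536 entries, but the idea is identical:

```rust
// Sketch of log/exp table multiplication in a small binary field.
// 0x11D = x^8 + x^4 + x^3 + x^2 + 1, a primitive polynomial over GF(2),
// so x (= 2) generates the whole multiplicative group of GF(2^8).
const POLY: u16 = 0x11D;
const ORDER: usize = 255; // order of the multiplicative group of GF(2^8)

/// Build exp and log tables by repeatedly multiplying by the generator x.
fn build_tables() -> ([u8; 256], [u8; 256]) {
    let mut exp = [0u8; 256];
    let mut log = [0u8; 256];
    let mut x: u16 = 1;
    for i in 0..ORDER {
        exp[i] = x as u8;
        log[x as usize] = i as u8;
        x <<= 1;
        if x & 0x100 != 0 {
            x ^= POLY; // reduce modulo the field polynomial
        }
    }
    (exp, log)
}

/// Multiply via a log sum and a single exp lookup: a*b = exp[log a + log b].
fn gf_mul(a: u8, b: u8, exp: &[u8; 256], log: &[u8; 256]) -> u8 {
    if a == 0 || b == 0 {
        return 0; // zero has no logarithm
    }
    let s = log[a as usize] as usize + log[b as usize] as usize;
    exp[s % ORDER]
}

fn main() {
    let (exp, log) = build_tables();
    assert_eq!(gf_mul(2, 3, &exp, &log), 6);       // x * (x+1) = x^2 + x
    assert_eq!(gf_mul(0x80, 2, &exp, &log), 0x1D); // 0x100 reduced by 0x11D
    println!("ok");
}
```

In GF(2^16) the same scheme needs two 128 KiB tables, which is the cache-pressure point above: any such implementation pays roughly the same table footprint.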

All our abstractions exist because we didn't know how best to represent the field. There is much less abstraction in their code. Maybe they had more confidence in this log-exp table approach, so maybe we confuse rustc somehow, or maybe they just translated leopard more faithfully, but..

I suspect https://github.com/malaire/reed-solomon-16/blob/master/src/engine/engine_nosimd.rs has some optimization not present in our FFT, which perhaps better reuses something. At least one or two things in our FFT slightly bothered me, but I never explored it.

It's maybe worth benchmarking against leopard too: https://github.com/catid/leopard
