Skip to content

Commit

Permalink
zk accel layer docs
Browse files Browse the repository at this point in the history
  • Loading branch information
mratsim committed Dec 4, 2023
1 parent cdbf4a6 commit a408f1b
Showing 1 changed file with 235 additions and 0 deletions.
235 changes: 235 additions & 0 deletions docs/zk_accel_layer.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,235 @@
# ZK Accel layer

_Last update: 2023-12-04_

This document is aimed at software and hardware proof system accelerators providers.

It documents the steps taken in Constantine to support the candidate ZAL (Zk Accel API)
proposed in https://github.com/privacy-scaling-explorations/halo2/issues/216
"[RFC] Blackboxing MSM and FFT - Hardware Accel API".

> ``📝`` Note
>
> Constantine is written in the [Nim programming language](https://nim-lang.org/).\
> Nim compiles to machine code through C hence code produced by Nim
> has the C ABI of the compiler/OS used.
>
> In short this guide is applicable for C or C++ libraries.
## Testing

Building Constantine with ZAL supports requires:

- Clang
- LLVM
- Nim
- Rust

All of those are available through your package manager on MacOS or Linux.

Commands for testing:

```
git clone https://github.com/mratsim/constantine
cd constantine
cargo test
cargo bench
```

## API

As of 2023-12-04, the candidate API is proposed in https://github.com/privacy-scaling-explorations/halo2curves/pull/107/files#diff-e746a6a49bd01b5c3241e440d309f5a5e38e583aa9a2eaa2c97419fdc1a3104aR42

```Rust
pub trait ZalEngine {}

pub trait MsmAccel<C: CurveAffine>: ZalEngine {
fn msm(&self, coeffs: &[C::Scalar], base: &[C]) -> C::Curve;
}
```

A similar trait will likely be proposed for FFTs and coset FFTs

Initialization of the ZalEngine is at the moment not enforced.
It is recommended that ZAL backends provide:
- an initialization function:
- either "fn new() -> ZalEngine" for simple libraries
- or a builder pattern for complex initializations
- a shutdown function wrapped in a Drop trait.

The ZalEngine can be a stub type and the shutdown function might be unnecessary
if the ZalEngine uses a global threadpool like Rayon.

Backends might want to add as an option:
- The number of threads (CPU)
- The device(s) to run on (multi-sockets machines, multi-GPUs machines, ...)
- The curve (JIT-compiled backend)
- ...

## C Backend

Wrapping the C code can be autogenerated from Rust bindgen
```sh
cargo install bindgen-cli
```

Constantine uses the following [scripts/gen_rust_bindings.sh](../scripts/gen_rust_bindings.sh)

```sh
bindgen \
include/constantine.h \
-o constantine-rust/constantine-sys/src/bindings.rs \
--default-enum-style rust \
--use-core \
--no-derive-debug \
--default-visibility private \
--enable-function-attribute-detection \
-- -Iinclude
```

The headers for MSM are at [include/curves/bn254_snarks_parallel.h](../include/curves/bn254_snarks_parallel.h)

```C
void ctt_bn254_snarks_g1_prj_multi_scalar_mul_fr_coefs_vartime_parallel(const ctt_threadpool* tp, bn254_snarks_g1_prj* r, const bn254_snarks_fr coefs[], const bn254_snarks_g1_aff points[], size_t len);
```
The Rust wrapping is in [constantine-rust/constantine-zal-halo2kzg/src/lib.rs](../constantine-rust/constantine-zal-halo2kzg/src/lib.rs)
```Rust
use ::core::mem::MaybeUninit;
use constantine_sys::*;
use halo2curves::bn256;
use halo2curves::zal::{MsmAccel, ZalEngine};
use std::mem;
pub struct CttEngine {
ctx: *mut ctt_threadpool,
}
impl CttEngine {
#[inline(always)]
pub fn new(num_threads: usize) -> CttEngine {
let ctx = unsafe { ctt_threadpool_new(num_threads) };
CttEngine { ctx }
}
}
impl Drop for CttEngine {
fn drop(&mut self) {
unsafe { ctt_threadpool_shutdown(self.ctx) }
}
}
impl ZalEngine for CttEngine {}
impl MsmAccel<bn256::G1Affine> for CttEngine {
fn msm(&self, coeffs: &[bn256::Fr], bases: &[bn256::G1Affine]) -> bn256::G1 {
assert_eq!(coeffs.len(), bases.len());
let mut result: MaybeUninit<bn254_snarks_g1_prj> = MaybeUninit::uninit();
unsafe {
ctt_bn254_snarks_g1_prj_multi_scalar_mul_fr_coefs_vartime_parallel(
self.ctx,
result.as_mut_ptr(),
coeffs.as_ptr() as *const bn254_snarks_fr,
bases.as_ptr() as *const bn254_snarks_g1_aff,
bases.len(),
);
mem::transmute::<MaybeUninit<bn254_snarks_g1_prj>, bn256::G1>(result)
}
}
}
```

And testing

```Rust
#[cfg(test)]
mod tests {
use super::*;

use ark_std::{end_timer, start_timer};
use rand_core::OsRng;

use halo2curves::bn256;
use halo2curves::ff::Field;
use halo2curves::group::prime::PrimeCurveAffine;
use halo2curves::group::{Curve, Group};
use halo2curves::msm::best_multiexp;
use halo2curves::zal::MsmAccel;

#[test]
fn t_threadpool() {
let tp = CttEngine::new(4);
drop(tp);
}

fn run_msm_zal(min_k: usize, max_k: usize) {
let points = (0..1 << max_k)
.map(|_| bn256::G1::random(OsRng))
.collect::<Vec<_>>();
let mut affine_points = vec![bn256::G1Affine::identity(); 1 << max_k];
bn256::G1::batch_normalize(&points[..], &mut affine_points[..]);
let points = affine_points;

let scalars = (0..1 << max_k)
.map(|_| bn256::Fr::random(OsRng))
.collect::<Vec<_>>();

for k in min_k..=max_k {
let points = &points[..1 << k];
let scalars = &scalars[..1 << k];

let t0 = start_timer!(|| format!("freestanding msm k={}", k));
let e0 = best_multiexp(scalars, points);
end_timer!(t0);

let engine = CttEngine::new(num_cpus::get());
let t1 = start_timer!(|| format!("CttEngine msm k={}", k));
let e1 = engine.msm(scalars, points);
end_timer!(t1);

assert_eq!(e0, e1);
}
}

#[test]
fn t_msm_zal() {
run_msm_zal(3, 14);
}
}
```

## ABI description

- ZalEngine is an opaque pointer that you can use for anything (or nothing for the builtin H2cEngine)
- coeffs/scalars are:
- field elements in the range [0,r) in **Montgomery** domain, with `r` the prime **order** of the curve.
For A in Montgomery and a in canonical domain we have `A = aR (mod r)`, R being 2²⁵⁶ for BN254
- halo2curves uses 64-bit words, word-endianness is machine-endian (i.e. little endian on x86, ARM, RISC-V)
- Big numbers are split into 64-bit limbs, least significant limb first, i.e. limb-endianness is little endian
- each takes 32 bytes
- points/bases are:
- group elements / short Weierstrass elliptic curve points, in **affine coordinates representation** (**NOT** jacobian)
Coordinates are ordered (x, y)
- each takes 64 bytes
- result is:
- a group element / short Weierstrass elliptic curve points, in **projective coordinates representation** (**NOT** jacobian)

Converting from Jacobian to Projective is explained here [constantine/math/ec_shortweierstrass.nim](../constantine/math/ec_shortweierstrass.nim)

For affine coordinates (x, y):
- Projective coordinates have the form (X, Y, Z) with x = X/Z and y = Y/Z
- Jacobian coordinates have the form (X, Y, Z) with X = X/Z² and y = Y/Z³

```Nim
func projectiveFromJacobian*[F; G](
prj: var ECP_ShortW_Prj[F, G],
jac: ECP_ShortW_Jac[F, G]) {.inline.} =
prj.x.prod(jac.x, jac.z)
prj.y = jac.y
prj.z.square(jac.z)
prj.z *= jac.z
```
Coordinates are ordered (X, Y, Z)
- Result requires 96 bytes

0 comments on commit a408f1b

Please sign in to comment.