Nova Folding and Recursive Proofs

Sep 29, 2024

Special thanks to Lisa and Maciej for the feedback.

The ZK space is evolving rapidly. ZK-EVMs, once thought to be years away from practical use, have quickly become a reality. Similarly, ZK-VMs, recently met with skepticism, now enable smooth transitions from optimistic to ZK rollups with small engineering effort.

One innovation that emerged from this rise is folding schemes. Folding schemes reduce the computational complexity of verifying two proofs by combining or "folding" them into one, easier-to-verify proof.

This article will explore folding schemes, focusing on Nova, which expands on this concept. We'll walk through an example of Nova folding and briefly examine the protocol.

Nova Introduction

In a zk-SNARK1, the prover convinces the verifier that they know the evaluation of a function F without revealing any information about F. The advantage is that the proof generated by the prover is significantly smaller than the computation the verifier needs to perform to verify the results.

Nova is a SNARK that implements Incrementally Verifiable Computation (IVC). But what exactly is IVC? Think of a function F that takes a transaction T as an input, processes it, and then uses the output as the input for the next step—this is how recursive functions work. If you include a verification step that confirms the previous calculation was correct, you get an IVC. This can be visualized as a loop of computation and verification.

The function F’ takes an input T and reaches a state S at each step, which is then fed back into F’. But instead of only computing F’, it also verifies that the computation at the previous step was correct. The function can be stopped at any point and prove the correctness of the entire computation based only on the correctness of the last step. In simpler terms, if the transition from one state to another is correct, all prior transitions are correct, too.

Nova uses an R1CS to represent computation. After Nova, other folding schemes appeared, working with different arithmetization choices, such as Sangria for PLONKish arithmetization or HyperNova, which abstracts the arithmetization entirely.

R1CS converts the circuit into a series of linear equations, represented as matrices that satisfy the equation A(Z) ⋅ B(Z) = C(Z) where ⋅ is an element-wise multiplication and Z is a vector.

Here are the characteristics of Nova compared to other SNARKs:

No trusted setup and no need for pairing-friendly elliptic curves.
Works with any elliptic curves with a hard discrete logarithm problem (easy to verify, hard to compute).
Prover time scales linearly with F — O(F) complexity.
Constant verifier time.
The proof size is O(log F), growing much slower than F.
Proof generation and verification happen only in the final computation step.

To run an IVC using a standard SNARK like PLONK, proof generation and verification must be repeated at each step, making the process inefficient. Halo and Halo2 address this by introducing accumulation.

In Halo2, the prover generates a proof at each step, but the verification is deferred. The unverified part—where the verifier computes polynomial evaluation at a selected point—is accumulated and performed at the final step of the IVC. This portion, called opening, scales linearly with the circuit size. This is the hard part.

The easy part—where the prover computes a polynomial and a commitment to it—is performed at each step. The polynomials and commitments are then added together. Due to the homomorphic nature of polynomial addition, verification can be deferred by taking a linear combination of all received polynomials and commitments and computing the opening. The polynomial commitment scheme must be additively homomorphic for this to work.

However, in Halo2, the prover must still generate proof at each step. Nova builds on this concept by introducing folding.

Enter Folding

With folding, the easy and hard parts of verification are deferred, and proof generation is only performed at the final step. This is important because proving is the most resource-intensive part, as it involves computing the entire circuit.

The folding scheme is an interactive protocol (can be non-interactive via Fiat-Shamir) between a prover and a verifier. Both hold instances I1 and I2 with the prover additionally holding witnesses W1 and W2. The prover's goal is to convince the verifier that it knows the witnesses W1 and W2 for instances I1 and I2 without revealing the witnesses.

The prover constructs a new instance I and a new witness W such that proving W for I implies knowledge of W2 and W2 for I1 and I2, respectively. An instance I consists of a set of inputs and outputs satisfied by the witness W. In this context, I1 and I2 are two R1CS matrices folded together.

The prover uses witnesses to convince the verifier that they know values corresponding to instances I1 and I2 without revealing the values themselves. In this context, the witness is the solution that satisfies the R1CS representations, ensuring that the equation A(Z) ⋅ B(Z) = C(Z) holds.

Here, Z is a vector comprising the instance I, the witness W, and the constant 1, so Z = (1, I, W).

Naive Implementation

In a naive implementation, the verifier sends randomness to fold instances I1 and I2 to stop the prover from coming up with values that fit the equation but are not correct witnesses. The folding of two instance/witness pairs is defined as

\( Z=Z_1+R \cdot Z_2\)

where Z is a new folded instance, and R is the randomness sent by the verifier.

The folded R1CS equation to be satisfied is:

\(A(Z_1+R \cdot Z_2) \cdot B(Z_1+R \cdot Z_2) = C(Z_1+R \cdot Z_2)\)

Unfortunately, the equation is not satisfied in the naive implementation because R1CS is quadratic and doesn’t support the linear combination introduced by folding. The problems which arise are:

Quadratic Coefficient

The folding expression Z1 + R ⋅ Z2 needs to satisfy:

\(A(Z_1+R \cdot Z_2) \cdot B(Z_1+R\cdot Z_2)=C(Z_1+R\cdot Z_1)\)

But

\(A(Z_1 + R \cdot Z_2) \cdot B(Z_1 + R \cdot Z_2)\)

expands to include the term R².

We need R instead of R² because quadratic equations wouldn’t preserve the linearity of folding.

Mixed Terms

The expression

\(A(Z_1+R\cdot Z_2) \cdot B(Z_1+R\cdot Z_2)\)

expands also to include the term:

\(R⋅(A(Z_1)\cdot B(Z_2)+A(Z_2)\cdot B(Z_1))\)

which should not be present in R1CS, as it doesn’t fit the structure A(Z) ⋅ B(Z) = C(Z)

Again, the reason is the quadratic nature of R1CS vs the linear nature of folding.

Fixed Element:

In R1CS, the witness must store the value 1 in the beginning, i.e., it is [1, x, y] for an equation with two variables.
The 1 is needed to handle constant (non-variable) terms, which can then be multiplied by 1 to keep them intact.
The requirement to have 1 to deal with constants wouldn’t be met in the folded instance because the expression Z1 + R ⋅ Z2 results in 1 becoming 1 + R.

To handle these issues, Nova introduces relaxed R1CS.

Just Relax

In the Relaxed R1CS, there are two new terms:

Scalar u: changes the first component of Z from 1 to u.
Error Vector E: accounts for R² and mixed terms introduced during folding.

The equation for the relaxed, therefore R1CS becomes:

\(A(Z)\cdot B(Z)=u⋅C(Z)+E \)

And vectors Z are now:

\(\left( u_1, I_1, W_1, E_1 \right); \left( u_2, I_2, W_2, E_2 \right).\)

Folding them involves:

New scalar:

\(u=u_1+R⋅u_2\)

New vector:

\(E=E_1+R⋅(A(Z)\cdot B(Z_2)+A(Z_2)\cdot B(Z)-u_1⋅C(Z_2)-u_2⋅C(Z))+R^2⋅E_2 \)

The vector E absorbs the R² and accounts for mixed terms, keeping the linear relationship consistent. Here's how each component of E works:

\(E_1: \text{The original error from the first instance is carried over.}\)

\(R \cdot \left( A(Z_1) \cdot B(Z_2) + A(Z_2) \cdot B(Z_1) \right): \text{ Adds the mixed terms directly to } E.\)

\(-u_1 \cdot C(Z_2) - u_2 \cdot C(Z_1)): \text{ Adjusts for the contributions of } u_1 \text{ and } u_2. \)

\(R^2 \cdot E_2: \text{Accounts for the } R^2 \text{ term, incorporating it into } E.\)

Let’s examine it further through a “by hand” example to visualize these problems and their solutions.

Nova by Hand

We have two instance-witness pairs:

Instance/Witness Pair 1:
\(\text{Instance } Z_1 \text{; } \text{Witness: } W_1 = \begin{bmatrix} 1 \\ 5 \\ 25 \end{bmatrix} \)

Instance/Witness Pair 2:
\(\text{Instance } Z_2 \text{; } \text{Witness: } W_2 = \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix}\)

Calculations for Each Pair

Pair 1

\(\text{Calculate } A_1: A_1 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix} = 3 \)

\(\text{Calculate } B_1: B_1 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 5 \\ 25 \end{bmatrix} = 5 \)

\(\text{Calculate } C_1: C_1 = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 5 \\ 25 \end{bmatrix} = 25 \)

Pair 2

\(\text{Calculate } A_2: A_2 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix} = 3 \)

\(\text{Calculate } B_2: B_2 = \begin{bmatrix} 2 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix} = 2\)

\(\text{Calculate } C_2: C_2 = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix} = 6\)

The idea of folding is to combine two instance-witness pairs into one while preserving the R1CS equation. This proves knowledge of both pairs simultaneously. The verifier sends a random scalar R to the prover to secure the protocol.

Given R = 2, we perform linear combinations as follows:

A = A1 + R ⋅ A2 = 5 + 2 ⋅ 3 = 11
B = B1 + R ⋅ B2 = 5 + 2 ⋅ 2 = 9
C = C1 + R ⋅ C2 = 25 + 2 ⋅ 6 = 37

Verification Check

The verification step checks if the folded instance satisfies the relation:

A(Z) ⋅ B(Z) = C(Z)

However, using R1CS constraints, we get A(Z) ⋅ B(Z) = 99 and C = 37, which does not satisfy C = 99. Therefore, we move to relaxed R1CS:

In relaxed R1CS, we modify the structure by introducing:

Scalar u: To scale the output.
Error vector E: To absorb additional terms like R² and mixed terms.

Relaxed Parameters

u1 = 1 and u2 = 1
E1 = 0 and E2 = 0

These parameters are default in Nova but can and should be adjusted if a particular R1CS representation requires it.

Calculating the Scalar and Error Vector

\(\text{Scalar } u: \quad u = u_1 + R \cdot u_2 = 1 + 2 \cdot 1 = 3\)

\(\text{Error vector } E: \quad E = E_1 + R \cdot (A_1 \cdot B_2 + A_2 \cdot B_1 - u_1 \cdot C_2 - u_2 \cdot C_1) + R^2 \cdot E_2 \)

Breaking it down:

E1 + R = 0 + 2 = 2
A1 ⋅ B2 + A2 ⋅ B1 = 5 ⋅ 2 + 3⋅ 5 = 25
-u1 ⋅ C2 = -1 ⋅ 6 = -6
-u2 ⋅ C1 = -1 ⋅25 = -25
R² ⋅ E2 = 2² ⋅ 0 = 0

Thus: E = 2 ⋅ (25 - 6 - 25) + 0 = -12

Now verify the relaxed R1CS condition:

A ⋅ B = u ⋅ C + E

Calculate:

A ⋅ B = 11 ⋅ 9 = 99
u ⋅ C + E = 3 ⋅ 37 - 12 = 111 + 12 = 99

And voila, the relaxed R1CS equation holds.

The Protocol

The above example presents one of Nova's main contributions: folding enabled by relaxed R1CS. Here's how it extends into an IVC.

A relaxed R1CS is represented by the function F’.
At each step i, the prover folds the current instance I with the accumulated instance and commits to the witness W and cross-terms T of F′ generating the new folded instance and proof π. The cross-terms are defined as:

\(T = A_1 \cdot B_2 + A_2 \cdot B_1 - u_1 \cdot C_2 - u_2 \cdot C_1\)

The verifier holds the last folded instance represented by prover’s commitments to W and T. Upon receiving proof π from the prover, the verifier folds the instance and proof together.
The prover sends the commitments to W and T and their plaintext values.
The verifier ensures that the folding was performed correctly and that commitments represent F′.

This implementation, however, has two problems:

The proof size grows linearly with the number of steps due to the accumulating cross-terms in T from the entire incremental F′ execution. This makes such a scheme impractical for real-world applications like rollups.
The zero-knowledge property is not preserved since the prover must reveal the witness to the verifier at the final step.

This is where the IVC meets zkSNARKs, which address both issues.

To make a SNARK work, a polynomial commitment scheme is needed. In the original implementation of Nova, similarly to Bulletproofs, this is Pedersen commitments combined with the inner product argument. These commitments are additively homomorphic, allowing folding without revealing the underlying values.

Nova also uses Spartan (another SNARK) as a wrapper for succinct verification of folded proofs. Spartan does not require a trusted setup.

Wrap-up

Nova was the first implementation of a folding scheme. It is designed to work as an IVC represented as a (relaxed) R1CS: A ⋅ B = u ⋅ C + E, allowing the folding operation Z = Z1 + R ⋅ Z2 to work.

It doesn’t require a trusted setup and offers efficiency at proving time and verification costs.

After Nova, other folding schemes followed, introducing further optimizations::

Sangria For PLONKish arithmetization.
SuperNova: For a generalized F, it can be different at each iteration.
HyperNova: Folding with a sum-check protocol; generalizes arithmetization (CCS) for PLONKish, R1CS, or AIR.

That wraps up our journey through Nova folding. Stay tuned for more articles on ZK and blockchain infrastructure!

Sources:

Veridise: Nova & Folding
Lisa Akselrod: An incomplete guide to Folding
Srinath Setty: Nova: Recursive Zero-Knowledge Arguments from Folding Schemes
Nicolas Mohnblatt: Sangria is relaxed PLONK a Nova-like folding scheme for PLONK
Nova Paper

The folding scheme isn't a SNARK by itself, but when combined with its implementation as an IVC and the prover-verifier protocol, Nova is considered a SNARK (as described in the paper).

0xrafal

Discussion about this post