Basics of zk-SNARKs

Feb 10, 2023

Are you searching for a way to understand zero-knowledge proofs from a technical perspective, but finding existing resources either too shallow or too convoluted? Look no further! This article is for those who, like myself, are not math wizzes, but still want to dive as deep as possible into the topic. So buckle up, and let's start with the basics.

Zero-Knowledge Proofs

Zero-knowledge proofs (ZKPs) were first brought to light in 1989 with the publication of "The Knowledge Complexity of Interactive Proof Systems" by Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The concept received a lot of attention in the late 1990s, with the introduction of non-interactive ZKPs in "Succinct Non-Interactive Zero-Knowledge for a von Neumann Architecture" by Amit Sahai and Salil Vadhan. Despite its initial buzz, the use of ZKPs slowed down over the years, but have recently experienced a revival, especially in the context of blockchain technology. Early on, Zcash embraced the technology and became one of the first cryptocurrencies to implement ZKPs in 2014. Fast forward to today, and the crypto space is filled with new projects that utilize zero-knowledge proofs to enhance scalability and privacy. This innovative use of mathematics and cryptography has made ZKPs a buzzword in the crypto community.

ZKPs are a special kind of cryptographic proof that allows a prover to show to a verifier that they possess certain information, without revealing the information itself. For example, a prover could prove that a number x falls within a certain range [a,b] without revealing x's actual value. Similarly, it could be proved that an element z is part of a set S without revealing the value of the element, or prove that two numbers x and y are equal without revealing their values.

For ZKPs to work, the problem at hand must be in the NP class. NP stands for "nondeterministic polynomial time" and refers to problems that can be verified quickly and easily, even though they may be difficult to solve. Examples of NP problems include checking a hash function for a specific value, determining if a number is prime, or solving a sudoku puzzle.

In the blockchain world, ZKPs are a key tool for achieving two key goals: privacy and scalability. Privacy is the main reason for the adoption of ZKPs in cryptocurrencies like Zcash and decentralized applications like Tornado Cash, where ZKPs allow for transactions to be mixed without revealing information about the parties involved or the amount being transferred.

Scaling is another important reason for the use of ZKPs in blockchain. ZKPs allow for large amounts of data to be encoded into small-sized proofs that are easy to run and verify, making them a perfect solution for blockchains. In fact, Ethereum has introduced ZKPs in EIP-4844, also known as Proto-Danksharding, where the succinctness of ZKPs will be used to allow L2s to post transaction on Ethereum in "data blobs”.

For a ZKP system to be considered truly zero-knowledge, it must meet certain criteria. These include:

Completeness: If the statement is true, the verifier will be convinced of its validity.
Soundness: If the prover is lying, the verifier will not be convinced of the statement's validity.
Zero-knowledge: The proof reveals no information about the statement being proved beyond the fact that it is true.

Zero-knowledge proofs come in several different forms, each with its own method of proving a statement to a verifier. The main types of ZKPs include:

Interactive proofs: This type of proof involves a back-and-forth communication between the prover and verifier, where the prover provides pieces of information about the statement being proved, and the verifier asks questions (challenges) to verify the information. This process repeats until the verifier is confident that the statement is true.
Non-interactive proofs (SNARKs, if the succinctness requirement is met): In this type of proof, the prover sends all the information needed to verify the statement, along with pre-computed answers to potential challenges, to the verifier. This eliminates the need for further interaction and makes the process faster. However, a trusted setup between the parties is required to pre-determine the answers to the challenges.
Transparent proofs: Unlike non-interactive proofs, transparent proofs do not require a trusted setup and thus offer a more secure solution. They eliminate the risk of a malicious agent exploiting the system if the algorithms were to be compromised in the future

SNARKs

SNARK stands for "Succinct Non-Interactive ARgument of Knowledge," meaning that the proof can be generated and verified quickly without any interaction between the prover and verifier. To understand how zk-SNARKs work, it is helpful to start with a basic understanding of arithmetic circuits. Any problem in the NP class can be translated into an arithmetic circuit, which consists of input gates, sum gates, and product gates. Input gates are labeled with either a variable or a value and have no incoming wires. Sum gates are labeled with "+" and perform addition, while product gates are labeled with "*" and perform multiplication.

For example, imagine a prover P wants to prove to a verifier V that he knows the value of x such that the polynomial:

\(p(x) = x^2 + x + 7 = 37\)

The arithmetic circuit for this polynomial would look as follows:

      +
     / \
    +    7
   / \
  *   x
 / \
x   x

The input gates in the circuit are for the variable x and constant 7, and the circuit has three arithmetic gates performing three arithmetic operations. Although real-life arithmetic circuits have far more gates and create much more complicated polynomials, what's important to note is that an arithmetic circuit can always be expressed in the polynomial form.

To create a ZK-SNARK, the next step is to turn the arithmetic circuit into a set of equations using a system called the Rank 1 Constraint System (R1CS). This system allows the representation of the arithmetic circuit as a set of linear equations. For example, the R1CS representation of the polynomial p(x) would look like this:

X = x
A = X^2
B = A + X
C = B + 7

The process of transforming the arithmetic circuit into a set of equations is called "flattening." In this process, X, A, B and C are the variables and the equations represent the operations carried out in the arithmetic circuit. To represent the system of linear equations in R1CS, a set of matrices is used:

s
[1, 5, 37, 25, 30]

A
[0, 1, 0, 0, 0]
[0, 1, 0, 1, 0]
[7, 0, 0, 0, 1]
B
[0, 1, 0, 0, 0]
[1, 0, 0, 0, 0]
[1, 0, 0, 0, 0]
C
[0, 0, 0, 1, 0]
[0, 0, 0, 0, 1]
[0, 0, 1, 0, 0]

The vector s represents the solutions to the operations performed in the linear equations and is presented in the form[1, X, C, A, B].

In the process of calculating the matrices A, B, C we need to ensure that they meet a specific equation: s.A * s.B - s.C = 0, which is the same as s.A * s.B = s.C. We can verify this by taking the dot product of each element in the matrices and adding them up:

A
[1, 5, 37, 25, 30].[0, 1, 0, 0, 0] = 5
[1, 5, 37, 25, 30].[0, 1, 0, 1, 0] = 30 (5+25)
[1, 5, 37, 25, 30].[7, 0, 0, 0, 1] = 37 (30+7)

B
[1, 5, 37, 25, 30].[0, 1, 0, 0, 0] = 5
[1, 5, 37, 25, 30].[1, 0, 0, 0, 0] = 1
[1, 5, 37, 25, 30].[1, 0, 0, 1, 0] = 1

C
[1, 5, 37, 25, 30].[0, 0, 0, 1, 0] = 25
[1, 5, 37, 25, 30].[0, 0, 0, 0, 1] = 30
[1, 5, 37, 25, 30].[0, 0, 1, 0, 0] = 37

Our calculation shows that s.A * s.B = s.C is met, since 5*5=25, 30*1=30, 37*1=37 This means that the prover P knows the right answer and that our arithmetic circuit is properly constructed. The purpose of the R1CS representation is to ensure that our arithmetic circuit produces the correct answer (also known as witness), and lays out all the conditions for our statement (constrains).

If you would like to understand the process better, you can use the Python code written by Vitalik to turn your set of equations into an R1CS.

The next step in the ZK-SNARK creation process is to convert the R1CS representation into a Quadratic Arithmetic Program (QAP). This is important because ZK-SNARKs rely on polynomials to prove statements, and while the R1CS matrices may confirm that the calculation is correct and the prover knows the witness, they are not suitable for ZK-SNARK purposes. The transformation from R1CS to QAP will convert the matrices into polynomials, making them suitable for use in a ZK-SNARK.

A polynomial of degree n has at least n + 1 points where it intersects with the coordinate plane. For example, a polynomial of degree 1, like a line, would intersect with the plane at two points. Meanwhile, a polynomial of degree 2, like a parabola, would have three points of intersection. Given that our R1CS matrices have three rows (or that our arithmetic circuits have three gates) we'll be constructing polynomials of the second degree.

We will be using the coordinates 1, 2, 3 for the x-axis, and for each column of each matrix created for the R1CS representation, we will use these values to find the corresponding y-axis coordinates. For example, consider the first column of the first matrix, 0, 0, 7. We will map these values to create the following set of coordinates (1,0), (2,0), (3,7).

Using Lagrange interpolation, we will find a polynomial that satisfies these coordinates. To do this, we create individual polynomials for each point and then combine them to form a final polynomial that passes through all three points. The final polynomial serves as an estimate for the underlying function that generated these points. Lagrange polynomials for coordinates(1,0), (2,0), (3,7) are, respectively:

\(L_1(x) = \frac{(x - 2)(x - 3)}{(1 - 2)(1 - 3)} \times 0 = 0\)

\(L_2(x) = \frac{(x - 1)(x - 3)}{(2 - 1)(2 - 3)} \times 0 = 0\)

\(L_3(x) = \frac{(x - 1)(x - 2)}{(3 - 1)(3 - 2)} \times 7 = \left(\frac{(x - 1)(x - 2)}{2}\right) \times 7\)

To find the Lagrange polynomial for the point (1,0), we first need to create a polynomial that is non-zero at x = 1 and zero at the two other points. This can be done using the formula (x - 2)(x - 3). We then divide this by (1 - 2)(1 - 3)to find the correct value of y. Finally, we multiply the polynomial by the value of y for our point.

We repeat this process for the two other points, and then add up all of the resulting polynomials to get the final Lagrange interpolating polynomial:

\(\begin{align*} p(x) &= L_1(x) + L_2(x) + L_3(x) \\ &= 0 + 0 + \left( \frac{(x - 1)(x - 2)}{2} \times 7 \right) \\ &= \frac{7}{2} \times (x - 1)(x - 2) \\ &= \frac{7}{2} \times (x^2 - 3x + 2) \end{align*}\)

The above result can also be written in the form of another polynomial:

\(\begin{align*} p(x) &= \frac{7}{2} \times (2 - 3x + x^2) \\ &= 7 - 10.5x + 3.5x^2 \end{align*}\)

Which then can be written in the form of a vector: [7, 10.5, 3.5]. We thus have transformed the first row of the matrix A of R1CS (transposed) to a polynomial. When we do the same with the remaining rows, we get the following matrix:

Ap
[7.0, -10.5, 3.5]
[0.0, 1.5, -0.5]
[0.0, 0.0, 0.0]
[-3.0, 4.0, -1.0]
[1.0, -1.5, 0.5]
Bp
[-2.0, 2.5, -0.5]
[3.0, -2.5, 0.5]
[0.0, 0.0, 0.0]
[0.0, 0.0, 0.0]
[0.0, 0.0, 0.0]
Cp
[0.0, 0.0, 0.0]
[0.0, 0.0, 0.0]
[1.0, -1.5, 0.5]
[3.0, -2.5, 0.5]
[-3.0, 4.0, -1.0]

You can use the code written by Vitalik to get such QAP for your SNARK.

Now, our initial statement is represented in the form of polynomials, and soon we will be able to do some cryptography on it.

Recap

Before closing the first part of the article, let’s recap all the steps we went through:

We began by defining our statement, where the prover wants to prove to the verifier that they know the value of x such that the equation x^2 + x + 7 = 37 holds true. This statement needs to be quickly verifiable for us to build a ZK-SNARK for it.
We then created an arithmetic circuit that represented the statement and flattened it by writing each calculation in the circuit as a separate polynomial. The circuits only accept addition and multiplication as possible operations.
From the flattened arithmetic circuit, we created an R1CS representation of the problem. R1CS consists of three matrices that must satisfy the equation: s.A * s.B - s.C = 0.
Finally, we transformed the R1CS to QAP using Lagrange interpolation. QAP is a matrix of polynomials that captures the underlying function of the original statement, with each row representing the coefficients of the polynomials.

To wrap up, our example was simple and straightforward, but in reality, arithmetic circuits can become quite complex, with thousands of gates and wires. Performing manual calculations for these large circuits is not a feasible option. That's why tools such as Circom (Hardware Description Language), or programming languages with compilers such as zoKrates, and Cairo are utilized to build the circuits and perform the complex calculations.

In the next part of the article, we will explore the process of constructing an actual zk-SNARK from the polynomials we derived. We will look at how polynomial commitments are used to verify the set of polynomial equations, and how the final proofs for statements are created. Stay tuned!

0xrafal

Discussion about this post