The Groth16 algorithm enables a quadratic arithmetic program to be computed by a prover over elliptic curve points derived in a trusted setup, and quickly checked by a verifier. It uses auxiliary elliptic curve points from the trusted setup to prevent forged proofs.
We refer to an elliptic curve point belonging to the G1 elliptic curve group as [x]1 and an elliptic curve point belonging to the G2 elliptic curve group as [x]2. A pairing between [x]1 and [x]2 is denoted as [x]1∙[x]2 and produces an element in G12. Variables in bold such as a are vectors, upper case bold letters such as L are matrices, and field elements (sometimes informally referred to as “scalars”) are lower case letters such as d. All arithmetic operations happen in a finite field with a characteristic that equals the order of the elliptic curve group.
Given an Arithmetic Circuit (ZK Circuit), we convert it to a Rank 1 Constraint System (R1CS)La∘Ra=Oa with matrices of dimension n rows and m columns with a witness vector a. Then, we can convert the R1CS to Quadratic Arithmetic Program (QAP) by interpolating the columns of the matrices as y values over the x values [1,2,...,n]. Since L, R, and O have m columns, we will end up with three sets of m polynomials:
u1(x),...,um(x)v1(x),...,vm(x)w1(x),...,wm(x)m polynomials interpolated on the m columns of Lm polynomials interpolated on the m columns of Rm polynomials interpolated on the m columns of O
From this, we can construct a Quadratic Arithmetic Program (QAP):
If a third party creates a structured reference string (srs) via a powers of tau ceremony, then the prover can evaluate sum terms (the ∑aifi(x) terms) in the QAP at a hidden point τ. Let the structured reference strings be computed as follows:
[Ωn−1,Ωn−2,…,Ω2,Ω1,G1][Θn−1,Θn−2,…,Θ2,Θ2,G2][Υn−2,Υn−3,…,Υ1,Υ0]=[τnG1,τn−1G1,…,τG1,G1]=[τnG2,τn−1G2,…,τG2,G2]=[τn−2t(τ)G1,τn−3t(τ)G1,…,τt(τ)G1,t(τ)G1]srs for G1srs for G2srs for h(τ)t(τ)
We refer to f(τ) as a polynomial evaluated on a structured reference string [τdG1,...,τ2G1,τG1,G1] via the inner product:
If the QAP is balanced, then the following equation holds:
[A]1∙[B]2=?[C]1∙G2
Motivation
Simply presenting ([A]1,[B]2,[C]1) is not a convincing argument that the prover knows a such that the QAP is balanced.
The prover can simply invent values a, b, c where ab=c, compute
[A]1[B]2[C]1=aG1=bG2=cG1
and present those as elliptic curve points [A]1, [B]2, [C]1 to the verifier.
Thus, the verifier has no idea if ([A]1,[B]2,[C]1) were the result of a satisfied QAP or not.
We need to force the prover to be honest without introducing too much computational overhead. The first algorithm to accomplish this was “Pinocchio: Nearly Practical Verifiable Computation.” This was usable enough for ZCash to base the first version of their blockchain on it.
However, Groth16 was able to accomplish the same thing in much fewer steps, and the algorithm is still widely in use today, as no algorithm since has produced as efficient an algorithm for the verification step (though other algorithms have eliminated the trusted setup or significantly reduced the amount of work for the prover).
Update for 2024: A paper rather triumphantly titled “Polymath: Groth16 is not the limit” published in Cryptology demonstrates an algorithm that requires fewer verifier steps than Groth16. However, there are no known implementations of the algorithm at this time of writing.
Preventing forgery Part 1: Introducing α and β
An “unsolveable” verification formula
Suppose we update our verification formula to the following:
[A]1∙[B]2=?[D]12+[C]1∙G2
Note that we are using additive notation for the G12 group for convenience.
Here, [D]12 is an element from G12 and has an unknown discrete logarithm.
We now show that it is impossible for a verifier to provide a solution ([A]1,[B]2,[C]1) to this equation, without knowing the discrete logarithm of [D]12.
Attack 1: Forging A and B and deriving C
Suppose the prover randomly selects a’ and b’ to produce [A]1 and [B]2 and tries to derive a value [C’] that is compatible with the verifier’s formula.
[A]1∙[B]2=?[D]12+[C]1∙G2
Knowing the discrete logarithms of [A]1 and [B]2, the malicious prover tries to solve for [C’] by doing
The final line is requires the prover to solve for the discrete log of χ12, so they cannot compute a valid discrete log for [C′]1.
Attack 2: Forging C and deriving A and B
Here the prover picks a random point c′ and computes [C′]1. Because they know c′, they can try to discover a compatible combination of a′ and b′ such that
This requires the prover, given [ζ]12, to come up with an [A]1 and [B]2 that pair to produce [ζ]12.
Similar to the discrete log problem, we rely on unproven cryptographic assumptions that this computation (decomposing an element in G12 into a G1 and G2 element) is infeasible. In this case, the assumption that we cannot decompose [ζ]12 into [A]1 and [B]2 is called the Bilinear Diffie-Hellman Assumption. The interested reader can see a related discussion on the Decisional Diffie-Hellman Assumption.
(Unproven does not mean unreliable. If you can find a way to prove or disprove this assumption, fame and fortune awaits you! In practice, there is no known way to decompose [ζ]12 into [A]1 and [B]2 and it is believed to be computationally infeasible.)
How α and β are used
In practice, Groth16 doesn’t use a term [D]12. Instead, the trusted setup generates two random scalars α and β and publishes the elliptic curve points ([α]1,[β]2) computed as:
[α]1[β]2=αG1=βG2
What we referred to as [D]12 is simply [α]1∙[β]2.
Re-deriving the proving and verification formulas
To make the verification formula [A]1∙[B]2=?[α]1∙[β]2+[C]1∙G2 “solvable”, we need to alter our QAP formula to incorporate α and β.
The prover can compute [A]1 and [B]2 without knowing τ, α, or β. Given the structured reference string (powers of τ) and the elliptic curve points ([α]1,[β]2), the prover computes [A]1 and [B]2 as
Here, aiui(τ) does not mean the prover knows τ. The prover is using the structure reference string [τn−1G1,τn−2G1,…,τG1,G1] to compute ui(τ) for i=1,2,…,m and the G2 srs for for [B]2.
However, it isn’t currently possible to compute [C]1 without knowing α and β. The prover cannot pair [α]1 with ∑aiui(τ) and [β]2 with ∑aivi(τ) because that would create a G12 point, whereas the prover needs a G1 point for [C]1.
Instead, the trusted setup needs to precompute m polynomials for the problematic C term of the expanded QAP.
αi=1∑maivi(τ)+βi=1∑maiui(τ)+i=1∑maiwi(τ)
With some algebraic manipulation, we combine the sum terms into a single sum:
=i=1∑m(αaivi(τ)+βaiui(τ)+aiwi(τ))
and factor out ai:
=i=1∑mai(αvi(τ)+βui(τ)+wi(τ))
The trusted setup can create m polynomials evaluated at τ from the boxed term above, and the prover can use that to compute the sum. The exact details are shown in the next section.
Summary of the algorithm so far
Trusted setup steps
Concretely, the trusted setup computes the following:
α,β,τ[τn−1G1,τn−2G1,…,τG1,G1][τn−1G2,τn−2G2,…,τG2,G2][τn−2t(τ),τn−3t(τ),…,τt(τ),t(τ)][Ψ1]1[Ψ2]1[Ψm]1←random scalars←srs for G1←srs for G2←srs for h(τ)t(τ)=(αv1(τ)+βu1(τ)+w1(τ))G1=(αv2(τ)+βu2(τ)+w2(τ))G1⋮=(αvm(τ)+βum(τ)+wm(τ))G1
Note that only the computation of [C]1 changed – the prover only uses the ai and Ψi terms ℓ+1 to m.
The verifier computes the first ℓ terms of the sum:
[X]1=i=1∑ℓaiΨi
And the verification equation is:
[A]1∙[B]2=?[α]1∙[β]2+[X]1∙G2+[C]1∙G2
Part 2: Separating the public inputs from the private inputs with γ or δ
Forging proofs by misuing Ψi for i≤ℓ
The assumption in the equation above is that the prover is only using Ψℓ+1 to Ψm to compute [C]1, but nothing stops a dishonest prover from using Ψ1 to Ψℓ to compute [C]1, leading to a forged proof.
For example, here is our current verification equation:
[A]1∙[B]2=?[α]1∙[β]2+i=1∑ℓaiΨi+[C]1∙G2
If we expand the C term under the hood, we get the following:
Suppose for example and without loss of generality that a=[1,2,3,4,5] and ℓ=3. In that case, the public part of the witness is [1,2,3] and the private part is [4,5].
However, nothing stops the prover from creating an valid portion of the public witness as [1,2,0] and moving the zeroed out public portion to the private part of the computation as follows:
The equation above is valid, but the witness does not necessarily satisfy the original constraints.
Therefore, we need to prevent the prover from using Ψ1 to Ψℓ as part of the computation of [C]1.
Introducing γ and/or δ
To avoid the problem above, the trusted setup introduces a new scalar : γ or δ to force Ψℓ+1 to Ψm to be separate from Ψ1 to Ψℓ. To do this, the trusted setup divides (multiplies by the modular inverse) the private terms (that constitute [C]1) by δ and/or the public terms (that constitute [X]1, the sum the verifier computes) by γ.
Since the h(τ)t(τ) term is embedded in [C]1, those terms also need to be divided by δ. If either δ and γ have an unknown discrete logarithm, then the forgery described earlier along possible other methods are avoided. This method was used in Zcash’s Sapling’s‑based trusted setups where γ is simply left to G2 and δ is still updated from G2 to a random value at later trusted setup stages.
α,β,τ,γ,δ[τn−1G1,τn−2G1,…,τG1,G1][τn−1G2,τn−2G2,…,τG2,G2][δτn−2t(τ),δτn−3t(τ),…,δτt(τ),δt(τ)][Ψ1]1[Ψ2]1[Ψℓ]1[Ψℓ+1]1[Ψℓ+2]1[Ψm]1←random scalars←srs for G1←srs for G2←srs for h(τ)t(τ)public portion of the witness=γαv1(τ)+βu1(τ)+w1(τ)G1=γαv2(τ)+βu2(τ)+w2(τ)G1⋮=γαvℓ(τ)+βuℓ(τ)+wℓ(τ)G1private portion of the witness=δαvℓ+1(τ)+βuℓ+1(τ)+wℓ+1(τ)G1=δαvℓ+2(τ)+βuℓ+2(τ)+wℓ+2(τ)G1⋮=δαvm(τ)+βum(τ)+wm(τ)G1
And the verifier steps now include pairing by [γ]2 and/or [δ]2 to cancel out the denominators:
[A]1∙[B]2=?[α]1∙[β]2+[X]1∙[γ]2+[C]1∙[δ]2
or
[A]1∙[B]2=?[α]1∙[β]2+[X]1∙[G]2+[C]1∙[δ]2
Part 3: Enforcing true zero knowledge: r and s
Our scheme is not yet truly zero knowledge. If an attacker is able to guess our witness vector (which is possible if there is only a small range of valid inputs, e.g. secret voting from privileged addresses), then they can verify their guess is correct by comparing their constructed proof to the original proof.
As a trivial example, suppose our claim is x1 and x2 are both either 0 or 1. The corresponding arithmetic circuit would be
x1(x1−1)=0x2(x2−1)=0
An attacker only needs to guess four combinations to figure out what the witness is. That is, they guess a witness, generate a proof, and see if their answer matches the original proof.
To prevent guessing, the prover needs to “salt” their proof, and the verification equation needs to be modified to accommodate the salt.
The prover samples two random field elements r and s and adds them to A and B to make the witness unguessable – an attacker would have to guess both the witness and the salts r and s:
To derive the final verification formula, let’s temporarily ignore that we don’t know the discrete logs of the Greek letter terms and compute the left-hand-side of the verification equation AB:
We now separate this equation in to the verifier and prover portions. The boxed terms are the verifier portion, the underbrace terms are the terms that the prover provides:
We are now ready to show the Groth16 algorithm end-to-end. The trusted setup and the verification steps remain unchanged from the previous example where we incorporated γ and δ. Only the prover’s calculation changes to incorporate r and s.
Trusted Setup
α,β,τ,γ,δ[τn−1G1,τn−2G1,…,τG1,G1][τn−1G2,τn−2G2,…,τG2,G2][δτn−2t(τ),δτn−3t(τ),…,δτt(τ),δt(τ)][Ψ1]1[Ψ2]1[Ψℓ]1[Ψℓ+1]1[Ψℓ+2]1[Ψm]1←random scalars←srs for G1←srs for G2←srs for h(τ)t(τ)public portion of the witness=γαv1(τ)+βu1(τ)+w1(τ)G1=γαv2(τ)+βu2(τ)+w2(τ)G1⋮=γαvℓ(τ)+βuℓ(τ)+wℓ(τ)G1private portion of the witness=δαvℓ+1(τ)+βuℓ+1(τ)+wℓ+1(τ)G1=δαvℓ+2(τ)+βuℓ+2(τ)+wℓ+2(τ)G1⋮=δαvm(τ)+βum(τ)+wm(τ)G1
At this point, you have sufficient knowledge to understand the proof verification code in Solidity. Here is Tornado Cash’s proof verification code. The reader is encouraged to read the source code closely. If the reader is comfortable with Solidity assembly programming, then understanding this source code will not be difficult as the variable names are consistent with the ones in this article.
([A]1,[B]2,[C]1), an attacker can compute the point negation of [A]1 and [B]2 and present a new proof as ([A′]1,[B′]2,[C]1) where [A′]1=neg([A]1) and [B′]2=neg([B]2).
To see that [A]1∙[B]2=[A′]1∙[B′]2, consider the following code:
from py_ecc.bn128 import G1, G2, multiply, neg, eq, pairing
# chosen arbitrarily
x = 10
y = 100
A = multiply(G1, x)
B = multiply(G2, y)
A_p = neg(A)
B_p = neg(B)
assert eq(pairing(B, A), pairing(B_p, A_p))
Intuitively, the attacker is multiplying A and B by −1, and (−1)×(−1) cancels itself out in the pairing.
The defense against this attack is described in the following section.
You can see a proof of concept of this attack in this article.
The prover can create an unlimited number of proofs for the same witness
This isn’t a “security issue” per se – it is necessary to achieve Zero Knowledge. However, the application needs a mechanism to track which facts have already been proven and cannot rely on the uniqueness of the proof to achieve that.
Learn more with RareSkills
Our ability to publish material like this free of charge depends on the continued support of our students. Consider signing up for our Zero Knowledge Bootcamp, Web3 Bootcamps, or getting a job on RareTalent.
Originally Published August 31, 2023
Ready to Get Started? Join Thousands of Users Today
Start your free trial now and experience the difference. No credit card required.