Foreword

Welcome to the ZK Jargon Decoder 🕵️

The ZK Jargon Decoder aims to be a dictionary and reference guide for common jargon found in cryptography and the zero-knowledge literature. It is collection of informal and practical definitions. Each term should have a one-liner for quick reference and a more detailed explanation if needed.

The nature of this project implies that our definitions will not be perfectly accurate: some technical details will be omitted, some subtleties will be ignored. However if you feel that any of these definitions overlook important aspects of the terms they clarify please get in touch by email or Twitter:

email: nico@geometry.dev // Twitter: @nico_mnbl

This project is still work in progress, participation and suggestions are always welcome!

License

ZK Jargon Decoder © 2022 by ZK Jargon Decoder contributors is licensed under CC BY-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/

Introduction to ZK Jargon


Part 1: What does it mean to prove?

This article aims to be a gentle introduction to the notions of decision problems, relations, languages and the complexity class. We follow a running example and rephrase it each time we cover a new term.

Decision Problems

In mathematics, a statement is either TRUE or FALSE. However, deciding whether a statement is true or false is not always easy. Consider for example the following statements:

  • Statement 1: the number is even.
  • Statement 2: the number can be factored.

We can quickly check that Statement 1 is TRUE, but what about Statement 2? To our knowledge, the best way to decide Statement 2 is just to try factoring . If we succeed, the statement is TRUE, but we might be trying for a long time, with no guarantee of ever succeeding!

What if we are given additional information? Let’s say we were magically handed (or by luck stumbled upon) the numbers and . We can try to multiply them together and find that indeed . We can now decide Statement 2 ✅.

Notice that with the additional information, we only needed to perform one multiplication; much less work than trying out all the potential factorizations. In a way, this additional information was in fact a proof that Statement 2 is TRUE.

Answer #1 (informal): proving means giving auxiliary information about a statement to decide that it is true.

We can also rephrase this answer in the form of a condition: a provable statement is one that can be easily decided provided the right information. In complexity theory, this corresponds to the class .

Relations

The discussion above is formalised by the notions of relations and languages:

  • A relation is a set of ordered pairs . When talking about relations (provable statements) we refer to the first item, , as an instance, and the second, , as a witness.

Example: factors. Let’s define a relation that we will call . An instance of this relation is an integer, and a witness is an unordered list of integers greater than 1. We will say that the pair is in if and only if contains more than one element and the product of all the elements in is equal to .

For example, the instance has the witness . The instance has multiple witnesses: including , and .

We can now recast Statement 2 in terms of :

  • Statement 2 (with jargon, part 1): there exist a witness such that is in .

As we saw above, finding such a witness ourselves is a lot of work. However, given the candidate witness , we can quickly check that the pair is in .

Languages

Rather than always having to say “the instance has a witness such that ”, we use the shorter form “the instance is satisfiable”. Note that the relation which satisfies isn’t explicitly stated and must be made clear from the context.

We can collect all the satisfiable instances of a relation in a set:

  • the language defined by a relation is the set of all satisfiable instances for . We often write or .

Example: factors. As we have seen, and are in . On the other hand, prime numbers such as or cannot be expressed as a product of integers. Therefore, they are not in .

Once again we can rephrase Statement 2 in terms of the language defined by :

  • Statement 2 (with jargon, part 2): is in .

Answer #2 (same as before, in jargon): proving means giving evidence that an instance is indeed in the language defined by some relation .

More examples. As an exercise, try to cast a Sudoku as a relation and identify the instance, witness and language. What about the relation , defined for some function as all pairs such that ? What happens when we pick to be (i) a polynomial or (ii) a hash function?


Up next: probabilistic and interactive proofs

So far we have only considered the trivial proof system: sending the witness.

In some cases, this is not desirable. Sometimes the witness is private and should remain so. Other times, the witness is just too big to be sent or for the verifier to process. These cases motivate the need for more elaborate (and powerful) proof systems. We discuss these systems more formally in the next article.

Introduction to ZK Jargon


Part 2: Probabilistic and Interactive Proofs

In the previous article, we covered what “proving” means, established some necessary vocabulary and finally discussed the trivial proof system: sending the witness. As discussed, this is not always desirable. Sometimes the witness is private and should remain so. Other times, the witness is just too big to be sent or for the Verifier to process.

In this article, we look at two techniques to build more elaborate proof systems:

  • the first is to allow the Verifier to “ask multiple questions”, i.e. to interact with the Prover;
  • the second is to allow the Verifier to not read the whole proof.

Naturally, if the proof is shorter than the witness or it is some randomized version of it, the Verifier might sometimes accept proofs for FALSE statements. This is acceptable (!), as long as it only happens with a small probability. We call this probability the soundness error (see Soundness).

1. Asking multiple questions

The first technique can be understood as “asking multiple questions”. Rather than sending the witness as a single message, the Prover and Verifier can interact over multiple rounds. This forces the Prover to stay consistent: once something is said, it cannot be unsaid; and any contradiction between the Prover’s messages can be used against him.

Proof systems that use this paradigm are known as interactive proofs (IP).

Interactive proof

In practice, we much prefer non-interactive proofs. Luckily, if an IP is such that the Verifier’s messages are all sampled at random from a uniform distribution, there are techniques to transform the IP into a non-interactive proof (e.g., the Fiat-Shamir Transform).

2. Not reading the full proof

The second technique is to not read the full proof. Instead, the Verifier will query the proof at location of her choice. The locations are usually chosen at random after the Prover sent his message.

Proof systems that use this paradigm are known as probabilistically checkable proofs (PCP).

Probabilistic proof

Note that for PCPs to make sense, we need a way for the Prover to send his message, without requiring the Verifier to read it in full. When working on theoretical systems, we allow the Prover to send an oracle for his proof. In this context, an oracle is a black-box that stores the proof string and can be queried at any index. To turn these into real-world systems, we implement the oracle with a commitment scheme.

3. Combining both

Of course, we can combine the two above techniques. The resulting proof systems are known as interactive oracle proofs (IOP).

Interactive oracle proof

Interestingly, IOPs are not more powerful than PCPs. Everything that can be proven by an IOP can also be proven by a PCP. However, they yield more efficient protocols in practice.

Our favorite recipe to build a SNARK is as follows: we start with an IOP, replace the theoretical oracles with commitments, and then remove interaction by using the Fiat-Shamir transform. The SNARK “proof” is a collection of commitments to the Prover’s messages, the values of the IOP messages at the locations queried by the Verifier, and some extra information to convince the Verifier that the values do correspond to the committed strings.

For a formal treatment of these techniques, please see the excellent free online book Building Cryptographic Proofs from Hash Functions by Chiesa and Yogev.

References

See ’’Bibliographic Notes“ in Chiesa and Yogev’s Building Cryptographic Proofs from Hash Functions.


Cartoons created in Excalidraw by Nicolas Mohnblatt, using the Stick Figures collection by Youri Tjang and the Speech Bubbles collection by Oscar Capraro.

Algebraic Holographic Proof (AHP)

An algebraic holographic proof is a interactive proof where the prover sends oracles which are low degree polynomials and can be split into two categories: those that can be processed before the prover-verifier interactions and those that cannot. Closely related to polynomial IOP.


Algebraic Holographic Proofs are first defined in the Marlin paper [CHMMVW20] as a means to separate the information theoretic aspects of SNARKs from the cryptographic aspects. It is an interactive oracle proof with extra properties:

  • algebraic: an honest prover only produces oracles for low degree polynomials (just like in polynomial IOPs)
  • holographic: the verifier does not need to see the proof’s input (e.g. a circuit) but instead has oracle access to an encoding of it.

We use AHPs, polynomial commitment schemes and the Fiat-Shamir heuristic to construct pre-processing SNARKs such as Marlin and PLONK.

AHP or Polynomial IOP? Algrebraic holographic proofs and polynomial interactive oracle proofs are almost equivalent notions. They were developed concurrently in 2019 by separate research groups: the former by the group behind Marlin [CHMMVW20] and the latter by the group behind DARK [BFS20]. While they formalise very similar proof systems, polynomial IOPs are more general in that they do not require holography (as defined above).

References

[BFS20] Bünz, Benedikt, Ben Fisch, and Alan Szepieniec. “Transparent SNARKs from DARK compilers.” In Advances in Cryptology–EUROCRYPT 2020: 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, May 10–14, 2020, Proceedings, Part I 39, pp. 677-706. Springer International Publishing, 2020.

[CHMMVW20] Chiesa, Alessandro, Yuncong Hu, Mary Maller, Pratyush Mishra, Noah Vesely, and Nicholas Ward. “Marlin: Preprocessing zkSNARKs with universal and updatable SRS.” In Advances in Cryptology–EUROCRYPT 2020: 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, May 10–14, 2020, Proceedings, Part I 39, pp. 738-768. Springer International Publishing, 2020.

Arithmetization

Ambiguous. The process of turning a generic statement or question into a set of equations to be verified or solved - also refers to the output of that process.


Arithmetization is the process of turning a generic statement or question into a set of equation to be verified or solved. Consider the following statement: “I am twice older than my youngest sibling”. Can we write this mathematically? Let’s write the ages of the siblings and the age of the claimant. We can now rewrite the statement as:

To verify the original statement for Alice (8) and her siblings Bob (9) and Charlie (4), we can just plug in the values to our equation. We evaluate each side and determine whether the equation holds: if it holds the statement was true, if it does not the statement was false. Simple as that!

A “good” arithmetization is one in which the final mathematical expressions can be evaluated with minimal effort (computation). While our example was trivial, the process of arithmetization becomes more complex for abstract statements such as: “I have correctly shuffled a deck of cards” or “I know a secret value such that running a public program with as input will output the public value ”.


See also: R1CS, PLONK Arithmetization

Base Field

Elliptic Curve Cryptography. The finite field from which coordinates of an elliptic curve point are chosen.

⚠️ Prerequisites: Finite fields.

See Elliptic Curve.

Circuit

In ZK literature this usually refers to an arithmetic circuit: an ordered collection of additions and multiplications (and sometimes custom operations) that are applied to a set of inputs to yield an output.


The word “circuit” is used somewhat ambiguously, but most of the times we refer to an arithmetic circuit. An arithmetic circuit is an ordered collection of operations (e.g. addition, multiplication) represented by gates. These gates are connected by wires. Given an arithmetic circuit, we can apply an input signal, allow it to propagate through the wires and gates and observe the output. Below is an example of an arithmetic circuit that expects 3 inputs , and , and computes:

Why are we so obsessed with arithmetic circuits? Finding a set of input values that produce a desired output is a hard problem. This problem is known as the circuit satisfiability problem and has been heavily studied in complexity theory. One notable result is that any provable statement can be converted to the satisfiability problem (more formally, the circuit satisfiability problem is NP-complete). This comes in handy when we want to construct proof systems for generic statements.

Coefficient Form

Of a polynomial. Represent a polynomial as a list of the coefficients associated to each power of the indeterminate variable.


Let’s look at an example polynomial , defined as . How can we describe this polynomial in a computer program?

One approach is to record a list of the coefficients in front of every power of the indeterminate : p_coeffs = [5, 0, 4, 1]. Here we ordered the coefficients from lowest to highest power; conveniently, the index of the elements of our array correspond to the power of . As we recorded coefficients, this is known as the “coefficient form”.

Another equivalent representation would be to provide evaluations of at a large enough number of known points (see Evaluation Form).

Coefficient Form vs Evaluation Form.

There is no strictly superior representation. The coefficient form allows for a more lightweight representation of sparse polynomials (polynomials where many of the coefficients are ). Indeed, we only need to record the non-zero coefficients. On the other hand, some operations such as polynomial multiplication are much more expensive in coefficient form () than they are in evaluation form ().

We can convert from coefficient form to evaluation form by evaluating the polynomial at points. The operation that converts from evaluation form back to coefficient form is known as polynomial interpolation (Lagrange interpolation is one way to perform this operation).

Common Reference String

The collection of all public parameters required to run a protocol - usually in the context of SNARKs and polynomial commitment schemes.


In polynomial commitment schemes and SNARKs - like may other cryptographic protocols - different parties need to agree on some common parameters. These often include what kind of mathematical objects they will be manipulating (integers, points on an elliptic curve, lattices, etc…), a prime number, a generator of a cyclic group. Once a setup has been run to determine all the necessary parameters, they are collected into a string and are published for future reference. This collection (string) is what we call a common reference string.

Trusted Setups. In some cases the common parameters are constructed using information that needs to remain secret for the protocol to be secure. In this case we refer to the setup as a Trusted Setup and the string of parameters as a Structured Reference String.

Completeness

Of a proof system. A proof system is complete if, for every TRUE statement, the Prover can always produce an accepting proof.


Completeness is a property of a proof system best understood as: “an honest Prover should always be able to convince an honest Verifier of a valid statement”.

Given a relation and the associated language , the mathematical expression for completeness looks often like the equation below: where and are the honest Prover and Verifier respectively, and denotes the bit output by at the end of the interaction with for the instance .

Perfect completeness vs Completeness.
The definition above imposes that the honest prover always convince the honest verifier (i.e. the probability is 1). We call this property perfect completeness. Sometimes this is not necessary, nor is it achievable. In those cases, we can relax the success probability to be greater than , where is something small.

Constraints

In arithmetized circuits. A constraint is an equation that relates gate inputs to gate outputs.

Elliptic Curve

A set of coordinate pairs that satisfy the equation .

⚠️ Prerequisites: Finite fields.

An elliptic curve is defined as the set of all coordinate pairs such that:

In elliptic curve cryptography, we pick and to be elements of a finite field. We refer to this field as the base field.

We then define a group operation which we call “point addition” and usually denote with the ‘’ symbol. Repeated applications of the group operations can be counted like below:

The number ‘’ above is referred to as a scalar. If we are in a cyclic group, then the scalars also define a finite field: we call this field the scalar field.

Evaluation Form

Of a polynomial. Represent a polynomial as a list of evaluations of the polynomial at given points.


Let’s look at an example polynomial , defined as . How can we describe this polynomial in a computer program?

One approach is to provide evaluations of at a large enough number of known points. In fact, for a polynomial of degree , we will always need at least points. In our case, we could evaluate at 4 points: , , and . We could record this in an array as p_evals = [5, 10, 29, 68]. This recording of evaluations is aptly named the “evaluation form”.

Another equivalent representation would be to provide a list of the coefficients in front of every power of the indeterminate (see Coefficient Form).

Coefficient Form vs Evaluation Form. There is no strictly superior representation. The coefficient form allows for a more lightweight representation of sparse polynomials (polynomials where many of the coefficients are ). Indeed, we only need to record the non-zero coefficients. On the other hand, some operations such as polynomial multiplication are much more expensive in coefficient form () than they are in evaluation form ().

We can convert from coefficient form to evaluation form by evaluating the polynomial at points. The operation that converts from evaluation form back to coefficient form is known as polynomial interpolation (Lagrange interpolation is one way to perform this operation).

Fast Fourier Transform (FFT)

An algorithm that allows us to efficiently “convert” between two equivalent representations of polynomials: either as a list of coefficients (coefficient form) or as a list of evaluations of the polynomial (evaluation form).


COMING SOON.

Fiat-Shamir Heuristic

Turn a public-coin interactive protocol into a non-interactive protocol (in the random oracle model).


A Cautionary Tale. Implementing the Fiat-Shamir heuristic is often a cause for security-critical bugs as described in this Trail of Bits blog post.

Instance

(Of a general purpose SNARK). The public inputs/outputs of a circuit.


In a general purpose SNARK, we call the instance the collection of public values, whether they are inputs, or desired outputs of some computation.

“Instance” is sometimes used interchangeably with the term “statement” although these terms are not exactly the same1.

1

The terms instance, witness and statement come from complexity theory and the study of relations. Read more in our introductory article

Lagrange Interpolation

A method to construct the unique polynomial of degree that passes through given points.


⚠️ Prerequisites: Coefficient vs Evaluation Form.

Lagrange interpolation is one way to perform polynomial interpolation (recall that this is the process of recovering a polynomial from a set of known evaluations, see prerequisites above).

Let’s say we’ve been given then following point-evaluation pairs , , and . Since we have 4 pairs, we will be able to interpolate the unique polynomial of degree 4 such that for all .

The strategy is the following: for each , we create a polynomial which evaluates to 0 at all the points we were given except , where it evaluates to 1. We can then express the polynomial as:

The set of all polynomials is known as the Lagrange basis for the evaluation domain (here, the set of all values).

Language

See the introductory article on What does it mean to prove?.

Nullifier

A private value which, once revealed, invalidates (or “nullifies”) some associated object.


The term “nullifier” is somewhat loosely defined and appears more in system specs than in academic papers. In general it refers to a private value which, once revealed, invalidates some associated object.

Example: Nullifiers to Prevent Double-Spending

A notorious example system that uses nullifiers is Tornado Cash:

  • Users deposit funds into the smart contract and associate a unique, secret nullifier value to their deposit.
  • Later one may withdraw from the contract by revealing the nullifier associated with the original deposit. A zero-knowledge proof attests to the fact that the nullifier is associated with one of the contract’s deposits but does not reveal which one.
  • Upon successful verification of the nullifier and proof, the smart contract allows the withdrawal.

The smart contract keeps track of all the nullifiers it has seen: if the same nullifier is presented a second time, it must be the case that the funds have already been released and that this user is attempting to cheat by withdrawing more funds than they deposited!

Oracle

Computing. An oracle is a black box that efficiently performs a computation: e.g. a random oracle produces random values, a polynomial oracle evaluates a polynomial at any requested input point.


An oracle is a black box that solves a specific computational problem in constant time. Oracles are used in the theoretical study of computing and complexity theory.

By interacting with an oracle, one can learn the output of the oracle’s computation but learns nothing about its inner workings.

PLONKish Arithmetization

A group of circuit arithmetizations derived from the PLONK paper [GWC19] – their core ingredients are a gate equation and copy constraints that “wire” the gates together.


⚠️ Prerequisites: Arithmetization, Circuit.

PLONK Arithmetization

The PLONK arithmetization was first proposed in the PLONK paper [GWC19] as a means to arithmetize circuits where each gate has 2 inputs and 1 output. Each gate is expressed using a “gate equation”; the wiring is represented as permutations.

(Gate Equation). A single gate with inputs and output is defined by the equation: The values are known as “selectors” and allow to specialize each gate into enforcing a specific operation. The table below provides some examples:

Gate TypeSelector ValuesSimplified Gate Equation
Addition, , , ,
Multiplication, , , ,
Public Input is , ,

(Copy Constraints). Gates are “connected” using copy constraints. Consider two gates for which we want to enforce that the output of the first gate is the left input to the second gate. Let’s label our wires: left input, right input and output of the first gate will be , while left input, right input and output of the second gate will be . To enforce that the value on wire is the same as the value on wire , we show that these values are interchangeable; i.e. that they can be permuted without affecting the validity of the gate equations.

A full PLONK circuit is defined by the matrix of all selector values and the copy permutations.

PLONKish: Variants and Extensions

The PLONK blueprint (gate equation & copy constraints) is extremely versatile and expressive, and has been declined into many variants. The term PLONKish was coined by Daira Hopwood to describe such variants.

Additional features include:

  • custom gates - we can add custom functions to the gate equation. To do so we create a new selector and multiply it by some function of the wire values . The new gate equation becomes: .
  • larger fan-in and fan-out - the PLONK arithmetization can be extended to support more than 2 inputs and 1 output for each gate.
  • lookup tables - the gate equation can also be extended to allow checking that some input value is a member of a table of values.

Some commonly used names for these variants:

  • TurboPLONK PLONK arithmetization + custom gates + larger fan-in/fan-out
  • PlonkUp [PFMBM22] PLONK arithmetization + lookup tables using plookup.
  • UltraPLONK PLONK arithmetization + custom gates + larger fan-in/fan-out + lookup tables using plookup.
  • halo2 arithmetization [Zcash] PLONK arithmetization + custom gates + larger fan-in/fan-out + lookup tables using the Halo2 lookup argument.

Additional Resources. In this ZKSummit talk, Zac Williamson, co-author of PLONK, covers the PLONK arithmetization as well as its TurboPLONK extension. The halo2 documentation also provides a thorough explanation of the PLONKish arithmetization used in halo2.

Circuits vs Machine Computation. While the PLONK arithmetization was originally designed to capture arithmetic circuits, its PLONKish extensions are now general enough that they also capture machine computations.

References

[GWC19] Gabizon, Ariel, Zachary J. Williamson, and Oana Ciobotaru. “PLONK: Permutations over Lagrange-bases for Oecumenical Non-interactive arguments of Knowledge.” Cryptology ePrint Archive (2019).

[PFMBM22] Pearson, Luke, Joshua Fitzgerald, Héctor Masip, Marta Bellés-Muñoz, and Jose Luis Muñoz-Tapia. “Plonkup: Reconciling plonk with plookup.” Cryptology ePrint Archive (2022).

[Zcash] Zcash. “The halo2 Book”.

Polynomial Commitment Scheme

A two-phase protocol: in the first phase, a Prover commits to a polynomial by emitting a public commitment; in the second phase the Verifier chooses a value , and the Prover produces a value and convinces the Verifier that .


Overview

A polynomial commitment scheme is a two phase protocol: the first phase is known as the commitment phase, the second as the evaluation phase.

In the commitment phase, a Prover generates a commitment to some polynomial . The type of this commitment object varies depending on the polynomial commitment scheme being used (e.g. a single elliptic curve point for KZG commitments or a vector of elliptic points in IPA commitments).

The evaluation phase is often an interactive protocol between the Prover and a Verifier. The Verifier chooses a point at which it wants to learn the evaluation of . The prover can then produce a value and an “opening proof”. Finally, the Prover and Verifier use the commitment, claimed evaluation and the opening proof as inputs to a (sometimes interactive) protocol to convince the Verifier that .

Properties

We are interested in polynomial commitment schemes that are

  • binding: a prover cannot produce false opening proofs for a committed polynomial.
  • (optionally) hiding: the commitment reveals nothing about the polynomial.

Some Popular Schemes. KZG (or Kate) commitments, IPA (inner-product argument), FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity).

Polynomial Interactive Oracle Proof (PIOP)

An interactive proof system in which the prover computes polynomials and the verifier can query these polynomials at evaluation points of her choice.


Polynomial Interactive Oracle Proofs (PIOP, polyIOP or polynomial IOP) emerged from the development of SNARKs and were later formally defined in the DARK paper [BFS20]. These are interactive protocols between a prover and a verifier. With each message the prover produces an oracle and the verifier gets to query any oracles it has received from the prover. In a PIOP, the prover can only produce oracles that evaluate polynomials with degree lower than a given bound.

The following figure is taken from zk-SNARKs: A Gentle Introduction [Nit20]1 and illustrates the prover-verifier interaction in a PIOP:

PIOP protocol diagram from [Nit20]

AHP or Polynomial IOP? Algrebraic holographic proofs and polynomial interactive oracle proofs are almost equivalent notions. They were developed concurrently in 2019 by separate research groups: the former by the group behind Marlin [CHMMVW20] and the latter by the group behind DARK [BFS20]. While they formalise very similar proof systems, polynomial IOPs are more general in that they do not require holography (as defined in the AHP article).

References

[BFS20] Bünz, Benedikt, Ben Fisch, and Alan Szepieniec. “Transparent SNARKs from DARK compilers.” In Advances in Cryptology–EUROCRYPT 2020: 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, May 10–14, 2020, Proceedings, Part I 39, pp. 677-706. Springer International Publishing, 2020.

[CHMMVW20] Chiesa, Alessandro, Yuncong Hu, Mary Maller, Pratyush Mishra, Noah Vesely, and Nicholas Ward. “Marlin: Preprocessing zkSNARKs with universal and updatable SRS.” In Advances in Cryptology–EUROCRYPT 2020: 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, May 10–14, 2020, Proceedings, Part I 39, pp. 738-768. Springer International Publishing, 2020.

[Nit20] Anca Nitulescu. zk-SNARKs: a Gentle Introduction. 2020. https://www.di.ens.fr/~nitulesc/files/Survey-SNARKs.pdf.

1

A great read for the technically-versed and curious reader trying to get a global overview of SNARKs in 2021/2022.

Preprocessing SNARK

A SNARK in which the circuit - or any equivalent description of the computation - is encoded into a proving key and verifying key that are produced ahead of time and independently from any proof.

Public-coin

A public-coin algorithm is an algorithm in which all random values are derived publicly - i.e. without the need for private information or evaluation methods.

Quadratic Arithmetic Program (QAP)

One way (amongst many others) to encode an arithmetic circuit into a set of arithmetic equations. See Arithmetization.

R1CS

Acronym: Rank-1 Constraint System. A circuit arithmetization based on matrix equalities: each matrix row enforces a constraint over linear combinations of the circuit’s wires.


⚠️ Prerequisites: Arithmetization, Circuit.

Rank-1 Constraint Systems (R1CS) are one way (amongst many others) to encode an arithmetic circuit. An R1CS circuit is defined by the following equation: where , and are square matrices of identical sizes, is a vector and ‘’ denotes the element-wise product.

The matrices , and are fixed and define the circuit. To satisfy the R1CS circuit, a prover must find a vector for which the equality above is true. We can also include public inputs by changing the game slightly: we choose a vector of public inputs and require that the prover finds a vector such that the vector satisfies the R1CS equality[^1]. We call the instance vector and the witness vector.

R1CS in Action. You can see R1CS in action in Module 6 of the ZK Whiteboard Sessions from 7:56 to 12:00. There, Mary Maller shows how to populate R1CS matrices to enforce that a value decomposes into three bits by enforcing the following equalities: Note that she is working with a different yet equivalent matrix equation:

[^1] Here ‘’ denotes the concatenation operation.

Random Oracle Model (ROM)

An approximation of the real world in which the outputs of certain computations (like hashing or signing a message) are seen as truly random.


The random oracle model is a tool that we use to write security proofs for our cryptographic constructions - it is a representation of the world in which we claim that random oracles exist. A random oracle is a black box to which we ask queries (send values) and receive a response: a value chosen at random. The random oracle’s responses are

  • consistent: if I give it a query that it has already seen, the random oracle will give me the same response it gave previously.
  • unpredictable: the responses follow a uniform distribution over the oracle’s output domain (this could be integers, strings of 10 characters, points on a curve, etc…).
  • unrelated to the query: nothing about a random oracle’s response gives me information about the query.

Hash Functions as Random Oracles. When implementing protocols that are secure in the ROM, we will often claim that cryptographic hash functions (SHA256, Keccak/SHA3, Poseidon) behave like random oracles. This claim usually holds, however we cannot use any hash function. The first ZKHack puzzle explores the dangers of using a poorly-selected hash.

Relation

A relation is a set of pairs ; the condition that makes a pair part of a relation may be arbitrarily defined (“the set of numbers that I think go well together”) or follow some rule (“ is greater than ”).


See the introductory article on What does it mean to prove?.

Roots of Unity

The -th root of unity is a number such that ; we often call roots of unity the set .


The -th root of unity is a number such that . When we use the plural, -th roots of unity, we refer to the set .

Example in the complex numbers

If we were working in a regular (read: not finite) field, the roots of unity would be complex numbers. For example you might already be familiar with the complex number which is in fact the 4-th root of unity in the set of complex numbers (you can check this by computing ). The associated set would be .

Example in a finite field

In a finite field, the definition of “roots of unity” directly maps to that of a multiplicative subgroup of order . For example let’s consider the field with prime characteristic . The elements of this field are , , , , , and . Let’s look at the powers of :

So is the -rd root of unity with the corresponding subgroup of order 3 being

Scalar Field

Elliptic Curve Cryptography. The finite field defined by counting the number of repeated applications of the group operation (i.e. point addition).

⚠️ Prerequisites: Finite fields.

See Elliptic Curve.

Schwartz-Zippel Lemma

A little mathematical theorem that allows us to assert with high probability whether two polynomials are identical by evaluating both at the same randomly chosen input; this is very efficient!


⚠️ Math: polynomials, probabilities. In the spirit of this “jargon decoder”, we do not look at the lemma directly but instead give informal intuition for why it is useful in ZK arguments

The Schwartz-Zippel lemma is a central component to many arguments of knowledge because it allows us to efficiently check whether two polynomials are identical.

Consider two straight lines (degree 1 polynomials, ): these are either identical or intersect at most once. Similarly, two parabolas (degree 2 polynomials, ) are either identical or will intersect at most twice. Degree 3 polynomial will have at most 3 intersections, etc… The Schwartz-Zippel lemma can be used to show that this pattern is true for polynomials of any degree : their evaluations will only be equal to each other for at most inputs.

With that in mind, imagine that we evaluate two polynomials of degree at most at a single point in some evaluation domain and find that they agree. We know that we are in one of two cases:

  • case 1: our polynomials are identical, all their coefficients are the same.
  • case 2: our polynomials are different and we are at one of the points where they agree. If our evaluation point was chosen at random in , the probability of finding ourselves in this case is (notation: denotes the order of , i.e. the number of values in the set ). For a sufficiently large compared to , this probability will approach 0.

since case 2 is extremely unlikely we can assert with high probability that the two polynomials were identical.

Formalities - only for those who want them: the lemma itself does not compare polynomials but instead is interested in the probability that a polynomial evaluates to 0. Notice that a non-zero polynomial of degree has at most roots (if it had more roots, it would be of higher degree!). Therefore let be a non-zero polynomial of degree , be the evaluation domain of , and an element of chosen uniformly at random, we can write:

When , this probability is negligible. Therefore if for a value uniformly chosen at random we find that , it is overwhelmingly likely that is the zero polynomial. To check whether two polynomials and are identical, we can check whether the polynomial is the zero polynomial.

SNARK

Acronym: succinct non-interactive argument of knowledge. An argument system in which a Prover can produce a short, single-message proof attesting that she knows some information that shows the truth of a statement.


In this article, we will define a number of notions relevant to proof systems and SNARKs by using the example of a Sudoku grid. Look out for words in bold: these are common jargon words which get mapped to our simple example.

Statement, Instance and Witness

Let’s consider a Sudoku grid, a specific instance of a problem. Alice makes the following statement: “This grid has a solution”. If it exists, this solution would be a witness to the fact that the statement is true. But maybe the grid is hard to solve and just looking at the instance (the grid) is not enough to be convinced that a witness (a solution) exists. What can Alice do to convince us that the witness exists?

Proof vs Argument

To support her statement, Alice shows us that the Sudoku grid has been published in many well-known and respected newspapers. In most cases this would be enough to convince us that the statement is true: surely they wouldn’t publish a “wrong” grid. However, if Alice were very powerful she could have been able to produce modified copies of the papers, or pay the news editors to include an unsolvable grid. Because there is still a small chance that Alice is cheating, we do not say that Alice produced an irrefutable proof but rather say that Alice has produced an argument.

More formally, a proof cannot be falsified by a computationally unbounded adversary. On the other hand an argument can be falsified by such an adversarial prover. In practice, the bound can be set so high that we estimate that no entity in the world would be able to reliably operate above such a bound. With that in mind, we often abuse the distinction above and use the word “proof” to describe the outputs of both a proof system and an argument system.

Arguments of Knowledge

Notice that in her argument, Alice showed the existence of the witness but may not know the witness herself. What if her statement was “I know the solution to this grid”? Here, Alice would have to produce an argument of knowledge, one by which she can convince us that she knows the solution to the Sudoku. The most naive approach to demonstrate knowledge would simply be for Alice to show us the witness. Since it is impossible to show us the solution without knowing it, this would constitute a proof of knowledge. While this proof of knowledge is non-interactive, it is not succinct: the message that Alice sent is of the same size as the witness. Can we do any better?

SNARKs

We finally get to the notion of a SNARK: a succinct non-interactive argument of knowledge. We have already learnt from the above that an argument of knowledge allows a bounded prover to show that she knows the witness that supports a statement. The additional properties of a SNARK are:

  • succinctness: the proof is substantially shorter than the statement or the witness.
  • non-interactive: the proof is contained in a single message.

A SNARK can optionally implement the zero-knowledge property: such a SNARK reveals no information about the witness other than what can be implied by evaluating the truth of the statement. We call such a SNARK a zkSNARK (zero-knowledge SNARK). In our case, Alice could convince us that she knows the solution to the Sudoku without giving us any hints as to what the solution is!

SNARKs in Practice [Advanced]. In practice we do not look at Sudokus but instead consider an arithmetic circuit. The circuit and any public inputs are the instance (just like the Sudoku 9x9 grid and the numbers that already populate it), the statement is the claim that the circuit can evaluate to a desired value, and the witness is all the prover-chosen inputs and intermediate values in the circuit. A famous result of complexity theory shows that any provable statement can be converted into this form (namely that the circuit satisfiability problem is NP-complete).

We can write a proof for the statement about the circuit by expressing everything as polynomials and running a polynomial interactive oracle proof. This proof can then be made non-interactive by using the Fiat-Shamir heuristic, and succinct by using a polynomial commitment scheme.


See also: instance, witness, common reference string, structure reference string, trusted setup, preprocessing SNARK

Soundness

Of a proof system. A proof system is sound if, for every FALSE statement, the Prover can (almost) never produce an accepting proof.


Soundness is a property of a proof system best understood as: “a malicious Prover should not be able to convince the Verifier of an invalid statement”.

Given a relation and the associated language , the mathematical expression for completeness looks often like the equation below: where is any malicious prover, is the honest Verifier, denotes the bit output by at the end of the interaction with for the instance and is a small number. We call the soundness error.

Statistical vs Computational Soundness. The notion of soundness described above is known as statistical soundness or information-theoretic soundness. It considers all possible adversaries, including those with unlimited resources. In most real-world applications, we are only concerned with bounded adversaries: we usually limit ourselves to adversaries running probabilistic polynomial-time algorithms. This adversarial model is formalised by the notion of an argument (rather than a proof, see our discussion here) and that of computational soundness.

Computational soundness is defined in the same was as information-theoretic soundness but is only required to hold against probabilistic polynomial-time adversaries.

Stronger Soundness variations

Soundness comes in several levels of strength. Above is the most basic version according to which an honest verifier should only accept a proof if it corresponds to a valid statement. The following describes increasingly stronger versions of soundness.

Knowledge Soundness

When constructing arguments of knowledge, a notion stronger than simple soundness is required: knowledge soundness. In this strenghtened scenario, an honest verifier should only accept a proof if it corresponds to a valid statement and it was generated by a prover who knows a valid witness for this statement.

Formally, this is defined in terms of an extractor: an argument system is said to be knowledge sound if there exists an efficient algorithm that, given the prover’s algorithm and internal state, can extract a valid witness for the proven statement. Intuitively, this models the desired property that the prover “knows” a witness for this statement.

Simulation Knowledge Soundness (or Simulation-Extractability)

Another useful and even stronger notion of soundness is simulation-extractability. Informally, it states that an honest verifier should only accept a proof if it corresponds to a valid statement and was generated by a prover who knows a valid witness, even if the prover is allowed to observe honestly generated proofs for arbitrary statements.

Intuitively, if an argument system is simulation-extractable, then it is non-malleable: an adversary can’t observe a proof for some statement generated by a third party and maul it into a proof for another statement for which the adversary does not know a valid witness.

This captures many real-world scenarios where honestly generated proofs are made public and may be seen by potential attackers. As such, it has become the “gold standard” in terms of security notions

Structured Reference String

A collection of public parameters for a protocol which were created using information that must remain secret to ensure the protocol’s security.


See Common Reference String and the note at the bottom of the page.

Threshold Encryption

A public-key encryption scheme in which the secret key is shared amongst parties: to decrypt a ciphertext, we require that a group of strictly more than participants collaborate; is called the threshold.

Trusted Setup

An algorithm that determines a protocol’s public parameters using information that must remain secret to ensure the protocol’s security.


See Common Reference String and the note at the bottom of the page.

Vanishing Polynomial

The unique polynomial of degree that evaluates to at all points of a domain of size .


The vanishing polynomial over a set , often denoted , is the polynomial which has a single root at each element of . It can be explicitly written as:

By definition, the vanishing polynomial over will have the following properties:

  • it is unique.
  • it is of degree .
  • for all , . (Hence the name vanishing!)

Vanishing Polynomial for Roots of Unity. When the set is a set of -th roots of unity, the vanishing polynomial can be expressed as: To see why this is the case, let’s consider the set which we know are the -rd roots of unity in (see the Roots of Unity article): Why does this matter? Expressing in this way makes it very efficient to evaluate it at one point!

Witness

(Of a general purpose SNARK). All the circuit variables that the verifier does not see: the prover’s private inputs and all the intermediate values computed in the circuit.

In a general purpose SNARK, we call the witness the collection of values that the verifier does not read; either because they are private to the prover, or because they are too long to satisfy the succinctness property.

Consider the following example:

Peggy sends a string string to Victor and wants to convince him that she knows a secret value seed such that applying the Keccak hash 1000 times to it produces string. That is, string = Keccak(Keccak(...(Keccak(seed)...))).

Peggy and Victor therefore agree on a SNARK circuit that takes seed as a private input and string as a public input. The circuit applies Keccak once to seed to produce a 1st intermediate value. This value is then fed into Keccak again, producing another intermediate value. This process is repeated 1000 times until a final value is produced. Finally, this last value is compared to string.

In this example, Peggy wants to keep the value seed secret, so it must certainly be part of the witness. Secondly, notice that the circuit produces 1000 intermediate values. Victor (rightfully so) doesn’t care about these values and shouldn’t have to read through them! He just wants to see string and a short proof that Peggy is saying the truth. So the 1000 intermediate values are also part of the witness.

Zero-knowledge Proof

A proof that reveals no more information than the validity of the statement it supports.

COMING SOON.