Probabilistically Checkable Proofs (PCP)
April 30, 2023
This post gives a gentle introduction to probabilistically checkable proofs (PCPs), a pearl of computational complexity theory.
Introduction
Consider the canonical NP problem, Boolean satisfiability (SAT). To determine whether a Boolean formula is satisfiable, a deterministic verifier needs to check a polynomial-length proof: the assignment to each variable. But what if the verifier is allowed to use randomness and to fail with a small probability? It turns out that the probabilistic verifier still needs a polynomial-size proof, but it does not need to look at all of it. With oracle access to the proof, it only needs to make a constant number of queries, i.e., check a constant number of locations of the proof. The proof string is of course much more complex than a plain assignment of the variables. We call these proofs "probabilistically checkable proofs" (PCPs). The aforementioned fact about NP is the famous PCP theorem, and in this post, we will walk through the key components of its proof.
Definition
A language $L$ is in the complexity class $\mathrm{PCP}_{c,s}[r, q]$ if there exists a polynomial-time probabilistic oracle Turing machine $V$, called the PCP verifier, such that
- $\Sigma$ is the alphabet for the proofs
- $V$ uses $O(r(n))$ bits of randomness and makes $O(q(n))$ queries to the proof oracle
- Completeness: If $x \in L$, there exists a proof $\pi \in \Sigma^*$ such that $\Pr[V^{\pi}(x) = 1] \geq c$ for some completeness parameter $c$
- Soundness: If $x \notin L$, for all proofs $\pi \in \Sigma^*$, $\Pr[V^{\pi}(x) = 1] \leq s$ for some soundness parameter $s < c$
Basic Examples and Results
- $\mathrm{PCP}[0, 0] = \mathrm{P}$
- $\mathrm{PCP}[O(\log n), 0] = \mathrm{P}$
- $\mathrm{PCP}_{c,s}[\mathrm{poly}(n), 0] = \mathrm{BPP}$
- $\mathrm{PCP}[0, \mathrm{poly}(n)] = \mathrm{NP}$
- $\mathrm{PCP}[O(\log n), \mathrm{poly}(n)] = \mathrm{NP}$
Usually $\Sigma = \{0, 1\}$, and we simply drop $\Sigma$. Further, if we drop the completeness and soundness parameters, we implicitly assume that $c = 1$ and $s = 1/2$. Note that given $O(r)$ bits of randomness and $O(q)$ queries, the verifier can check at most $2^{O(r)} \cdot O(q)$ locations of the proof, which is the effective size of any proof. (Here, we assume the verifier is non-adaptive.)
The Statement of the PCP Theorem
The PCP theorem states that $\mathrm{NP} = \mathrm{PCP}[O(\log n), O(1)]$. The theorem demands a very intricate proof system: one of polynomial size that requires only a constant number of queries. The proof we will explore is based on "proof composition". We will construct two weaker PCPs: a very long (exponential-size) proof that requires only a constant number of queries, and a short (polynomial-size) proof that requires polylogarithmically many queries. Then we will compose them to take the merits of both. Thus, the outline of the proof contains three main components: (1) an exponential-size PCP, (2) a polynomial-size PCP, and (3) the proof composition. Generally speaking, every PCP is a clever encoding of the plain proof, so we will use different codes borrowed from coding theory throughout the journey.
Exponential-size PCP
In this section, we show that $\mathrm{NP} \subseteq \mathrm{PCP}[\mathrm{poly}(n), O(1)]$. Note that the effective proof size is $2^{\mathrm{poly}(n)}$, hence the name exponential-size PCP. To prove this statement, we will consider an NP-complete problem called QUADEQ, which asks whether a system of quadratic equations has a satisfying assignment. The prover will use the Walsh-Hadamard code to construct an exponential-size PCP of the satisfying assignment. The verifier will first use linearity testing to verify the validity of the codeword, i.e., to check that the proof is indeed the Walsh-Hadamard code of some string. Then it can use a constant number of queries to verify, with high probability, that the string is a satisfying assignment.
QUADEQ
A quadratic equation in the variables $x_1, \dots, x_n$ over $\mathbb{F}_2$ has the form $\sum_{i,j} a_{ij} x_i x_j = b$. A system of quadratic equations over $\mathbb{F}_2$ is in QUADEQ if the system has a satisfying assignment $x \in \{0,1\}^n$.
The polynomial-size proof of QUADEQ is simply a satisfying assignment of the variables, so QUADEQ is clearly in NP. To see that QUADEQ is NP-complete, it is easiest to reduce from circuit satisfiability. We can assign a variable to represent the output of each gate in the circuit, and write one quadratic equation per gate over $\mathbb{F}_2$:
- AND gate $z = x \wedge y$: $z = xy$; OR gate $z = x \vee y$: $z = x + y + xy$; NOT gate $z = \neg x$: $z = 1 + x$; and $o = 1$ for the output gate $o$.
This conversion of a problem instance to a set of equations is known as arithmetization.
Walsh-Hadamard Code
To construct an exponential-size PCP, we will make use of the Walsh-Hadamard code. Given a message $u \in \{0,1\}^n$, its Walsh-Hadamard codeword is $\mathrm{WH}(u) \in \{0,1\}^{2^n}$, where $\mathrm{WH}(u)_x = u \odot x$. Here $x \in \{0,1\}^n$ is the index into $\mathrm{WH}(u)$, a string of exponential size $2^n$, and $u \odot x = \sum_{i=1}^{n} u_i x_i \pmod 2$ is the dot product of $u$ and $x$.
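To make the encoding concrete, here is a minimal Python sketch (an illustration with hypothetical helper names, not part of the original construction): the codeword simply lists $u \odot x$ for every index $x \in \{0,1\}^n$, so an $n$-bit message blows up to $2^n$ bits.

```python
from itertools import product

def dot(u, x):
    """Dot product over F_2."""
    return sum(ui * xi for ui, xi in zip(u, x)) % 2

def walsh_hadamard_encode(u):
    """Return the 2^n-bit Walsh-Hadamard codeword of the n-bit message u."""
    n = len(u)
    return [dot(u, x) for x in product([0, 1], repeat=n)]

print(walsh_hadamard_encode([1, 0, 1]))  # an 8-bit codeword of a 3-bit message
```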
Random Subsum Principle
As a fundamental principle of coding, if the messages $u$ and $v$ differ in even a few locations, we want their codewords to differ in almost all locations. This is indeed true of the Walsh-Hadamard code, which can be summarized as the random subsum principle: if $u \neq v$, then $\Pr_{x \in \{0,1\}^n}[u \odot x \neq v \odot x] = 1/2$, i.e., two distinct codewords differ in exactly half of their locations.
Linear Functions
Another principle of coding is that the verifier should be able to check the validity of a codeword, i.e., that the given string is the codeword of some string, or at least close to the codeword of some string. To demonstrate this property, we will view the Walsh-Hadamard codeword of $u$ as a linear function $f_u : \{0,1\}^n \to \{0,1\}$ defined by $f_u(x) = u \odot x$. In general, a linear function satisfies $f(x + y) = f(x) + f(y)$. It's straightforward to see that $f_u$ is indeed a linear function. Conversely, every linear function $f$ can be written as $f_u$ for some $u$. Let $e_i$ be the string with a single $1$ at the $i$-th location. Then $f(x) = f\left(\sum_i x_i e_i\right) = \sum_i x_i f(e_i)$, so $f = f_u$ for $u = (f(e_1), \dots, f(e_n))$. Thus, the Walsh-Hadamard code is equivalent to the set of linear functions over $\{0,1\}^n$. To check the validity of a codeword is to test whether it is close to a linear function.
BLR Linearity Testing
Formally, we say that a function $f : \{0,1\}^n \to \{0,1\}$ is $\rho$-close to a linear function $g$ if $\Pr_{x \in \{0,1\}^n}[f(x) = g(x)] \geq \rho$.
The BLR linearity test is actually very simple: we randomly sample $x, y \in \{0,1\}^n$, query the locations $x$, $y$, and $x + y$, and test whether $f(x) + f(y) = f(x + y)$. Nevertheless, it is very powerful. We state the following theorem without proof: if

$$\Pr_{x, y \in \{0,1\}^n}[f(x) + f(y) = f(x + y)] \geq \rho$$

for some $\rho > 1/2$, then $f$ is $\rho$-close to some linear function.

This suggests a simple probabilistic algorithm for linearity testing, and hence for verifying a Walsh-Hadamard codeword.
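Below is a minimal sketch of the test in Python, with oracle access to $f$ modeled as a callable (the function name is illustrative):

```python
import random

def blr_test(f, n, trials=100):
    """BLR linearity test: sample x, y and check f(x) + f(y) = f(x + y) over F_2.
    f is oracle access to a function {0,1}^n -> {0,1}; returns the fraction of
    passing trials, which estimates the agreement parameter rho."""
    passed = 0
    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        y = [random.randint(0, 1) for _ in range(n)]
        x_plus_y = [(a + b) % 2 for a, b in zip(x, y)]
        if (f(x) + f(y)) % 2 == f(x_plus_y):
            passed += 1
    return passed / trials
```

A truly linear $f$ passes every trial, while, by the theorem above, any $f$ that passes with probability well above $1/2$ must be close to some linear function.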
Local Decoding
The Walsh-Hadamard code is also a locally decodable code, i.e., we can recover a single bit of a corrupted codeword while making only a constant number of queries. Suppose $f$ is $(1 - \delta)$-close to a linear function $\tilde{f}$ (a Walsh-Hadamard codeword), and we want to decode $\tilde{f}(x)$, but the location $x$ may be corrupted. Then we can pick a random $r \in \{0,1\}^n$, and since

$$\tilde{f}(x) = \tilde{f}(x + r) + \tilde{f}(r),$$

we can decode $\tilde{f}(x)$ by querying $f(x + r)$ and $f(r)$, if these two locations are not corrupted; since $x + r$ and $r$ are each uniformly distributed, this happens with probability at least $1 - 2\delta$. If the codeword were corrupted at every location with the same probability, this procedure would actually make things worse (we now make two queries instead of one). However, in our PCP system in the later sections, we only want to check specific locations of the codeword. In this case, the local decoding procedure guards against a malicious prover who concentrates the corruptions on the particular locations we are looking at.
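A minimal sketch of the decoder in Python (again, $f$ is a callable oracle and the name is illustrative):

```python
import random

def local_decode(f, x):
    """Locally decode position x of a (possibly corrupted) Walsh-Hadamard
    codeword f: query the two uniformly distributed locations x + r and r,
    and use linearity f~(x) = f~(x + r) + f~(r)."""
    r = [random.randint(0, 1) for _ in range(len(x))]
    x_plus_r = [(a + b) % 2 for a, b in zip(x, r)]
    return (f(x_plus_r) + f(r)) % 2
```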
An Exponential-size PCP for QUADEQ
QUADEQ asks, given $m$ quadratic equations in $n$ variables, whether there is a satisfying assignment of the variables. Equivalently, QUADEQ asks, given inputs $A$ and $b$, where $A$ is an $m \times n^2$ matrix and $b$ is an $m$-dimensional vector, whether there is an $n^2$-dimensional vector $w$ such that $Aw = b$ and $w = u \otimes u$ for some $u \in \{0,1\}^n$, where $u \otimes u$ denotes the vector with entries $(u \otimes u)_{i,j} = u_i u_j$. Here $u$ should be a satisfying assignment.
Our exponential-size PCP will be $f = \mathrm{WH}(u)$ and $g = \mathrm{WH}(u \otimes u)$, where $u$ is a satisfying assignment. Now the verifier will carry out the following steps:
- (Linearity Test) Check that $f$, $g$ are linear functions, i.e., $f$, $g$ are Walsh-Hadamard codewords.
- (Tensor Test) Check that $f$, $g$ encode consistent source words, i.e., $f$ encodes some $u$ and $g$ encodes $u \otimes u$.
- (Satisfiability Test) Check that $A(u \otimes u) = b$.
We’ve already discussed the linearity test. We will describe the remaining two steps.
Tensor Test
The idea of the tensor test is very straightforward. If $f = \mathrm{WH}(u)$ and $g = \mathrm{WH}(u \otimes u)$ for some $u$, then for all $r, r' \in \{0,1\}^n$,

$$f(r) \cdot f(r') = \left(\sum_i u_i r_i\right)\left(\sum_j u_j r'_j\right) = \sum_{i,j} u_i u_j r_i r'_j = g(r \otimes r').$$
Thus, given two linear functions $f$ and $g$, the tensor test is as follows:
- Choose $r, r' \in \{0,1\}^n$ uniformly at random.
- Accept if $f(r) \cdot f(r') = g(r \otimes r')$.
The protocol is clearly complete. It is also sound: if $f$ and $g$ are linear functions but $g$ does not encode $u \otimes u$, where $u$ is the word encoded by $f$, then based on the random subsum principle, it can be shown that each tensor test rejects with probability at least $1/4$. Thus, it is also efficient: we only need to repeat the test a constant number of times to achieve the desired parameters. The final caveat is that from linearity testing, we only have the guarantee that $f$ and $g$ are close to linear functions. This is a problem since we are only checking specific locations of $f$ and $g$. This is why the local decoding we discussed above is necessary to achieve the soundness constraints: every query in the tensor test should go through local decoding.
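A sketch of one round in Python, querying $f$ and $g$ directly for simplicity (in the actual verifier, each evaluation would go through `local_decode` above; names are illustrative):

```python
import random

def tensor(r, r_prime):
    """The outer product r (x) r', flattened into an n^2-bit index."""
    return [a * b for a in r for b in r_prime]

def tensor_test(f, g, n):
    """One round of the tensor test: accept iff f(r) f(r') = g(r (x) r')."""
    r = [random.randint(0, 1) for _ in range(n)]
    r_prime = [random.randint(0, 1) for _ in range(n)]
    return f(r) * f(r_prime) == g(tensor(r, r_prime))
```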
Satisfiability Test
The satisfiability test is also quite straightforward:
- Choose $r \in \{0,1\}^m$ uniformly at random.
- Accept if $g(A^T r) = r \odot b$.
Again, based on the random subsum principle, each test rejects correctly with probability at least $1/2$: note that $g(A^T r) = (u \otimes u) \odot (A^T r) = r \odot A(u \otimes u)$, so if $A(u \otimes u) \neq b$, a random subset sum distinguishes the two sides with probability $1/2$. Again, we only need to repeat the test a constant number of times to achieve the desired parameters.
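A sketch of one round in Python (again querying $g$ directly; in the real verifier this too goes through local decoding):

```python
import random

def satisfiability_test(g, A, b):
    """One round of the satisfiability test: pick a random subset of the
    equations, indexed by r in {0,1}^m, and check the subset-sum equation
    g(A^T r) = r . b, where g is oracle access to WH(u (x) u)."""
    m, n_sq = len(A), len(A[0])
    r = [random.randint(0, 1) for _ in range(m)]
    z = [sum(r[i] * A[i][j] for i in range(m)) % 2 for j in range(n_sq)]  # A^T r
    r_dot_b = sum(ri * bi for ri, bi in zip(r, b)) % 2
    return g(z) == r_dot_b
```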
Summary of Exponential-size PCP
As a recap, the exponential-size PCP verifier uses three testing procedures: the linearity test, the tensor test (with local decoding), and the satisfiability test (also with local decoding). Each procedure achieves constant soundness with a constant number of queries, and thus we can repeat each procedure a constant number of times to achieve overall constant soundness with a constant number of queries.
Polynomial-size PCP
Here, we show that $\mathrm{NP} \subseteq \mathrm{PCP}[O(\log n), \mathrm{polylog}(n)]$; the effective proof size is $2^{O(\log n)} \cdot \mathrm{polylog}(n) = \mathrm{poly}(n)$, hence the name polynomial-size PCP. To prove this statement, we will again consider the NP-complete problem QUADEQ. But instead of quadratic equations over $\mathbb{F}_2$, we will consider the general problem of quadratic equations over an arbitrary finite field $\mathbb{F}$, which is still an NP-complete problem. For the exponential-size PCP, the proof is some version of the satisfying assignment encoded by the Walsh-Hadamard code, which corresponds to linear functions. For the polynomial-size PCP, we will also use techniques from coding theory, namely low-degree extension and the Reed-Muller code, which are closely related and correspond to low-degree polynomial functions. The Reed-Muller code won't be used explicitly in the proof, but will be used to encode, or amplify, the problem instance so that the verifier can check satisfiability with a small number of random bits. Then, the main proof will be the low-degree extension of the satisfying assignment, and with the clever SumCheck protocol, the verifier can check the assignment with a small number of queries. Additionally, like linearity testing for the exponential-size PCP, the verifier will first need to verify the validity of the proof with low-degree testing.
Amplifying QUADEQ with Reed-Muller code
Let's remind ourselves of the QUADEQ problem: we are given $m$ quadratic equations in $n$ variables, $p_1(x) = 0, \dots, p_m(x) = 0$, and we want to know if there is a common zero of the equations $p_i$. For the exponential-size PCP, we had to use $m$ random bits in the satisfiability test, essentially because we checked the satisfiability of every equation at once. We will try to tackle this first.
First Naive Solution
What if we just pick an equation at random? This only uses $\log m$ random bits, which is on the order of $\log n$. The number of queries depends on the arity of the equation we selected, which could potentially be $n$, and that would be a problem. However, from the reduction from the circuit satisfiability problem, we can restrict the problem to quadratic equations of arity at most $3$, while still keeping it NP-complete. So we actually only need a constant number of queries.

However, the soundness is fairly bad. In the worst case, there is an assignment that satisfies all but one of the $m$ equations; then our test rejects with probability only $1/m$. Remember that we need constant soundness so that we can easily amplify it without blowing up the number of random bits or queries.
Reed-Solomon code
To resolve the problem, we need to leverage coding theory to “spread the error”. Roughly, we want to construct more equations so that if an original equation is not satisfied, a lot of constructed equations will not be satisfied.
To achieve this, consider a non-zero message of length $m$, $u = (u_1, \dots, u_m) \in \mathbb{F}^m$. The encoding $\mathrm{RS}(u)$ is a codeword of length $\ell$, which corresponds to the evaluations of a polynomial $P_u$ at $\ell$ arbitrary distinct locations, say $\alpha_1, \dots, \alpha_\ell \in \mathbb{F}$. The polynomial is defined by

$$P_u(z) = \sum_{i=1}^{m} u_i z^{i-1},$$

i.e., the coefficients of the polynomial are given by the message. Then the codeword is $\mathrm{RS}(u) = (P_u(\alpha_1), \dots, P_u(\alpha_\ell))$. Since $P_u$ is a polynomial of degree at most $m - 1$, it has at most $m - 1$ roots. And thus if $u \neq 0$, there are at most $m - 1$ zero coordinates in $\mathrm{RS}(u)$; then, if we pick a random coordinate of $\mathrm{RS}(u)$, we find a non-zero coordinate with probability at least $1 - (m-1)/\ell$, which can be made constant if we take, say, $\ell = 2m$.

This encoding procedure is a linear transformation via the Vandermonde matrix $V$:

$$V = \begin{pmatrix} 1 & \alpha_1 & \cdots & \alpha_1^{m-1} \\ 1 & \alpha_2 & \cdots & \alpha_2^{m-1} \\ \vdots & \vdots & & \vdots \\ 1 & \alpha_\ell & \cdots & \alpha_\ell^{m-1} \end{pmatrix},$$

such that $\mathrm{RS}(u) = Vu$.
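A minimal sketch of the encoder in Python over a small prime field (the names and the choice $\ell = 2m$ are illustrative):

```python
def rs_encode(u, alphas, p):
    """Reed-Solomon over F_p: evaluate P_u(z) = sum_i u[i] * z^i at each alpha."""
    def poly_eval(z):
        acc, power = 0, 1
        for coeff in u:
            acc = (acc + coeff * power) % p
            power = (power * z) % p
        return acc
    return [poly_eval(a) for a in alphas]

# Encode a length-3 message at ell = 6 distinct points of F_7:
print(rs_encode([1, 2, 3], alphas=[0, 1, 2, 3, 4, 5], p=7))
```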
If we collect our $m$ quadratic polynomials into a vector function $p = (p_1, \dots, p_m)$, we can similarly encode the vector function to get a new vector function $q = Vp$. Or equivalently, for every evaluation of our quadratic equations, we encode the result with $V$. Then if $p(x) \neq 0$, i.e., $p(x)$ has at least one non-zero coordinate, at least $\ell - m + 1$ coordinates of $q(x) = Vp(x)$ are non-zero.

In this way, we achieve constant soundness, and we don't need much randomness: we only need $\log \ell$ random bits, which is still on the order of $\log n$. However, we have a problem with the number of queries. Namely, to have $\ell$ distinct locations $\alpha_1, \dots, \alpha_\ell$, we need $|\mathbb{F}| \geq \ell > m$. But as we will see, we need the size of $\mathbb{F}$ to be much smaller to achieve a small number of queries.
Reed-Muller code
To reduce the size of $\mathbb{F}$, we need to use the Reed-Muller code, which extends Reed-Solomon from single-variable polynomials to multi-variable polynomials. Given a message of length $m$, $u = (u_1, \dots, u_m) \in \mathbb{F}^m$, we will use the message as the coefficients of a multi-variable polynomial $P_u(z_1, \dots, z_t)$. We assume that

$$m = \binom{t + d}{t},$$

the number of monomials of total degree at most $d$, for some integer $d$ representing the degree of $P_u$ and integer $t$ representing the number of variables. Then we can define

$$\mathrm{RM}(u)_z = P_u(z),$$

where $z \in \mathbb{F}^t$ indexes the codeword of length $|\mathbb{F}|^t$. By the Schwartz-Zippel lemma, a non-zero $P_u$ vanishes on at most a $d/|\mathbb{F}|$ fraction of the points, so with the Reed-Muller code we can choose $d$ and $t$ properly (e.g., $d, t = O(\log m)$) to ensure that $|\mathbb{F}| = \mathrm{polylog}(m)$ suffices. Further, we will have $|\mathbb{F}|^t$ new polynomial equations, and we will achieve constant soundness by randomly picking one equation.
Low-Degree Extension
Now, with the Reed-Muller code encoding the equations, we are left to deal with $|\mathbb{F}|^t$ equations, and we can randomly check one of them to achieve constant soundness. However, we have a big problem with the query complexity. Namely, both the Reed-Solomon and Reed-Muller codes "spread" the variables into the new polynomials, so each new polynomial may have arity as large as $n$. This would require us to query all $n$ positions of the assignment provided by the prover, but we want to achieve query complexity of $\mathrm{polylog}(n)$.
To circumvent checking all coordinates, we will require the prover to provide instead an encoding of the satisfying assignment: the low-degree extension of the assignment. Given an assignment $x \in \{0,1\}^n$, we will consider a subfield $\mathbb{F}_0 \subseteq \mathbb{F}$ and an integer $k$ such that $|\mathbb{F}_0|^k \geq n$. Then, we can view the assignment as a function $x : \mathbb{F}_0^k \to \{0, 1\}$. In effect, we are using $k$ coordinates over a small field to index the $n$ coordinates.

Finally, we will claim without proof (which is not complicated) that we can always construct a low-degree extension $\hat{x} : \mathbb{F}^k \to \mathbb{F}$ such that $\hat{x}$ agrees with $x$ on $\mathbb{F}_0^k$ and the individual degree of $\hat{x}$ is at most $|\mathbb{F}_0| - 1$. We will ask the prover to send the evaluation table of this low-degree extension as a part of the proof to the verifier.
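For intuition, here is a sketch of the special case $\mathbb{F}_0 = \{0, 1\}$, where the low-degree extension is the multilinear extension (individual degree $|\mathbb{F}_0| - 1 = 1$); the function name and the table representation are illustrative:

```python
from itertools import product

def multilinear_extension(x_table, z, p):
    """Evaluate at z in F_p^k the multilinear extension of x: {0,1}^k -> {0,1},
    given as a dict keyed by k-bit tuples:
        x^(z) = sum_b x(b) * prod_i (z_i b_i + (1 - z_i)(1 - b_i))."""
    total = 0
    for b in product([0, 1], repeat=len(z)):
        weight = 1
        for zi, bi in zip(z, b):
            weight = weight * (zi * bi + (1 - zi) * (1 - bi)) % p
        total = (total + x_table[b] * weight) % p
    return total

x_table = {(0, 0): 1, (0, 1): 0, (1, 0): 1, (1, 1): 1}
print(multilinear_extension(x_table, (1, 0), p=13))  # agrees with x on {0,1}^2
print(multilinear_extension(x_table, (5, 7), p=13))  # an extended evaluation
```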
Why will this help at all? Let's consider more concretely the task of the verifier. Suppose that the verifier uses the Reed-Muller code and randomly picks a new polynomial equation

$$\sum_{i, j} c_{ij} x_i x_j = \beta,$$

where we drop the degree-one terms and constants for simplicity of discussion, and split the coefficients of symmetric terms like $x_i x_j$ and $x_j x_i$. Let's now index the variables with coordinates from $\mathbb{F}_0^k$ instead and apply the low-degree extension $\hat{x}$ of the assignment $x$:

$$\sum_{i \in \mathbb{F}_0^k} \sum_{j \in \mathbb{F}_0^k} c(i, j)\, \hat{x}(i)\, \hat{x}(j) = \beta.$$

Note that $c$ is computed by the verifier and can be viewed as a function $c : \mathbb{F}_0^k \times \mathbb{F}_0^k \to \mathbb{F}$. The verifier can find its low-degree extension $\hat{c} : \mathbb{F}^{2k} \to \mathbb{F}$. Let's view the summand as a single function

$$h(i, j) = \hat{c}(i, j)\, \hat{x}(i)\, \hat{x}(j),$$

so that the evaluation of the equation becomes

$$\sum_{(i, j) \in \mathbb{F}_0^{2k}} h(i, j) = \beta.$$
Why will this transformation be helpful? We will soon discuss the SumCheck protocol, which can cleverly evaluate this sum with the help of the prover. Instead of substituting in all $|\mathbb{F}_0|^{2k}$ terms, the protocol checks each of the $2k$ coordinates one by one, resulting in query complexity on the order of $k \cdot |\mathbb{F}_0|$. Note that the individual degree of $h$ is bounded by $2(|\mathbb{F}_0| - 1)$, which is critical for this protocol.
SumCheck Protocol
Let's consider the SumCheck protocol more generally, and analyze its application to our particular problem at the end. We will first introduce the SumCheck protocol as an IP (interactive proof) protocol, in which the prover and verifier communicate back and forth.
The problem considers a polynomial function $h : \mathbb{F}^k \to \mathbb{F}$, where $h$ has a small individual degree $d$ in every variable. The verifier wants to check the value of

$$\sum_{z \in \mathbb{F}_0^k} h(z) = \beta,$$

where $\mathbb{F}_0 \subseteq \mathbb{F}$ is a subfield.
Besides our problem, a very useful scenario is when $h$ is the arithmetization of a Boolean formula $\phi$, $\mathbb{F}_0 = \{0, 1\}$, and $\mathbb{F} = \mathbb{F}_p$ for some large prime $p$. In this case, $d$ is on the order of the number of clauses, and the problem is asking for the number of satisfying assignments of $\phi$. Thus, with the SumCheck protocol, we can prove many results related to the complexity class IP.
The verifier will pick random elements $r_1, \dots, r_k \in \mathbb{F}$, and define the following functions for $i = 1, \dots, k$:

$$h_i(z) = \sum_{z_{i+1}, \dots, z_k \in \mathbb{F}_0} h(r_1, \dots, r_{i-1}, z, z_{i+1}, \dots, z_k).$$

We can make the following observations: $\sum_{z \in \mathbb{F}_0} h_1(z)$ is the sum we're looking for, $h_k(z) = h(r_1, \dots, r_{k-1}, z)$ is easily computable with a single query to $h$, and the recursive relation $h_i(r_i) = \sum_{z \in \mathbb{F}_0} h_{i+1}(z)$ motivates a recursive IP protocol as follows:
- The verifier receives the claimed value $\beta$ from the prover.
- The verifier receives a polynomial $\tilde{h}_1$ of degree at most $d$ (as its coefficients) from the prover, and checks that $\sum_{z \in \mathbb{F}_0} \tilde{h}_1(z) = \beta$.
- For $i = 2, \dots, k$, the verifier sends $r_{i-1}$ to the prover, the prover sends back a polynomial $\tilde{h}_i$, and the verifier checks that $\sum_{z \in \mathbb{F}_0} \tilde{h}_i(z) = \tilde{h}_{i-1}(r_{i-1})$.
- Finally, the verifier checks that $\tilde{h}_k(r_k) = h(r_1, \dots, r_k)$ with one query to $h$.
Now the protocol is clearly complete: the prover just needs to send the polynomials $h_i$ as defined. The number of random bits we need is $k \log |\mathbb{F}|$. Now the interesting part is the soundness, which calls into attention the requirement that $d$, the individual degree of $h$, is small (compared to $|\mathbb{F}|$). Assume that the verifier accepts at the last step. If $\tilde{h}_k \neq h_k$, the function $\tilde{h}_k - h_k$ is a non-zero polynomial of degree at most $d$ (we can directly reject if the prover sends a higher-degree polynomial). Thus, $r_k$ is a root of $\tilde{h}_k - h_k$ with probability at most $d/|\mathbb{F}|$. Then we can make inductive steps, assuming that $\tilde{h}_{i+1} = h_{i+1}$: by the same logic, the check at round $i$ passes while $\tilde{h}_i \neq h_i$ with probability at most $d/|\mathbb{F}|$. Finally, by the union bound, we falsely accept with probability at most $kd/|\mathbb{F}|$.
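Here is a minimal runnable sketch of the interactive protocol with an honest prover, taking $\mathbb{F}_0 = \{0,1\}$ and $\mathbb{F} = \mathbb{F}_p$; the round polynomials are sent as $d + 1$ evaluations rather than coefficients, and all names are illustrative assumptions:

```python
import random
from itertools import product

def lagrange_eval(points, r, p):
    """Evaluate at r the unique polynomial through the (t, value) points, over F_p."""
    total = 0
    for i, (ti, yi) in enumerate(points):
        num, den = 1, 1
        for j, (tj, _) in enumerate(points):
            if i != j:
                num = num * (r - tj) % p
                den = den * (ti - tj) % p
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

def sumcheck(h, k, d, p):
    """SumCheck for an oracle h: F_p^k -> F_p of individual degree <= d
    (assumes 1 <= d < p - 1), summing over {0,1}^k. Returns (claim, accepted)."""
    beta = sum(h(z) for z in product([0, 1], repeat=k)) % p  # honest prover's claim
    rs, prev_claim = [], beta
    for i in range(k):
        # Prover: send h_i as its evaluations at t = 0, ..., d.
        def h_i(t):
            rest = product([0, 1], repeat=k - i - 1)
            return sum(h(tuple(rs) + (t,) + tail) for tail in rest) % p
        points = [(t, h_i(t)) for t in range(d + 1)]
        # Verifier: check h_i(0) + h_i(1) against the previous claim.
        if (points[0][1] + points[1][1]) % p != prev_claim:
            return beta, False
        r = random.randrange(p)  # fresh verifier randomness
        prev_claim = lagrange_eval(points, r, p)
        rs.append(r)
    return beta, prev_claim == h(tuple(rs)) % p  # one final oracle query

# A toy run: h(z) = z0*z1 + z2 over F_101 has individual degree 1.
print(sumcheck(lambda z: (z[0] * z[1] + z[2]) % 101, k=3, d=1, p=101))
```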
Now, in the context of PCPs, the prover has to send a single (poly-size) proof instead of interactively communicating. Thus, the prover needs to write down the proof with respect to every possible trace of randomness. In effect, this is a tree of depth $k$, where each parent has $|\mathbb{F}|$ children. And for each node, the prover needs to write down the $d + 1$ coefficients of a polynomial of degree $d$, so the total length is $O(|\mathbb{F}|^k \cdot d)$. Finally, for each round, the verifier needs to query a polynomial of degree $d$, so the total number of queries is $O(kd)$.
Now back to the context of our problem. We have a sum over $\mathbb{F}_0^{2k}$ of a function $h$ with individual degree $d \leq 2(|\mathbb{F}_0| - 1)$. We will need to define the parameters for our low-degree extension as follows:

$$|\mathbb{F}_0| = O(\log n), \quad k = O\!\left(\frac{\log n}{\log \log n}\right), \quad |\mathbb{F}| = \mathrm{polylog}(n),$$

so that $|\mathbb{F}_0|^k \geq n$ and $|\mathbb{F}|$ is sufficiently larger than $kd$. Thus, for the SumCheck protocol, we will achieve constant soundness ($kd/|\mathbb{F}| = o(1)$), proof length $|\mathbb{F}|^{O(k)} = \mathrm{poly}(n)$, a number of queries $O(kd) = \mathrm{polylog}(n)$, and randomness $O(k \log |\mathbb{F}|) = O(\log n)$.
Finally, note that we cannot decide $|\mathbb{F}|$ arbitrarily. It is constrained by the code we used to amplify the equations. Recall that the Reed-Solomon code requires $|\mathbb{F}| > m$, and is thus not applicable in our scenario. Luckily, the Reed-Muller code, which only needs $|\mathbb{F}| = \mathrm{polylog}(m)$, suffices for our case.
Low-degree Testing
The final step to complete the proof is low-degree testing. We ask the prover for the low-degree extension of the assignment function, and we need to make sure that what we received is indeed (close to) a low-degree polynomial. We will skip the detailed algorithm and proof for this test, but provide the following fact, which motivates the algorithm.
Given a multivariate function $f : \mathbb{F}^k \to \mathbb{F}$ over $\mathbb{F} = \mathbb{F}_p$, where $p$ is a prime (and $d < p$), $f$ is a polynomial of total degree at most $d$ if and only if the restriction of $f$ to every line is a univariate polynomial of degree at most $d$ in the line parameter $t$, where a line is parametrized as

$$\ell_{x, y}(t) = x + t\, y, \quad x, y \in \mathbb{F}^k.$$
The test basically involves randomly sampling a line $\ell_{x,y}$ and testing for this local property.
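A sketch of one round of this test in Python, reusing `lagrange_eval` from the SumCheck sketch above (a hedged illustration of the local property, not the full algorithm or analysis):

```python
import random

def line_test(f, k, d, p):
    """One round of the low-degree test: restrict f to a random line x + t*y,
    interpolate a degree-d polynomial from t = 0, ..., d, and check that it
    predicts f at a fresh point on the same line."""
    x = [random.randrange(p) for _ in range(k)]
    y = [random.randrange(p) for _ in range(k)]
    line = lambda t: tuple((xi + t * yi) % p for xi, yi in zip(x, y))
    points = [(t, f(line(t))) for t in range(d + 1)]
    t_new = random.randrange(d + 1, p)  # a point not used for interpolation
    return lagrange_eval(points, t_new, p) == f(line(t_new))
```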
PCP Theorem via Proof Composition
Now that we've introduced the two weaker PCP theorems, we will sketch how to compose them to prove the full PCP theorem. In order to do that, we introduce two variants of PCPs, whose purposes will become explicit once we get to proof composition.
PCP of Proximity (PCPP)
We now present a relaxation of PCPs that only verifies that the input is close to an element of the language. The advantage of this relaxation is that it allows the possibility that the verifier reads only a small number of bits of the input. Actually, for greater generality, we will divide the input into two parts $(x, y)$, giving the verifier $x$ explicitly and $y$ as an oracle, and we only count the verifier's queries to the latter. Thus we consider languages consisting of pairs of strings. For a pair language $L$, we define $L(x) = \{y : (x, y) \in L\}$.
Definition
A pair language $L$ is in the complexity class $\mathrm{PCPP}_{c,s}[r, q]$ if there exists a polynomial-time probabilistic oracle Turing machine $V$ called the PCPP verifier such that
- Completeness: If $(x, y) \in L$, there exists a proof $\pi$ such that $\Pr[V^{(y, \pi)}(x) = 1] \geq c$
- Soundness: If $y$ is $\delta$-far from $L(x)$, for all proofs $\pi$, $\Pr[V^{(y, \pi)}(x) = 1] \leq s$
We introduce a new proximity parameter $\delta$: we say $y$ is $\delta$-far from $L(x)$ if $\Delta(y, L(x)) \geq \delta$, where the distance is defined as

$$\Delta(y, L(x)) = \begin{cases} \min_{y' \in L(x)} \Delta(y, y') & \text{if } L(x) \neq \emptyset, \\ 1 & \text{if } L(x) = \emptyset, \end{cases}$$

and $\Delta(y, y')$ denotes the relative Hamming distance.
On one hand, the soundness criterion of PCPPs is a relaxation of the soundness of PCPs. Whereas a PCP is required to reject (with high probability) every input that is not in the language, a PCPP is only required to reject input pairs in which the second element (i.e., $y$) is far from being suitable for the first element (i.e., $y$ is far from $L(x)$). That is, in a PCPP, nothing is required in the case that $y$ is close to $L(x)$ and yet $(x, y) \notin L$. On the other hand, the query complexity of a PCPP is measured more stringently, as it accounts also for the queries to the input part $y$ (on top of the standard queries to the proof $\pi$). This should be contrasted with a standard PCP, which has free access to all of its input and is only charged for access to the auxiliary proof.
Robust PCP
We now present a strengthening of the standard PCP soundness condition. Instead of asking that any proof (for an input not in the language) be rejected with high probability, we ask that the verifier's view of any such proof be far from an accepting view with high probability.
Definition
Given an input $x$ and a proof $\pi$, if we fix the random bits $R$, the verifier queries a set of locations $I(R)$ and sees a local view $\pi|_{I(R)}$. We define the set of accepting views as

$$\mathrm{ACC}(x, R) = \{ w : V \text{ accepts the local view } w \text{ on input } x \text{ and randomness } R \}.$$
A language $L$ is in the complexity class $\mathrm{RPCP}$ with robustness parameter $\rho$ if there exists a polynomial-time probabilistic oracle Turing machine $V$ called the RPCP verifier such that
- Completeness: If $x \in L$, there exists a proof $\pi$ such that $\Pr_R[\pi|_{I(R)} \in \mathrm{ACC}(x, R)] \geq c$
- Soundness: If $x \notin L$, for all proofs $\pi$, $\Pr_R[\Delta(\pi|_{I(R)}, \mathrm{ACC}(x, R)) < \rho] \leq s$
Here, the distance function $\Delta$ is defined similarly to the PCPP case. Note that a PCP with $q$ queries is an RPCP with $\rho = 1/q$, i.e., a vanilla PCP has robustness $1/q$: a rejected local view of length $q$ must differ from every accepting view in at least one of its $q$ positions.
Proof Composition
To carry out proof composition, we will use an outer RPCP, which has a small proof size, and an inner PCPP, which uses few queries. When the outer RPCP verifier determines the oracle positions $I(R)$ to query given random bits $R$, the inner PCPP verifier will verify that the selected local view is accepting, i.e., $\pi|_{I(R)} \in \mathrm{ACC}(x, R)$, while reading only a few of its bits. The new proof will be the concatenation of $\pi$ and an inner proof $\pi_R$ for every possible $R$. Note that we need the outer PCP to be robust, because the inner PCPP only has a soundness guarantee when $\pi|_{I(R)}$ is far from the accepting views.