Skip to content

The Reach of Dichotomy

July 22, 2021


Congratulations on the 2021 Gödel Prize

Composite including Richerby’s talk at STOC 2021

Andrey Bulatov, Martin Dyer and David Richerby, and Jin-Yi Cai and Xi Chen are the authors of three papers awarded the 2021 Gödel Prize. The citations by the ACM and EATCS say clearly that the papers not the people win the prize, but our “blog invariant” is to show photos of people at the top, so there they are.

Today we congratulate the people—not the papers—and tell some of the technical story behind this richly deserved award.

Dick and I have intended to write about this since the prize was announced in early May. Travel and personal events have slowed the blog, plus in my case, June was by far the most pressing month of the “chess cheating pandemic” since chess moved online, and that carried into July. The award to our friend Jin-Yi—who has been a faculty colleague of both of us—is the “something else about UW Madison on our mind” that was mentioned in the June 20 post, during which chess cases arrived in real time. I am thankful that this month, to judge by the weekly roundups of games by ChessBase and The Week in Chess, most major tournaments are back to being played in-person—where the yearly volume of cases coming to my attention has been on either side of ten, rather than in middle triple digits.

That the papers win the award calls something else to mind: We can no longer point to an “original manuscript” with papers being electronic. Perhaps the continued rise of non-fungible tokens (NFTs) will restore this distinction. Tim Berners-Lee recently created and auctioned off an NFT of his original code for the World Wide Web. This may be done to prize-winning papers in the future. Then the prize money could create an inherent base value, such as we discussed for “semi-fungible tokens” at the end of this post, so that the prize really would go to the papers.

The Beginning of Dichotomy

The term “dichotomy” in complexity theory first became prominent with Thomas Schaefer’s 1978 paper about {\mathrm{SAT}}-like problems. For a large and natural class {\mathcal{S}} of such problems, he proved that either the problem is {\mathsf{NP}}-complete like {\mathrm{SAT}} itself or it is in {\mathsf{P}}—never intermediate between them. The famous 1975 theorem of Richard Ladner showed that if {\mathsf{NP \neq P}} then plenty of intermediate languages exist, but no problem in {\mathcal{S}} can be among them. That is the dichotomy: everything in {\mathcal{S}} is either easy or maximally hard. A similar project was taken up for {\mathsf{\#P}} by Nadia Creignou and Nicolas (Miki) Hermann in a 1996 paper showing that every {\mathrm{\#SAT}}-like counting problem is either in {\mathsf{P}} or is {\mathsf{\#P}}-complete.

What does “{\mathrm{SAT}}-like” mean? Consider the clause {(\bar{x}_i \vee \bar{x}_j \vee x_k)}. It is satisfied if and only if the values of the variables belong to the relation

\displaystyle  R = \{(0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,1)\},

which excludes only the non-satisfying assignment {(1,1,0)}. Note that if we had the clause {(\bar{x}_i \vee x_j \vee \bar{x}_k)} instead, we could re-use {R} by switching {x_j} and {x_k} in the given sequence of variables. Putting negated literals first, if we instead want to require exactly one literal to be true we could apply one of the following relations to each clause, depending on how many negated variables it has:

\displaystyle  \begin{array}{rcl}  R^0 &=& \{(1,0,0),(0,1,0),(0,0,1)\}\\ R^1 &=& \{(0,0,0),(1,1,0),(1,0,1)\}\\ R^2 &=& \{(0,1,0),(1,0,0),(1,1,1)\}\\ R^3 &=& \{(0,1,1),(1,0,1),(1,1,0)\} \end{array}

Then we can say that both the “Exactly One 3SAT” decision problem in {\mathsf{NP}} and the associated counting problem in {\mathsf{\#P}} are defined by the collection {\Gamma = \{R^0,R^1,R^2,R^3\}} of relations. The following shows how they are subcases of the general constraint satisfaction problem (CSP):

  • Parameter: A set {\Gamma} of relations over some domain {D}.

  • Instance: A set {\Phi} of tuples {C_1,\dots,C_m} over variables {X = \{x_1,\dots,x_n\}}, with each {C_j} associated to some relation {R_j} in {\Gamma}.

  • Question: Is there an assignment {\alpha: X \rightarrow D} that makes {\alpha(C_j) \in R_j} for each {j}? For the counting version, called {\mathsf{\#CSP}}, how many such assignments are there?


Here {\alpha(C_j)} means the ordered tuple of values {\alpha(x_i)} of the variables {x_i} in {C_j}. If {\Gamma} and {D} are considered part of the instance—where they must be finite—the problem is uniform, but if they are fixed (and possibly infinite), it is nonuniform. For a simple analogy: for every fixed context-free grammar {G}, its language {L(G)} belongs to {\mathsf{P}}, but a much stronger fact is that the uniform acceptance problem {A_{CFG} = \{\langle G,x\rangle: x \in L(G)\}} also belongs to {\mathsf{P}}. Another relaxation is to allow different domains {D_i} for different variables {x_i}.

In the Boolean case, {D} is fixed to be {\{0,1\}}. We can now say what Schaefer and Creignou-Hermann proved. Schaefer showed that every nonuniform Boolean CSP with finite {\Gamma} is either in {\mathsf{P}} or {\mathsf{NP}}-complete. He showed the former holds only when {\Gamma} falls into one of six explicit cases, where which case is also decidable in polynomial time given {\Gamma}. Creignou and Hermann gave a similar result for Boolean #CSP but with a smaller list of cases—because there are cases where the decision problem is in {\mathsf{P}} but the counting problem remains {\mathsf{\#P}}-complete. A simple such case is where {\Gamma} has only the relation {R = \{(1,0),(0,1),(1,1)\}}, which defines monotone {\mathrm{2SAT}}.

Extending the Reach

Also in the 1990s, Tomás Feder and Moshe Vardi wrote two papers on extending Schaefer’s dichotomy to a wider class of problems. To quote the first sentence of the latter’s abstract:

This paper starts with the project of finding a large subclass of {\mathsf{NP}} which exhibits a dichotomy.

The first step was to go to larger domains {D}. For example, consider {D = \{1,2,3\}}. Take {R = \{(1,2),(2,3),(1,3)\}}, which is simply the relation {\neq} on {D}. Given an undirected graph {G}, its edges become the tuples. The resulting CSP is satisfiable if and only if {G} is 3-colorable. Of course, this is {\mathsf{NP}}-complete. Feder and Vardi arrived at this by going down from large subclasses of {\mathsf{NP}} defined by logic in which Ladner-type constructions can still be expressed, so that dichotomy does not hold. When they imposed all of their conditions, they were left with nonuniform CSPs, and conjectured dichotomy for their decision problems.

The corresponding question for counting problems was solved first, by Bulatov. Dyer and Richerby followed by simplifying Bulatov’s proof in a way that also allows deciding whether a given finite {(\Gamma,D)} falls into one of the polynomial-time cases (assuming {\mathsf{\#P \neq P}} to begin with). The latter task runs in polynomial time with an oracle for testing what they call strong balance, which in turn polynomial-time mapping-reduces to Graph Isomorphism—so by Laci Babai’s celebrated theorem they can decide in time quasi-polynomial in {|\Gamma|} whether it is a hard or easy case.

Finally—and not awarded any of this prize money—a 2017 paper by Bulatov proved dichotomy for the decision case of nonuniform CSPs. Independently, so did a paper by Dmitriy Zhuk. We will explore just a few of the underlying ideas in the next section.

\displaystyle  {\S}

Here we continue with the further extension made by Jin-Yi and Xi Chen. This starts by writing the counting problem as

\displaystyle  \#\Phi = \sum_{\alpha: X \rightarrow D} \prod_{j=1}^m w(\alpha(C_j),R_j), \ \ \ \ \ (1)

where for simple #CSP, {w(\alpha(C_j),R_j) = 1} if the values assigned by {\alpha} to the variables in {C_j} satisfy the associated relation {R_j}, and {w(\alpha(C_j),R_j) = 0} otherwise. However, one need not score an unsatisfied constraint as zero. Giving a nonzero partial credit {\lambda} for a wrong answer {\alpha(C_j)} to {R_j} might make you more magnetic as a teacher. It would literally be magnetism: the simple case where {\Gamma} consists of just the binary equality relation on {\{0,1\}}, so that for all binary clauses (edges) {(u,v)} we have

\displaystyle  w(0,0) = w(1,1) = 1, \qquad w(0,1) = w(1,0) = \lambda,

yields the simplest case of the Ising model of magnetism. But we can also take a quantum leap and make you complex: we can allow the weights {w(\alpha(C_j),R_j)} to have complex values. In particular, we can have functions of the form

\displaystyle  \sum_{\alpha: X \rightarrow D} \prod_{j=1}^m \omega^{g(\alpha(C_j),R_j)},

where {\omega} is a complex root of unity. Writing the product of powers of {\omega} as just {\omega} raised to a sum of terms ranging over all the clauses {C_j} in the instance {\Phi}, and inserting a normalizing constant {r}, we obtain functions of the form

\displaystyle  Z(f_\Phi) = r\sum_{\alpha: X \rightarrow D} \omega^{f_\Phi(\alpha)},

which are generally called partition functions. Indeed, we long ago discussed various efficient ways to convert quantum circuits {C} into polynomials {f_C(x_1,\dots,x_n)} so that, with {D = \{0,1\}} and {r} depending only on {C}, {Z(f_C)} gives the acceptance amplitude of {C}. The acceptance probability can be represented in like manner. See also this paper for another view of the bridge from the Ising model to quantum circuits.

Thus, the dichotomy theorems that Jin-Yi and Xi have obtained for complex weighted CSPs take matters into the quantum realm. What this may say about the class {\mathsf{BQP}} (bounded-error quantum polynomial time) we leave to the end.

Some of the Ideas

One key idea can be approached as generalizing a familiar feature of linear algebra. Suppose we take some number {\ell} of vectors of length {k} that belong to a linear subspace {S}. We can arrange them into an {\ell \times k} matrix {M}. Now let us multiply {M} on the left by a row vector {a = (a_1,\dots,a_\ell)}. Then {aM} is another vector of length {k}. This vector belongs to {S} because it is just a linear combination of our {\ell}-many vectors in {S}.

The insight for abstraction is that we can regard {a} as having been applied individually to each of the {k} columns of {M}. We got one new {k}-tuple from this application, but it still belonged to {S}. So now, let us consider “multiplying” {M} on the left by an arbitrary {\ell}-ary function {h(x_1,\dots,x_\ell)}, so that {hM} is always a {k}-tuple. And let us relax {S} to be any (finite) set {R \subseteq D^k} where {D} is our (finite) domain.

Definition 1 {R} has {h} as a polymorphism if for any {M} made from {\ell} members of {R}, allowing repeats, {hM} belongs to {R}. A set {\Gamma} of relations {R} has {h} as a polymorphism if every {R \in \Gamma} has {h} as a polymorphism.

With {\ell = 2}, the binary and function is a polymorphism of the relation {R} above that represented 3-clauses with two negated literals. This is because the excluded triple {(1,1,0)} cannot be written as the and of the other triples. This extends to the relation {R'} for satisfiability with three negated literals, which excludes {(1,1,1)}, so and is a polymorphism of {\Gamma = \{R,R'\}}, and also for {k > 3} to pertain to Horn clauses in general. Dually, or is a polymorphism for satisfiability of dual-Horn clauses.

The case of {\mathrm{2SAT}} is trickier. Its {\Gamma} has three constituent relations, depending on the number of negated literals:

\displaystyle  \begin{array}{rcl}  R^0 &=& \{(1,0),(0,1),(1,1)\}\\ R^1 &=& \{(0,0),(0,1),(1,1)\}\\ R^2 &=& \{(0,0),(0,1),(1,0)\}. \end{array}

Is there an {h} that is simultaneously a polymorphism of each one? Neither and nor or works: the former fails on {R^0} because the component-wise and of {(1,0)} and {(0,1)} is the excluded {(0,0)}, and or similarly fails on {R^2}. In fact, none of the sixteen binary Boolean functions is a homomorphism of all three. But when we go up to {\ell = 3}, however, it turns out that the majority-of-3 function is a polymorphism: any duplicated tuple rules, so only the three instances of three separate tuples need to be checked.

Going back to the linear dependence idea, the ternary function {h(x_1,x_2,x_3) = x_1 + x_2 + x_3} is a polymorphism of affine relations. For {D = \{0,1\}} where {+} is the same as exclusive-or, it has the property that {h(x,y,y) = h(y,y,x) = x}. This property is named for Anatoly Maltsev, who used it long before any of this complexity work began. It turned out to be a key to demarcating the easy cases for the dichotomy in higher dimensions. Affine relations were already the lone easy case in Creignou-Hermann for Boolean #CSP, while we can now crisply state Schaefer’s original dichotomy for decision CSP: affine, majority-of-3, and, or, plus two trivial cases where the all-zero or all-one assignment always works, are the polymorphisms that characterize the SAT-like problems in {\mathsf{P}}.

A second basic idea is to re-cast the counting problems as ones of counting homomorphisms between relational structures. For instance, let {G} be any graph and {K_r} the complete graph on {r} nodes. A homomorphism {h} has the property that when {(u,v)} is an edge of {G}, {(h(u),h(v)} is an edge of {K_r}—in particular, {h(u) \neq h(v)}. This is possible if and only if {G} is {r}-colorable. This approach was greatly enhanced in a 1998 paper by Peter Jeavons and its follow-ups with other authors. It was particularly useful to the decidability part of Dyer and Richerby’s paper.

A third idea used by Cai and Chen originated in a 2010 paper of theirs with Pinyan Lu for the case of non-negative weights. It is to stratify the sum in (1) over all {t \leq n} by mapping

\displaystyle  f(a_1,\dots,a_t) = \sum_{\alpha: \alpha(x_1)=a_1,\dots,\alpha(x_t)=a_t} \;\;\; \prod_{j=1}^m w(\alpha(C_j),R_j).

With {t = n} this is just a single evaluation; with {t=0} this is the original function. They create an oracle that either evaluates the whole sum or allows progressing from {t-1} to {t}. They create a matrix {M} with {d^{t-1}} rows for the possible fixed parts {\vec{a} = (a_1,\dots,a_{t-1})} and {d} columns for the possible choices of {a_t = \alpha(x_t)}, where {d = |D|}. The entries are {M[\vec{a},c] = f(a_1,\dots,a_{t-1},c)}, and {M[\vec{a},\cdot]} stands for the row vector over all {c \in D}. The oracle, given {\vec{a}}, returns the multiple of {M[\vec{a},\cdot]} that normalizes the first non-zero entry to {1}, or the all-zero vector if {M[\vec{a},\cdot]} is all zero.

A big extra challenge in the complex-weights case is how to handle cancellations, as begun in this paper by Cai, Chen, and Lu. Implementing the oracle efficiently requires the Maltsev property, a kind of “proto-unitarity” condition on {M} (under which {M} has at most {d} pairwise linearly independent rows and every two such rows are orthogonal), and a third property they call the “Type Partition” condition to avert blowup in the number of queries. Whether the classification they get is decidable remains open.

Here is where we say to go to the papers for details, but we hope we have expressed ideas and their motivation as a guide. We have a little more to say about interpretation and making further progress on the dichotomy.


Taking Dichotomy Further


Progress on the counting dichotomy increases the “internal evidence” for {\mathsf{\#P \neq P}}, which of course is implied by {\mathsf{NP \neq P}}. To paraphrase how our blogging friends Scott and Lance and Bill have often put it:

Why would all this finely filigreed structure with sharp demarcation between hard and easy cases exist if {\mathsf{P \neq NP}} were a false illusion?

At the very least, this puts the onus on a believer in {\mathsf{P = NP}} to give an alternative explanation of how such structure could arise.

It strikes us, however, that this also puts a squeeze on {\mathsf{BQP}} under the conventional wisdom that it is intermediate between {\mathsf{P}} and {\mathsf{\#P}}, without being {\mathsf{NP}}-hard in particular. This is not in technical conflict with the Cai-Chen dichotomy because exactly computing the acceptance probabilities of a class of quantum circuits (on inputs built into the circuits) can be {\mathsf{\#P}}-complete while the circuits use only a gap in probabilities to solve problems that are easy or intermediate. But we suspect that this still limits possible ways to capture {\mathsf{BQP}} “naturally” by descriptive logic.

The prize paper with Chen is just one part of a long-term project by Jin-Yi and co-workers on classifying counting problems and showing where dichotomy applies. A 2010 paper of Jin-Yi with Lu and Sangxia Huang bridged from #CSP and a subcase of counting homomorphisms between graphs to a wider kind of sum-over-products formula originally called a holant here. Recent progress on dichotomy for holants is represented by this FOCS 2020 paper with Shuai Shao. This is also both directly relevant to quantum computation and maps out progress that is still out there to make. The paper has a one-page prologue that summons characters from legendary fantasy (and Star Wars) to conquer quantum issues.

Open Problems

Again, we congratulate the winners—the human winners. What will the full implications of this work become in the years ahead?


[some minor word changes]

4 Comments leave one →
  1. Breakout permalink
    August 5, 2021 4:13 pm

    One could also change the rules slightly for each game on a random basis (e.g. one could expand the board, either in 2D or going to 3+ dimensions), which would mess up most chess-playing programs that have been trained on large historical game databases. Changing the rules devalues built up experience relative to strategy and tactics. Of course, then the game is no longer chess per se — it becomes some species of what used to be called “fairy chess.”

    • August 5, 2021 11:15 pm

      I think you meant to comment on the current “Turning the Tables” post—? I am wondering if WordPress has posts tangled up, because the current post should have generated more trackbacks needing approval to earlier posts—but only did to this one.

      Conventional wisdom on your suggestion is that it would benefit chess engines, which until recently did not require training on human games, just evaluation-based search.

Trackbacks

  1. Turning the Tables on Cheating? | Gödel's Lost Letter and P=NP
  2. Is P=NP a Grave Matter? | Gödel's Lost Letter and P=NP

Leave a Reply to KWReganCancel reply

Discover more from Gödel's Lost Letter and P=NP

Subscribe now to keep reading and get access to the full archive.

Continue reading