Polynomial Prestidigitation
How some longstanding open problems were made to disappear
Ernie Croot, Vsevolod Lev, and Péter Pach (CLP) found a new application of polynomials last month. They proved that every set of size at least has three distinct elements such that . Jordan Ellenberg and Dion Gijswijt extended this to for prime powers . Previous bounds had the form at best. Our friend Gil Kalai and others observed impacts on other mathematical problems including conjectures about sizes of sunflowers.
Today we congratulate them—Croot is a colleague of Dick’s in Mathematics at Georgia Tech—and wonder what the breakthroughs involving polynomials might mean for complexity theory.
What’s amazing is that the above papers are so short, including a new advance by Ellenberg that is just 2 pages. In his own post on the results, Tim Gowers muses:
[The CLP argument presents a stiff challenge to my view that] mathematical ideas always result from a fairly systematic process—and that the opposite impression, that some ideas are incredible bolts from the blue that require “genius” or “sudden inspiration” to find, is an illusion that results from the way mathematicians present their proofs after they have discovered them. …[T]he argument has a magic quality that leaves one wondering how on earth anybody thought of it.
We don’t know if we can explain the source of the ‘magic’ but we will try to describe it in a way that might help apply it.
Rank and Degree
At top level there is no more sleight-of-hand than a simple trick about matrix rank. We discussed ideas of rank some time ago.
If a matrix is a sum of matrices each of rank at most , then any condition that would force to have rank must be false.
A simple case is where the condition zeroes every off-diagonal element of . Then the main diagonal can have at most nonzero entries. This actually gets applied in the papers. The fact that column rank equals row rank also helps for intuition, as Peter Cameron remarks.
A second trick might be called “degree-halving”: Suppose you have a polynomial of degree . Even if is irreducible, might be approximated or at least “subsumed” term-wise by a degree- product . When is multi-linear, or at least of bounded degree in each variable—call this —we may get where is close to .
In any case, at least one of must have degree at most , say . If we can treat and its variables as parameters, maybe even substitute them by well-chosen constants, then we are down to of degree . Then is a sum of terms each having a monomial of total degree in variables each with power at most .
The number of such monomials is relatively small. This limits the dimension of spaces spanned by such , which may in turn connect to the bound above and/or limit the size of exceptional subsets of the whole space . We discussed Roman Smolensky’s famous use of the degree-halving trick in circuit complexity here.
Polynomial Leverage
These tricks of linear algebra and degree are all very well, but how can we use them to attack our problem? We want to bound the size of subsets having no element such that for some nonzero , , , and all belong to . This is equivalent to having no three elements such that . This means that the following two subsets of are disjoint:
How can we use polynomials to gain leverage on this? The insight may look too trivial to matter:
Any polynomial supported only on must vanish on .
Let be the complement of and let be the space of polynomials vanishing on that belong to our set . We can lower-bound the size of by observing that the evaluation map from to the graph of its values on is a linear transformation. Its image has size at most , and since is the kernel, we have , so .
Well, this is useless unless , but is the complement of which is no bigger than the set we are trying to upper-bound. So it is useless—unless is pretty big. So we need to choose —and maybe —to be not so low. We can do this, but how can this lower bound on help? We need a “clashing” upper bound. This is where the presto observation by CLP came in.
Magic Box
Given the set , make a matrix whose entry in row , column , is . In APL notation this is the “ outerproduct” of with itself. Its diagonal is and the rest is .
Now apply to every entry to get a matrix . By every off-diagonal entry vanishes, so is a diagonal matrix. Its rank is hence the number of nonzero diagonal entries. If we can upper-bound , then we can upper-bound by the hoc-est-corpus rubric of description complexity:
Every can be described by its up-to- nonzero values on , so there are at most of them.
The papers use bounds on the dimension of in place of description complexity, but this is enough to see how to get some kind of upper bound. Since , taking logs base gives us:
It remains to bound , but it seems to take X-ray vision just to see that a bound can give us anything nontrivial. OK, any fixed bound on makes the right-hand side only which yields a contradiction, so there is hope. The rank trick combines with degree-halving to pull a bound involving out of the hat. Here is the version by Ellenberg and Gijswijt where the nonce choice suffices and the coefficients on in are replaced by a general triple such that :
Lemma 1 With , , and as above, put to be the set of polynomials in that vanish on . Then for all there are at most values for which
Proof: Let and be vectors of variables, and write
where each coefficient is in and the sum is over pairs of monomials whose product has degree at most and at most in any variable. Collect the terms in which has total degree at most separately from those where does, so that we get
where each and is an arbitrary function and now the sum is over monomials of total degree at most (and still no more than in any variable if we care). Now look at the matrix whose entry is , including the diagonal where . We have
This is a sum of at most single-entry matrices, so the rank of is at most that. Since is a diagonal matrix and makes , there are at most nonzero values over .
Sawing the degree in half stacks up against . We retain freedom to choose (and possibly ) to advantage. There are still considerable numerical details needed to ensure this works and tweaks to tighten bounds—for which we refer to the papers—but we have shown the “Pledge,” the “Turn,” and the “Prestige” of the argument.
Open Problems
Can you find more applications of the polynomial technique besides those enumerated in the papers and posts we have linked? For circuit complexity we’d not only like to go from back to as CLP have it, but also get results for when is not a prime power. Can we make assumptions (for sake of contradiction) that create situations with higher “leverage” than merely being disjoint?
[changed subtitle; linked hoc-est-corpus which literally means, “here is the body”; deleted and changed remarks before “Open Problems”; inserted tighter sum into description complexity formula.]
Can you explain the last sentence before “Open Problems”, about needing full generality? I can’t work out what it means: what can you do for general that you can’t do for ?
It was based on E+G’s remarks about the case but I see it’s not needed as a qualifier. Thanks.
Another amazing result that follows from these techniques for one of my favorite problems: http://arxiv.org/pdf/1606.01230v2.pdf. Fox and LM Lovasz improve the bounds for the arithmetic triangle removal lemma dramatically, from a tower of two’s to polynomial!
Finally someone mentions Smolensky! His proof was the first thing that came to my mind too after reading the CLP paper.