Why Is 290 Special?
How exceptions in theorems may affect their complexity
Cropped from India Today source
Manjul Bhargava is a mathematician who just won one of the 2014 Fields Medals. We offer our congratulations on this achievement. He is an expert in number theory, which makes him special to us among Fields medalists. His Fields citation includes his doctoral work on a powerful reformulation and extension of Carl Gauss’s composition law for quadratic forms. He also proved a sense in which 290 is special to us among numbers, since we have been thinking recently about quadratic forms as tools in complexity theory.
Today we talk about his “290 Theorem” with Jonathan Hanke, which is quite accessible, and also raise complexity-related questions about this result.
One of his papers is in the American Math Monthly. It is on the factorial function and its generalizations. We always applaud when a pathbreaker writes a popular survey. It is hard to imagine a more concrete subject and a more accessible journal.
A sub-surface question in computational complexity is: how much difference can finite changes make? With asymptotic time or space bounds, finite changes make zero difference—finite sets have zero asymptotic complexity. However, for measures of concrete complexity, such as the size of proofs and the information content of strings, even small exceptions may make a difference.
Quadratic Forms and Universality
Why is $290$ special? Well the answer starts with, why is $15$ special? Call a quadratic form $F(x_1,\dots,x_n)$ in integer variables universal if it represents all positive integers, that is, if for every $m \geq 1$ there are integers $a_1,\dots,a_n$ such that $F(a_1,\dots,a_n) = m$.
- We can view the ancient result that $\sqrt{2}$ is irrational as saying that the quadratic form $x^2 - 2y^2$ is not universal—it fails of course to represent $0$ on nonzero arguments, among many other integers.
- The function $F(x,y) = xy$ is universal since we can have $y = 1$ and $x$ anything. However, it “cheats” by violating a historical essence of being “quadratic,” which is that all nonzero arguments should map to positive values.
- The sum of three squares $x^2 + y^2 + z^2$ does not cheat, but it is not universal either: it fails to represent $7$.
To find a universal form that doesn’t cheat requires $n \geq 4$ variables. The fact that

$x_1^2 + x_2^2 + x_3^2 + x_4^2$

is universal is Joseph Lagrange’s famous Four-Square Theorem, whose implications in complexity theory we have already noted.
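Lagrange’s theorem is easy to spot-check empirically. Here is a minimal brute-force sketch in Python—nothing like an efficient algorithm, just a witness search (the helper name `four_squares` is ours):

```python
# Brute-force witness search for Lagrange's Four-Square Theorem:
# every positive integer m equals a^2 + b^2 + c^2 + d^2 for integers a,b,c,d.
from itertools import product

def four_squares(m):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == m, or None."""
    bound = int(m ** 0.5) + 1          # each square is at most m
    for quad in product(range(bound), repeat=4):
        if sum(x * x for x in quad) == m:
            return quad
    return None

# Every m up to 200 has a witness, as the theorem guarantees.
assert all(four_squares(m) is not None for m in range(1, 201))
print(four_squares(290))  # → (0, 0, 1, 17)
```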
Indeed, Jeffrey Shallit and Michael Rabin proved that this form is effectively universal in the sense that given any integer $m \geq 1$, we can find in random polynomial time integers $a, b, c, d$ so that

$a^2 + b^2 + c^2 + d^2 = m.$
Our complexity problems here are perhaps related but different: given a form $F$, how easy is it to test whether $F$ is universal, and how easy is it to prove the theorem on which the test is based?
We need to define the allowed range of forms carefully. Every $F$ can be specified by an $n \times n$ matrix $A$ such that for all $x = (x_1, \dots, x_n)$,

$F(x) = x^T A x.$
If we were using non-commutative algebra where $x_i x_j \neq x_j x_i$, then $A$ would be unique, but now the coefficient of $x_i x_j$ can be split any way between the entries $a_{ij}$ and $a_{ji}$. The convention is to split it evenly, so that $A$ is symmetric, and once again unique. Then the condition that $F(x) > 0$ for $x \neq 0$ is the same as $A$ being positive definite, and often this is assumed when talking about quadratic forms.
In Lagrange’s case, $A$ is the $4 \times 4$ identity matrix. Diagonal matrices are positive definite if and only if all the diagonal entries are positive, but for symmetric real matrices in general things are trickier. The matrix $\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$ for $x^2 - 2xy + 2y^2 = (x-y)^2 + y^2$ is positive definite despite the two $-1$ entries, while $\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ for $x^2 + 4xy + y^2$ is another “cheater” despite all-positive entries. In the case $F(x,y) = x^2 + xy + y^2$, $A$ is positive definite but has half-integer entries:

$A = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}.$
Then $F$ is said to be integer-valued but not classically integral. For much of two centuries, such an $F$ was also thought to “cheat,” so that only forms with even cross-term coefficients were distinguished, but Bhargava’s work has helped solidify the looser “twos-out” condition as the standard.
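For symmetric $2 \times 2$ matrices, positive definiteness reduces to Sylvester’s criterion: positive upper-left entry and positive determinant. The following sketch checks one illustrative matrix of each behavior described above; the particular examples are our own, chosen to fit the descriptions:

```python
# Sylvester's criterion for a symmetric 2x2 matrix [[a, b], [b, c]]:
# positive definite iff a > 0 and the determinant a*c - b*b > 0.
def is_pos_def(a, b, c):
    return a > 0 and a * c - b * b > 0

# x^2 - 2xy + 2y^2 = (x - y)^2 + y^2: positive definite despite negative entries
assert is_pos_def(1, -1, 2)

# x^2 + 4xy + y^2: all-positive entries, yet indefinite -- F(1, -1) = -2
assert not is_pos_def(1, 2, 1)

# x^2 + xy + y^2: positive definite, with half-integer off-diagonal entries
assert is_pos_def(1, 0.5, 1)
print("criterion checks pass")
```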
15 and 290
Obviously, universal forms are interesting. In 1993, John Conway and William Schneeberger proved that if a positive-definite quadratic form with integer matrix represents all positive integers up to $15$, then it represents all positive integers. Pretty neat. This makes checking whether such a form is universal a relatively simple task. The proof was not published and was quite intricate. This result is naturally called the 15 Theorem.
Bhargava found a simpler proof in 2000, which was hailed by Conway himself. He followed this up by proving Conway and Schneeberger’s conjecture that one could relax to integer-valued forms upon replacing $15$ by $290$: if a form represents all positive integers up to $290$, then it is universal. Precisely stated:
Theorem 1 For any integer-valued positive-definite quadratic form $F$, if $F(\mathbb{Z}^n)$ includes

$\{1, 2, 3, 5, 6, 7, 10, 13, 14, 15, 17, 19, 21, 22, 23, 26, 29, 30, 31, 34, 35, 37, 42, 58, 93, 110, 145, 203, 290\},$

then $F(\mathbb{Z}^n)$ includes every positive integer. Moreover, this set of twenty-nine numbers is minimal—remove any one and the statement becomes false.
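Granting the theorem, universality of a form becomes a finite computation: check the twenty-nine critical numbers. A sketch for diagonal forms, with Lagrange’s four-square form as the test case (the helper `represents` and the brute-force search are ours):

```python
# Universality test via the 290 Theorem: an integer-valued positive-definite
# form is universal iff it represents the twenty-nine critical numbers.
from itertools import product

CRITICAL = [1, 2, 3, 5, 6, 7, 10, 13, 14, 15, 17, 19, 21, 22, 23, 26, 29,
            30, 31, 34, 35, 37, 42, 58, 93, 110, 145, 203, 290]

def represents(coeffs, m):
    """Does the diagonal form sum(c * x_i^2) represent m? (brute force)"""
    bound = int(m ** 0.5) + 1
    return any(sum(c * x * x for c, x in zip(coeffs, xs)) == m
               for xs in product(range(bound), repeat=len(coeffs)))

# Lagrange's form x1^2 + x2^2 + x3^2 + x4^2 passes, hence is universal.
assert all(represents((1, 1, 1, 1), m) for m in CRITICAL)

# The sum of three squares misses 7, so it fails the test immediately.
assert not represents((1, 1, 1), 7)
print("universality test passed for four squares")
```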
Four Theorem Types
This last theorem alternates senses of being easy and complex. Its statement is complex—where do those numbers come from? However, having only twenty-nine numbers to check makes it easy to prove that a given form is universal—just produce arguments yielding each number as a value. Yet the problem of finding such arguments, especially given an arbitrary $F$, remains possibly complex—can there be any kind of extended Rabin-Shallit theorem?
What we fix on, however, is what the nature of the statement says about the complexity of the proof. This raises the question of a measure of complexity of theorems. Our intent can be approached by considering four types of theorems:
Type I. Every $x$ of type $T$ has property $P$.
Type II. Every $x$ of type $T$ has property $P$, except possibly for $x$ belonging to this fixed finite set of values: $S$.
Type III. If every member of $S$ has property $P$, then every $x$ of type $T$ has property $P$.
Type IV. A theorem about objects $y$ in which a test of kind II or III is part of the statement.
To interpret this, think of $x$ as a positive integer, fix a form $F$, and read $P(x)$ as “$x \in F(\mathbb{Z}^n)$.” Then for the particular form $F$, statement I is “$F$ is universal,” II is “$F$ is almost universal” (in a sense we haven’t discussed but that is clear), and III is an instance of Bhargava’s theorem for the particular $F$ (when $S$ is the set of twenty-nine numbers, that is). Then Bhargava’s theorem itself is type IV, where the objects $y$ are forms $F$, and all it does is universally quantify the assertion of type III over those $F$.
Note that in types II and III, the statement remains true if $S$ is replaced by a larger finite set $S'$. We intend this also in type IV—that is, the theorem is monotone in the test. So define a number $s$ to be special if there is an $S$ such that:
- the theorem is true for $S$,
- $s$ is the largest number in $S$, and
- the theorem is false for $S \setminus \{s\}$.
In types II and III it is enough to specify $S$; in IV we also give the theorem statement about the $y$’s.
Some Examples
Here are some examples of all these types:
Theorem 2. Every odd-order group is solvable. (Type I)
Theorem 3. Every finite group that is not cyclic or alternating or of certain Lie types is not simple, unless it is one of 26 so-called sporadic simple groups. (Type II)
Theorem 4. Every Fibonacci number has a prime factor that does not divide any earlier Fibonacci number, except for $F_1 = F_2 = 1$, $F_6 = 8$, and $F_{12} = 144$. (Also type II)
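Theorem 4 can be spot-checked directly. A small sketch that recomputes the exceptional indices among the first thirty Fibonacci numbers (the trial-division helper is ours):

```python
# Spot-check of the primitive-prime-divisor property of Fibonacci numbers:
# F_n should have a prime factor dividing no earlier F_k, except n = 1, 2, 6, 12.

def prime_factors(n):
    """Set of prime factors of n by trial division (empty for n = 1)."""
    fs, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            fs.add(p)
            n //= p
        p += 1
    if n > 1:
        fs.add(n)
    return fs

fib = [0, 1]
while len(fib) < 31:
    fib.append(fib[-1] + fib[-2])

seen, exceptions = set(), []
for n in range(1, 31):
    fs = prime_factors(fib[n])
    if not (fs - seen):          # no primitive prime factor
        exceptions.append(n)
    seen |= fs
print(exceptions)  # → [1, 2, 6, 12]
```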
A statement of type II logically implies one of type III, but in a trivial sense. For instance, let $P(n)$ be the assertion that the (topological generalized) Poincaré conjecture holds in dimension $n$. Suppose we went back in time before Michael Freedman won his Fields Medal for proving $P(4)$ in 1982, let alone Grigori Perelman for $P(3)$, but after Steve Smale won his in 1966 largely for proving $P(n)$ for $n \geq 5$. Then with $S = \{3, 4\}$ we had a theorem of type II, and also type III in the form “if Poincaré is true for $n = 3, 4$ then it is true for all $n$.” But the implication generally does not go the other way, and since the Poincaré conjecture is true in all dimensions, there was no special number.
Our point is that a theorem of type I seems easier to understand than one of type II or III, let alone IV. Hence a theorem of type I might have a clearer proof than one of type II. None of this intuition is solid, but it does seem reasonable. The presence of $S$ gives the statements other than type I higher information complexity. Our questions are whether the freedom to enlarge $S$ mitigates the increase in complexity compared to type I, whether this might allow easier proofs of the resulting statements, and how much harder it is to prove special numbers—when $S$ must be minimal.
Complexity
For now we only have a few sketchy ideas on these questions, involving the notion of the Kolmogorov complexity $K(x)$ of a string $x$. This is named for Andrey Kolmogorov, but various related notions were discovered at the same time by others. Here it is enough to say that $K(x)$ is the length of the shortest program that generates $x$. The notion extends to define $K(S)$ for finite sets $S$.
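$K$ itself is uncomputable, but any fixed compressor gives an upper bound on it, since a decompressor plus compressed data is a program generating $x$. A toy Python illustration, with zlib standing in for “shortest program” (the particular strings are our own):

```python
# Any fixed compressor yields K(x) <= len(compressed(x)) + O(1):
# the decompressor plus the compressed data is a program that prints x.
# A highly structured string compresses far below a typical random one.
import random
import zlib

structured = b"01" * 500                                  # 1000 bytes, very regular
random.seed(42)
rand = bytes(random.randrange(256) for _ in range(1000))  # 1000 "random" bytes

print(len(zlib.compress(structured)))   # tiny: the regularity is exploited
print(len(zlib.compress(rand)))         # near 1000: no structure to exploit
```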
Can we prove that certain theorems of types II–IV cannot be proved in a fixed formal system? Our intent is to argue like this: Suppose that we could prove the statement

$(\forall x)[x \in S \vee P(x)],$

where $S$ is a given finite set, taken minimal in the sense that the statement fails for every proper subset of $S$. If $S$ itself has large complexity, then this might lead to a contradiction.
A simple result can be proved. Let $|\phi|$ denote the length of formula $\phi$ as a string, and let ZF be the usual set theory—or any reasonable theory, provided its set of theorems is r.e.
Theorem 5. Suppose that the statement $\phi$ of the form

$(\forall x)[x \in S \vee P(x)],$

where $S$ is a given finite set, is provable in ZF. Then there is an absolute constant $c$ that depends only on ZF so that

$K(S) \leq |\phi| + c.$
Proof: Let a machine search the proofs of ZF for a proof of the statement $\phi$. It will eventually find one. This will give us the set $S$, and so we have a description of $S$ of size at most $|\phi|$ plus a constant that depends on the machine that does the search.
Since we can get a bound on $K(S)$, we may be able to argue that certain concrete properties cannot be proved to fail exactly on sets that are finite but too high in complexity. This would be great if we could actually do something like this. We would like even better to remove the dependence on $|\phi|$, and obtain results of this kind:
There are only finitely many numbers $s$ that ZF can prove to be special.
Although $K(s) \leq K(S) + O(1)$ when $s$ is the largest element in $S$, we don’t get $K(S) \leq K(s) + O(1)$ by imitating the above proof, because the machine could output an infinite sequence of $S$’s that go with different $\phi$’s. It isn’t a contradiction because something like $|\phi|$ or the index of $S$ in this sequence needs to be given as well to specify $S$, and this need not be bounded by any function of $s$. One can craft artificial $\phi$’s to falsify it. But in specific cases, or in the presence of restrictions on $\phi$, this might yield (or at least suggest) interesting results.
For instance, we’ve noted that homotopy groups are finite and computable, so that the sequence of their structure must have Kolmogorov complexity at most a value that we can determine. We need not develop a fast algorithm to compute them, but just a short one. This proves that while the intuitive complexity of the structure of homotopy groups seems high, we can bound their K-complexity. Having tight bounds would be interesting, and then might help analyze what kinds of relaxations of theorems make them easier to prove.
Open Problems
We have raised a diverse array of complexity-related questions. Can more light be shed on them? Is there any kind of global limit on special numbers?
Bhargava’s Monthly paper on the factorial is essentially his undergraduate senior thesis.