The More Variables, the Better?
Ideas from algebraic geometry and arithmetic complexity
Hyman Bass is a professor of both mathematics and mathematics education at the University of Michigan, after a long and storied career at Columbia University. He was one of the first generation of mathematicians to investigate K-theory, and gave what is now the recognized definition of the first generation of K-groups, that is for a ring . He co-founded with Jean-Pierre Serre a theory of graphs of groups, that is mappings associating a group to each vertex and edge of a graph. He is a past President of the American Mathematical Society—and exemplifies the idea that the AMS promotes mathematics at all levels: since 1996 he has worked on on elementary school math education with his Michigan colleague Deborah Ball.
Today Ken and I wish to discuss an idea used in a paper on the Jacobian Conjecture by Bass with Edwin Connell and David Wright.
One of the most powerful methods we use in theory is often the reduction of the dimension of a problem. The famous JL theorem and its relatives show us that often problems in high-dimensional Euclidean spaces can be reduced to much lower dimensional problems, with little error. This method can and has been used in many areas of theory to solve a variety of problems.
The idea used by Bass and many others is the opposite. Now we are interested in lifting a problem from a space to a much higher space. The critical intuition is that by moving your problem to a higher space there is more “room” to navigate, and this extra “space” allows operations to be performed that would not have been possible in the original lower-dimensional space. An easily quoted example is that the Poincaré property was proved relatively easily for spaces of dimension and higher, then for in the 1980’s, and finally for by Grigory Perelman drawing on Richard Hamilton.
Adding Variables
Let be a polynomial map from to where
The Jacobian Conjecture (JC), which we have covered several times before, indeed recently, studies when such maps are injective. The trouble with the JC is quite simply that such maps can be very complex in their structure, which explains the reason JC remains open after over 70 years.
A very simple idea, almost trivial idea, is to replace by which is defined by
for new variables . For example if then could be
The new variables are used in a “trivial” way. One reason the method is useful for the JC conjecture is that is injective if and only if is injective. This is trivial: the new variables do not interact in any manner with the original variables, and so the map is injective precisely when is.
Why is this of any use? Clearly is really just with an identity function on the variables “tacked on” to it. How can this help?
Using The Extra Variables
The answer is that we can use the extra variables to modify the polynomial so that it looks very different from . Suppose that we replace by where is a nice polynomial map. We may be able to vastly change the structure of and get a that is much simpler on some measure. The hope is that this restructuring will allow us to prove something about that then implies something about . This is exactly what happens in the study of the JC conjecture.
Here is a famous result from the Bass-Connell-Wright paper:
Theorem 1 Given there is an automorphism so that has degree at most three. Moreover is injective if and only if is.
This allows the JC question to be reduced to the study of cubic maps. Of course such low degree maps still are complex in their behavior, but the hope is that while they are on many more variables the restrict on their degree many make their structure easier to understand. This has yet to be fruitful—the JC remains open. The following quotation from the paper explains the idea:
General Use In Theory?
One idea is to try and use this stabilization philosophy to attack problems about polynomials that arise in complexity theory. For example can we use the addition of extra variables to attack the power of polynomial modulo composite numbers? Suppose that is a polynomial of many variables that modulo computes something interesting. What if we add extra variables as above, then rearrange the resulting polynomial to have a “better” structure? We must preserve not its injectivity, but for us its ability to compute something. If we can do this, then perhaps we can use the use the extra variables to obtain a lower bound.
Two simple examples from arithmetical complexity are aliasing variables in a formula to make it read-once, and the Derivative Lemma proved by Walter Baur and Volker Strassen. In the latter, the idea is to take a certain gate of a circuit computing a function , and regard it as a new variable in a circuit . The circuit computes a function of variables such that
In the Derivative Lemma the gate is chosen to involve one-or-two input variables , but the idea can be extended to other cases. When the construction is applied recursively we generate circuits in which the number of variables becomes higher and higher, but the circuits themselves become flatter and flatter, until only a “stardust” of many single-variable functions is left.
Open Problems
Are there general rules of thumb on when you should increase versus decrease the dimension? Or when to induct on the output gate(s) of a circuit, on the number of variables going down, or on the number of variables going up?
One example is extended formulations of polytopes. Sometimes optimizing over a complicated polytope Q can be done efficiently if one finds a higher dimensional polytope R that has few facets, but projects down to Q. Examples of such polytopes are the spanning tree polytope, or the permutahedron.
This brings up once again the problem of finding the proper form of derivatives (differentials, Jacobians, tangent functors, Taylor series, etc.) in Boolean spaces. A few wools of bit along these lines are gathered in this bag:
☞ Differential Logic and Dynamic Systems
Increasing variables could help in the study of random processes-processes that are highly memory dependent and non-Markovian could potentially be simplified by increasing dimensionality.
A similar idea is used in cryptography. Take a function f(x) and define a function F(x;r), where r denotes some extra variables. Then F is a *randomized encoding* of f if (1) you can efficiently “decode” f(x) from F(x,r) for any r, and (2) the distribution F(x,r) induced by random choice of r can be simulated given only f(x) — so this distribution carries no more information than f(x).
Then F inherits many of the cryptographic properties of f (for example, if f is one-way, then so is F; if f is pseudorandom, then so is F). But F can have much lower computational complexity. So often it suffices to just design something that works for a low complexity class, then via randomized encodings you get some higher classes for free. For example, every poly-size circuit has a (computationally secure) randomized encoding with constant depth.
There is a nice series of papers by Benny Applebaum, Yuval Ishai & Eyal Kushilevitz that develops randomized encoding and gives some very interesting applications.
IMHO similar approach we can see in graph theory. To represent a graph we need adjacency list only, but adjacency matrix is more useful for many algorithms and more over distance matrix may be much more useful. I.e. very often redundant representation is very helpful.