A Vast and Tiny Breakthrough
Christofides bound beaten by an epsilon’s idea of epsilon
src1, src2, src3 |
Anna Karlin, Nathan Klein, and Shayan Oveis Gharan have made a big splash with the number
No that is not the amount of the US debt, or the new relief bill. It is the fraction by which the hallowed 44-year-old upper bound of on the approximation ratio of the metric Traveling Salesperson Problem has been improved. With the help of randomization, we hasten to add.
Today we discuss the larger meaning of their tiny breakthrough.
The abstract of their paper is as pithy as can be:
For some we give a approximation algorithm for metric TSP.
Metric TSP means that the cost of the tour is the sum of the distances of the edges
according to a given metric . When the points are in with the Euclidean metric, an -time algorithm can come within a factor of the optimal cost for any prescribed . Sanjeev Arora and Joseph Mitchell jointly won the 2002 Gödel Prize for their randomized algorithms doing exactly that. The rub is the constant in the “” depends on —indeed, nobody knows how to make it scale less than linearly in . But for general metrics, getting within a factor of is known to be -hard for up to .
Some intermediate cases of metrics had allowed getting within a factor of , but for general metrics the factor found in 1976 by the late Nicos Christofides, and concurrently by Anatoliy Serdyukov, stood like a brick wall. Well, we didn’t expect it to be a brick wall at first. Let me tell a story.
A Proof in a Pub
Soon after starting as a graduate student at Oxford in 1981, I went with a bunch of dons and fellow students down to London for a one-day workshop where Christofides was among the speakers and presented his result along with newer work. I’d already heard it spoken of as a combinatorial gem and perfect motivator for a graduate student to appreciate the power of combining simplicity and elegance:
- Calculate the (or a) minimum spanning tree of the given points.
- Take to be the leaves and any other odd-degree nodes of and calculate a minimum matching of them.
- The graph has all nodes of even degree so it has an easily-found Eulerian cycle .
- The cycle may repeat vertices, but by the triangle inequality for the distance metric , we can bypass repeats to create a Hamilton cycle giving .
Now any optimal TSP tour arises as a spanning tree plus an edge, so . And can be partitioned into two sets of paths with endpoints in . One of those sets has weight at most and yet matches all pairs of . Thus . It follows that and we’re done.
My memory of what we did after the workshop is hazy but I’m quite sure we must have gone to a pub for dinner and drinks before taking the train back up to Oxford. My point is, the above proof is the kind that can be told and discussed in a pub. It combines several greatest hits of the field: minimum spanning tree, perfect matching, Euler tour, Hamiltonian cycle, triangle inequality. The proof needs no extensive calculation; maybe a napkin to draw on and the partition helps.
The conversation would surely have gone to the question,
Can the factor be beaten?
A perfect topic for mathematical pub conversation. Let’s continue as if that’s what happened next—I wish I could recall it.
Trees That Snake Around
Note that the proof already “beats” it in the sense of there being a strict inequality, and it really shows
where is the maximum cost of any edge. The advantage shrinks to zero as grows, however. Moreover, examples where Christofides’s algorithm does no better than approach are easy to draw. Pub walls are often covered with emblems of local organizations, and if one has a caduceus symbol it can serve as the drawing:
The staff is a path of nodes while the snakes alternate edges of weight between nodes two apart on the path. Going up one snake and down the other gives an optimal tour of weight (using the two outermost path edges to switch between the snakes), which . The snake edges don’t change the path’s being the minimum spanning tree, and for this costs plus the weight required to match the path’s endpoints. The extra weight is reckoned as the length of one snake, which , so the ratio approaches as and . Here are some tantalizing aspects:
- The snake edges, plus one path edge to connect them, make a maximum-weight spanning tree in the graph formed by the two kinds of edges. Yet followed by the same steps 2–4 of Christofides’s algorithm would yield and optimum tour.
- When one is given only the points plus the graph metric induced by , not itself, then there are much worse spanning trees. The single edge connecting the endpoints of the previous path has weight .
- Thus has relatively low weight compared to these possible other trees. And its weight approaches that of as . This means that small changes in the size of the tree yield large changes in the quality of the induced tour.
- The advantage of is that its odd-valence nodes have small distance under . As a path it snakes around so that its ends are near each other, unlike those of the minimum spanning tree . This raises the question of weighting spanning trees according to a slightly different measure that incorporates a term for “odd-closeness.”
In 1981, we would not have known about Arora’s and Mitchell’s results, so we would have felt fully on the frontier by embedding the points in the plane and sketching spanning trees and cycles on a piece of paper. After a couple pints of ale we might have felt sure that a simple proof with such evident slack ought to yield to a more sophisticated attack.
Helpful Trees
There is one idea that we might have come up with in a pub. The motivation for choosing to be a minimum spanning tree is that many of its edges go into the Euler tour and those bound the final even if shortcuts them. So making the total edge weight of minimum seems to be the best way to help at that stage. We might have wondered, however, whether there is a way to create to have a stronger direct relation to good tours, if not to the optimal tour.
Oveis Gharan did have such an idea jointly with a different group of authors a decade ago, in the best paper of SODA 2010. We cannot seem to get our hands on the optimal tour, nor even a “good” tour if that means a better than factor approximation—that is what we are trying to find to begin with. But there is another “tour” that we can compute. This is an optimum of the linear programming relaxation of TSP, whose relation to the exact-TSP methods of Michael Held and Dick Karp we covered a while back. is not a single tour but rather an ensemble of “fractional tours” where each edge has a rational number representing its contribution to the LP solution. The higher , the more helpful the edge.
The objective then becomes to design distributions of spanning trees so that:
- Sampling is polynomial-time efficient.
- For every edge , where is tiny.
- The distribution promotes trees with fewer leaves and odd-valence interior nodes.
The algorithmic strategy this fits into is to sample from , plug into the first step of the Christofides algorithm, and continue as before.
The Proof and the Pudding
The first two conditions are solidly defined. Considerable technical details in the SODA 2010 paper and another paper at FOCS 2011 that was joint with Amin Saberi and Mohit Singh are devoted to them. A third desideratum is that the distribution not be over-constrained but rather have maximum entropy, so that for efficiently computable numbers approaching one has also:
The third condition, however, follows the maxim,
“the proof of the pudding is in the eating.”
As our source makes clear, this does not refer to American-style dessert pudding, but rather savory British pub fare going back to 1605 at least. The point is that we ultimately know a choice of is good by proving it gives a better approximation factor than .
In America, we tend to say the maxim a different way:
“the proof is in the pudding.”
The new paper uses the “pudding” from the 2011 paper but needed to deepen the proof. Here is where we usually say to refer to the paper for the considerable details. But in this case we find that a number of the beautiful concepts laid out in the paper’s introduction, such as real stability and strong Rayleigh distributions, are more accessibly described in the notes for the first half of a course taught last spring by Oveis Gharan with Klein as TA. One nub is that if a set of complex numbers all have positive imaginary part, then any product of two of the numbers has real part less than the product of the real parts, and if the latter product is positive, then is not a real number. This rules out assignments drawn from the set from being solutions to certain polynomials as well as setting up odd/even parity properties elsewhere.
Rigidity of the TSP Universe
I’ll close instead with some remarks while admitting that my own limited time—I have been dealing with more chess cases—prevents them from being fully informed.
The main remark is to marvel that the panoply of polynomial properties and deep analysis buy such a tiny improvement. It is hard to believe that the true space of TSP approximation methods is so rigid. In this I am reminded of Scott Aaronson’s calculations that a collision of two stellar black holes a mere 3,000 miles away would stretch space near you by only a millimeter. There is considerable belief that the approximation factor ought to be improvable at least as far as .
It strikes me that the maximum-entropy condition, while facilitating the analysis, works against the objective of making the trees more special. It cannot come near the kind of snaky tree obtained by deleting any edge from a good tour , such that plugging into step 1 yields back again. The theory of polynomials and distributions that they develop has a plug-and-play element, so that they can condition the distributions toward the third objective using the parity properties. But their framework has inflexibility represented by needing to postulate a real-valued function on the optimum edges whose expectation is of order the square of a parameter already given the tiny value . Of the requirement that be a small fraction of their governing epsilon parameter, they say in section 3:
This forces us to take very small, which is why we get only a “very slightly” improved approximation algorithm for TSP. Furthermore, since we use OPT edges in our construction, we don’t get a new upper bound on the integrality gap. We leave it as an open problem to find a reduction to the “cactus” case that doesn’t involve using a slack vector for OPT (or a completely different approach).
What may be wanting is a better way of getting the odd-valence tree nodes to be closer, not just fewer in number. To be sure, ideas for “closer” might wind up presupposing a metric topology on the given points, leading to cases that have already been improved by other means.
Open Problems
Will the tiny but fixed wedge below become a lever by which to find better approximations?
There is also the kvetch that the algorithm is randomized, whereas the original by Christofides and Serdyukov is deterministic. Can the new methods be derandomized?
[fixed = to + sign at end of Christofides proof; fixed wording of “nub” at end of pudding section; added to denominator of equation in third section; used correct Held-Karp link to 2017 not 2012]
Is there connection between UGC and this?