Tuesday, July 5, 2011

Better than 3/2 for TSP, take two

In April Tobias Momke and Ola Svensson published a beautiful paper on Arxiv about an(other) better-than-3/2 approximation algorithm for the traveling salesman problem when the input is the shortest path metric in an unweighted undirected graph G. It's a 1.461 approximation. Here's a summary of the algorithm, without the analysis.

The algorithm first solves the Held-Karp linear programming relaxation, which in this presentation is written using one variable x(e) per graph edge e, and whose objective is to minimize the sum of x(e), subject to having at least 2 edges between S and V-S for every nontrivial subset S of vertices. It drops from the graph all edges not used by the LP solution.

From the LP solution, they construct a depth-first-search tree starting from some arbitrary root, where ties when choosing what vertex to visit next are always broken in favor of traversing the edge e with maximum value x(e). They can then view the graph as directed, with every tree edge directed away from the root and every back edge directed towards the root, and solve a circulation problem to find a minimum "cost" collection of cycles covering all tree edges. Here the "cost" is the number of back edges, except that for each vertex v and each child-subtree of v, the first back edge into v out of the subtree is free. (The fact that the tree is not just any DFS tree, but one that has been guided by the LP solution, helps bound the "cost" of the circulation.) Then the algorithm drops from the graph all edges not used by the circulation. Consider the result G'.

The algorithm will then make this graph into an Eulerian multigraph by making all the degrees even. In the Christofides heuristic specialized to unweighted graph metrics, one first takes an arbitrary spanning tree T and then makes it Eulerian by adding a set M of edges forming a min cost collection of paths in G connecting odd degree vertices.

Here, in the first phase the authors construct G' instead of T, so a priori they seem to doing worse than Christofides: the graph G' is not a tree but something with more edges, hence more costly. In the second phase they choose edges M forming those paths based on edges in G' only instead of G, thus depriving themselves of possibilities, so it's also a source of additional cost a priori. But they have a brilliant idea to make up for it: how about if, instead of adding all the edges of M to G', one removed some of them from G'? Indeed, if adding duplicate copies of some of the edges adjacent to a vertex v makes the degree of v even, then removing some of them instead of duplicating them also makes the degree even. That's the key new observation: x-1 is congruent to x+1 modulo 2. One wonders why it took 30 years to realize that!

The problem, of course, is that when one starts deleting some edges instead of adding them, then the graph risks becoming disconnected. To prevent that from happening, the authors define local constraints that suffice to ensure that connectivity is maintained: consider the structure of the graph G'. If, for each vertex v and each child-subtree of v, the algorithm makes sure to either preserve the tree edge from v into the subtree or preserve the first back edge from the subtree into v, then connectivity is maintained. These are called "removable pairs" of edges. Other edges of M will always be added if they are tree edges and will always be deleted if they are back edges.

So, now the authors want to choose M, a collection of paths connecting odd-degree vertices, in such a way that many of these edges can be deleted of the graph instead of being added to the graph. To that end, they just do it the stupid way, so to speak, i.e. without using any further structural property of their graph: it is not hard (via a reduction to matchings in cubic graphs) to pick a random M whose one-dimensional marginal is uniform, such that every edge is present with probability 1/3.

In total, some parts of the algorithm make it worse than the Christofides heuristic, some parts make it better, and there did not seem to be any way to predict a priori how those pros and cons would work out. It's a lucky coincidence that the resulting algorithm beats 3/2!

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.