Tuesday, June 21, 2011

Better than 3/2 for TSP

I just saw the video of a talk by Mohit Singh presenting the TSP result discussed on the Fortnow-Gasarch blog a few months ago.

The input is the metric given by shortest paths in an undirected unweighted graph. The output is a traveling salesman tour with length less than
OPT*1.4999999999999999999999999999999999999999999999999996.
Ever since my student days, I have heard of this - beating the factor of 1.5 for metric TSP - as a major open problem. I even spent some time working on this as a young graduate student, under Andy Yao's guidance (and with no luck whatsoever), so this is exciting, even if it's restricted to graph metrics.

The plan follows the Christofides algorithm, but picks a particular spanning tree such that the minimum-weight perfect matching on its odd-degree vertices costs strictly less than half of OPT. The analysis studies the following two LP relaxations:
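For reference, here is a minimal stdlib-only sketch of the classical Christofides algorithm that the new result refines: build a minimum spanning tree, add a minimum-weight perfect matching on the odd-degree vertices, take an Euler tour, and shortcut. The brute-force matching and the small example metric are my own illustration, fine only for tiny instances.

```python
# A compact sketch of Christofides' 1.5-approximation on an explicit metric
# (distance matrix d). Brute-force matching: only suitable for small n.

def mst_edges(d, n):
    # Prim's algorithm on the complete metric graph.
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        u, v = min(((u, v) for u in in_tree for v in range(n) if v not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        edges.append((u, v))
        in_tree.add(v)
    return edges

def min_matching(d, odd):
    # Brute-force minimum-weight perfect matching on the odd-degree vertices
    # (their number is always even).
    if not odd:
        return []
    u = odd[0]
    best, best_cost = None, float("inf")
    for i in range(1, len(odd)):
        rest = min_matching(d, odd[1:i] + odd[i + 1:])
        cost = d[u][odd[i]] + sum(d[a][b] for a, b in rest)
        if cost < best_cost:
            best, best_cost = [(u, odd[i])] + rest, cost
    return best

def christofides(d):
    n = len(d)
    tree = mst_edges(d, n)
    deg = [0] * n
    for u, v in tree:
        deg[u] += 1
        deg[v] += 1
    odd = [v for v in range(n) if deg[v] % 2 == 1]
    matching = min_matching(d, odd)
    # Tree + matching has all degrees even; find an Euler tour (Hierholzer).
    adj = {v: [] for v in range(n)}
    for u, v in tree + matching:
        adj[u].append(v)
        adj[v].append(u)
    stack, tour = [0], []
    while stack:
        v = stack[-1]
        if adj[v]:
            w = adj[v].pop()
            adj[w].remove(v)
            stack.append(w)
        else:
            tour.append(stack.pop())
    # Shortcut repeated vertices (valid by the triangle inequality).
    seen, cycle = set(), []
    for v in tour:
        if v not in seen:
            seen.add(v)
            cycle.append(v)
    return cycle
```

On the shortest-path metric of the 5-cycle, where the optimal tour has length 5, the returned cycle is guaranteed to cost at most 7.5.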

First, the Held-Karp LP relaxation of TSP, LP1: minimize the sum of d(u,v)x(u,v), where for every vertex u the sum over v of x(u,v) equals 2, and for every nonempty proper cut, the sum over cut edges {u,v} of x(u,v) is at least 2. Here x(u,v) is non-negative.
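For small n the constraints of LP1 can be checked by brute force over all cuts. Below is a sketch of such a feasibility check; the K4 examples are mine, not from the paper (x = 2/3 on every edge of K4 is a classic fractional feasible point).

```python
def heldkarp_feasible(n, x, eps=1e-9):
    # x: dict mapping frozenset({u, v}) -> fractional value (missing = 0).
    # Degree constraints: fractional degree exactly 2 at every vertex.
    for u in range(n):
        deg = sum(x.get(frozenset({u, v}), 0.0) for v in range(n) if v != u)
        if abs(deg - 2.0) > eps:
            return False
    # Cut constraints: every nonempty proper cut carries at least 2 units.
    for mask in range(1, 2 ** n - 1):
        S = {v for v in range(n) if mask >> v & 1}
        crossing = sum(x.get(frozenset({u, v}), 0.0)
                       for u in S for v in range(n) if v not in S)
        if crossing < 2.0 - eps:
            return False
    return True
```

For instance, x(u,v) = 2/3 on all six edges of K4 is feasible (every degree is 2, every cut carries at least 2), while x(u,v) = 1/2 everywhere violates the degree constraints.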

Second, given a spanning tree T, the following LP relaxation, LP2(T), of minimum-weight perfect matching on the vertices that have odd degree in T: minimize the sum of d(u,v)y(u,v), where for every cut crossed by an odd number of edges of T, the sum S over cut edges {u,v} of y(u,v) is at least 1. Here y(u,v) is non-negative.

It is known that the minimum spanning tree has value at most OPT(LP1), which is at most the length of the optimal tour. It is also known that the minimum-weight perfect matching of the odd-degree vertices has value exactly equal to OPT(LP2(T)). Observe that if x is feasible for LP1, then for every T, y=x/2 is feasible for LP2(T): every cut has x-value at least 2, so halving still leaves at least 1 across every cut, odd or not. Hence OPT(LP2(T)) is at most (1/2)OPT(LP1). Together these facts prove the 1.5 bound of Christofides.
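The observation can be checked mechanically on a toy instance. The sketch below (the K4 point and the star tree are my own example) verifies every T-odd cut constraint by brute force; it also shows why dividing by 2.001 instead of 2 is not automatically feasible, which is exactly where the cut-structure arguments come in.

```python
def lp2_feasible(n, tree_edges, y, eps=1e-9):
    # Check LP2(T): every cut crossed by an odd number of tree edges
    # must carry at least 1 unit of y.
    for mask in range(1, 2 ** n - 1):
        S = {v for v in range(n) if mask >> v & 1}
        tree_crossing = sum(1 for (u, v) in tree_edges if (u in S) != (v in S))
        if tree_crossing % 2 == 1:
            total = sum(y.get(frozenset({u, v}), 0.0)
                        for u in S for v in range(n) if v not in S)
            if total < 1.0 - eps:
                return False
    return True

# x = 2/3 on every edge of K4 is Held-Karp feasible; T is the star at vertex 0.
x = {frozenset({u, v}): 2 / 3 for u in range(4) for v in range(u + 1, 4)}
tree = [(0, 1), (0, 2), (0, 3)]
y_half = {k: v / 2 for k, v in x.items()}       # feasible: cuts halve from 2 to 1
y_more = {k: v / 2.001 for k, v in x.items()}   # infeasible: tight cuts drop below 1
```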

Now, here is the first idea: imagine that instead of defining y(u,v) as x(u,v)/2, we defined it as x(u,v)/2.001. Would this still be feasible for LP2(T)? Yes, provided that for every cut containing the edge {u,v}, either the sum S was at least 2.001, or the cut was crossed by an even number of edges of T. If we could find such a spanning tree T, then OPT(LP2(T)) would be at most (1/2.001)OPT(LP1) and we'd have a better-than-1.5 approximation.

Studying OPT(LP1), the authors can show that (except for the special case when the solution of LP1 is near-integral, dealt with separately) for a significant fraction of the edges, called "good" edges, the number of cuts containing that edge whose sum S is less than 2.001 is bounded by a constant c. So all we need is a tree T such that, for many of these good edges, each of the c cuts of interest is crossed by an even number of edges of T.

Unfortunately it is difficult to construct a tree T having such a nice property, so instead (as in the SODA 2010 paper by Asadpour, Goemans, Mądry, Oveis Gharan, and Saberi) the authors pick a random tree from the following distribution: there exists a unique distribution over trees such that for each edge uv, the probability that uv is in the tree is x(u,v), and such that for every tree T, the probability of T is proportional to the product of x(u,v) over the edges uv of T. In fact, that is the weighted uniform distribution, defined as follows: scale the x's up so that they are all integers, replace the graph by a multigraph in which each edge {u,v} is replaced by x(u,v) parallel edges between u and v, then generate a uniform spanning tree of the resulting multigraph. Consider the random tree T thus obtained.
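One classical way to sample from the weighted uniform distribution (a sketch, not the method of the paper) is the Aldous-Broder walk: run a random walk that steps along each edge with probability proportional to its weight, and keep the first-entrance edges; the resulting tree has probability proportional to the product of its edge weights. The complete-graph representation below is my own choice for brevity.

```python
import random

def sample_weighted_tree(n, weights, rng):
    # weights: dict frozenset({u, v}) -> non-negative weight (missing = 0).
    # Aldous-Broder: the edges by which the walk first enters each vertex
    # form a spanning tree distributed proportionally to the product of
    # its edge weights (the weighted uniform distribution).
    nbrs = {u: [(v, weights.get(frozenset({u, v}), 0.0))
                for v in range(n) if v != u] for u in range(n)}
    current = rng.randrange(n)
    seen = {current}
    tree = []
    while len(seen) < n:
        vs, ws = zip(*nbrs[current])
        nxt = rng.choices(vs, weights=ws)[0]
        if nxt not in seen:
            seen.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree
```

On a triangle with equal weights, each of the three edges should appear in the sampled tree with probability 2/3, which an empirical check confirms.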

It only remains to study, for each good edge, the probability that all c cuts of interest are crossed by an even number of edges of T. For each edge {u,v}, the probability that the edge is in T is x(u,v). For each set of k edges, the probability that all k of them are in T is known to be given by a certain determinant (the transfer-current theorem of Burton and Pemantle). A lot is known about weighted uniform spanning trees. The very vague intuition: the weighted spanning tree distribution is very roughly similar to picking edges independently at random (except for the tree constraint...), so if the average number of edges crossing a cut is between 2 and 2.001, there is a significant probability that the number of edges crossing the cut will be, not 1, not 3, but exactly 2. Since 2 is even, we're in good shape.
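The single-edge case of these determinant formulas is already concrete: by the weighted matrix-tree theorem, Pr[e in T] equals w(e) times the spanning-tree weight of the graph with e contracted, divided by the spanning-tree weight of the graph itself. The following sketch (classical machinery, not the paper's specific computations) evaluates this exactly with rational arithmetic.

```python
from fractions import Fraction

def det(M):
    # Exact determinant by fraction-preserving Gaussian elimination.
    M = [row[:] for row in M]
    n = len(M)
    result = Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if M[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            M[i], M[pivot] = M[pivot], M[i]
            result = -result
        result *= M[i][i]
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= factor * M[i][c]
    return result

def tree_weight(n, w):
    # Weighted matrix-tree theorem: total spanning-tree weight is any
    # cofactor of the weighted Laplacian. w: dict (u, v) -> weight, u < v.
    L = [[Fraction(0)] * n for _ in range(n)]
    for (u, v), wt in w.items():
        L[u][u] += wt
        L[v][v] += wt
        L[u][v] -= wt
        L[v][u] -= wt
    return det([row[1:] for row in L[1:]])

def edge_probability(n, w, e):
    # Pr[e in T] = w(e) * tau(G/e) / tau(G), contracting e = (a, b), a < b:
    # merge b into a, keeping parallel edges as accumulated weight.
    a, b = e
    relabel = lambda v: a if v == b else (v - 1 if v > b else v)
    cw = {}
    for (u, v), wt in w.items():
        if {u, v} == {a, b}:
            continue
        u2, v2 = relabel(u), relabel(v)
        key = (min(u2, v2), max(u2, v2))
        cw[key] = cw.get(key, Fraction(0)) + wt
    return w[e] * tree_weight(n - 1, cw) / tree_weight(n, w)
```

On unweighted K4 (16 spanning trees), each edge lies in a uniform spanning tree with probability exactly 1/2; on a triangle, 2/3.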

It is beautiful how this puts together techniques from the study of linear programs, and recent advances on random trees and discrete probability.