## Monday, June 6, 2011

### The easiest primal-dual algorithm

What is a good algorithm for teaching the primal-dual technique for the design of approximation algorithms? One simple example is in Charikar and Panigrahy's 2001 paper on clustering. It contains all the basic ideas of primal-dual algorithms, but none of the technical issues that come up in most applications. It's like a baby version of the facility-location primal-dual algorithm.

The input is a finite metric space and an opening cost f per cluster. The output is a collection of balls (clusters), each ball B(i,r) centered at some point i and of some radius r, that together cover all points. The objective to minimize is f times the number of clusters, plus the sum of the cluster radii.
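In symbols (writing d for the metric and S for the chosen set of balls), the problem is:

```latex
\min_{S} \;\; f \cdot |S| \;+\; \sum_{B(i,r) \in S} r
\qquad \text{subject to} \qquad
\forall j \;\; \exists\, B(i,r) \in S : \; d(i,j) \le r .
```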

First they design a linear programming relaxation, where x(i,r) indicates whether the ball B(i,r) is in the solution, and where there is one constraint for each point j, indicating that there has to be at least one ball B(i,r) selected for some r and some i within distance r of j. Then they write the dual. The dual linear program simply has one variable a(j) for each point j, and one constraint for each ball B(i,r), stating that the a(j)'s for j in that ball must sum to at most f+r.
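Written out as a sketch in the notation above, the relaxation and its dual are:

```latex
% Primal relaxation: one variable x(i,r) >= 0 per candidate ball B(i,r)
\min \;\; \sum_{(i,r)} (f + r)\, x(i,r)
\quad \text{s.t.} \quad
\sum_{(i,r)\,:\, d(i,j) \le r} x(i,r) \;\ge\; 1 \quad \forall j,
\qquad x \ge 0.

% Dual: one variable a(j) >= 0 per point j
\max \;\; \sum_{j} a(j)
\quad \text{s.t.} \quad
\sum_{j\,:\, d(i,j) \le r} a(j) \;\le\; f + r \quad \forall (i,r),
\qquad a \ge 0.
```

Only radii equal to some pairwise distance ever matter, so the primal has finitely many variables and the dual finitely many constraints.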

The algorithm picks an arbitrary feasible dual solution that is a local maximum with respect to changing a single coordinate a(j). In other words, the dual solution (a(j)) is such that for every j, there exists a ball B(i,r) containing j whose dual constraint is tight. Then, to deduce a feasible primal solution, let T be the set of those tight balls B(i,r). The algorithm greedily goes through the balls of T in order of non-increasing radius, constructing a subset S of pairwise disjoint balls of T. Finally, the output primal solution consists of the balls B(i,3r) for each ball B(i,r) in S.
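Here is a minimal Python sketch of the whole pipeline. The function and variable names are my own, not from the paper; the dual-raising step simply lifts each coordinate once as far as feasibility allows, which is enough to reach a coordinate-wise local maximum, and "disjoint" is checked via center distances (balls are disjoint when their centers are farther apart than the sum of their radii).

```python
def cluster(points, dist, f):
    """Primal-dual clustering sketch: returns a list of output balls (center, radius)."""
    # Only pairwise distances can be useful radii.
    radii = sorted({dist(i, j) for i in points for j in points})
    balls = [(i, r) for i in points for r in radii]

    def slack(a, i, r):
        # Slack of the dual constraint for ball B(i, r).
        return (f + r) - sum(a[j] for j in points if dist(i, j) <= r)

    # Raise each a(j) as far as feasibility allows; afterwards every
    # point lies in some tight ball (the local-maximum condition).
    a = {j: 0.0 for j in points}
    for j in points:
        a[j] += min(slack(a, i, r) for (i, r) in balls if dist(i, j) <= r)

    # T = tight dual constraints; greedily keep pairwise disjoint balls,
    # scanning by non-increasing radius.
    T = [(i, r) for (i, r) in balls if slack(a, i, r) < 1e-9]
    S = []
    for (i, r) in sorted(T, key=lambda b: -b[1]):
        if all(dist(i, i2) > r + r2 for (i2, r2) in S):
            S.append((i, r))

    # Primal output: the same centers with tripled radii.
    return [(i, 3 * r) for (i, r) in S]
```

On four points on a line forming two far-apart pairs, with opening cost f = 2, this opens one tripled-radius ball per pair, covering everything.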

For the analysis, they first need to prove that the output is feasible, that is, that the output balls cover every point j. Fix a j: it's in some ball B(i,r) of T. If that ball is not in S, it's because it intersects some ball B(i',r') that is in S, so the output contains the ball B(i',3r'). By the greedy definition of S, r' is at least as large as r, so by the triangle inequality the ball B(i',3r') covers j.
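Spelling out the last step (with d the metric): since B(i,r) and B(i',r') intersect, their centers are within r + r' of each other, so

```latex
d(j, i') \;\le\; d(j, i) + d(i, i')
\;\le\; r + (r + r')
\;\le\; 3r' ,
```

using r ≤ r' for the final inequality.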

Then they need to bound the cost of the output balls. It is at most three times the cost of the balls in S, since f+3r is at most 3(f+r). The balls in S are a disjoint subset of T, so summing the corresponding tight constraints gives that the cost of S is at most the sum of all the a(j)'s, which is at most OPT by weak linear programming duality. That concludes the proof of a 3-approximation.
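The whole chain of inequalities, written out:

```latex
\sum_{(i,r)\in S} (f + 3r)
\;\le\; 3 \sum_{(i,r)\in S} (f + r)
\;=\; 3 \sum_{(i,r)\in S} \;\sum_{j \,:\, d(i,j)\le r} a(j)
\;\le\; 3 \sum_{j} a(j)
\;\le\; 3\,\mathrm{OPT} ,
```

where the equality uses tightness of each ball in S, the next inequality uses disjointness of the balls in S together with a ≥ 0, and the last step is weak duality.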