## Sunday, December 12, 2010

### How not to present algorithms

I decided to look at the award-winning Arora-Boaz-Steurer paper giving a subexponential algorithm for unique games. I first wanted to take a look at the algorithm itself, temporarily ignoring the analysis and focusing on the algorithmic techniques only.

According to the plan in the intro, the algorithm is in section 5. There, first, without loss of generality they assume that the graph is regular -- for that, the reader can simply check Appendix A. I go there and review the construction. Going back to section 5, they then make the graph lazy by adding self-loops - I skim the construction.

We are then sent to section 4 and Theorem 4.5. Going there, we are redirected to Theorem 4.1 "using Theorem 3.2 instead of 2.3". Making a mental note of that and going to Theorem 4.1, I see that the proof starts with a Lemma, 4.2, whose proof is a mere call to Theorem 2.3 for some particular value of a parameter. Thus it is time to go to Theorem 3.2. Going there, I am redirected to Theorem 2.3. Going there, I am redirected to Theorem 3.1. Going to its proof, I see that the lemmas are just doing an analysis, without algorithmic contents, until I get to a comment "... moreover, following the proof we see that if the condition is violated we can get a set S ...". Aaargh!

Giving up on that for now and going back to the thread of the proof of Theorem 4.1, I skim the next couple of lemmas, that seem to do some combination of greedy partitioning and of iteratively re-partitioning parts into smaller pieces a few times. Going back to section 5, the next step, in my impression, assigns labels independently for each part of the partition. More precisely, for each label-extended part of the partition, it redirects us to the algorithm of Theorem 2.2. Miraculously, the proof of Theorem 2.2 is basically self-contained: take the subspace spanned by eigenvectors of the large eigenvalues, put an epsilon-net in its unit ball, and, for each vector v in that epsilon-net, use its "large enough" coordinates to define a subset S(v) of vertices of the graph. Going back to section 5, apparently for each set S(v) there will be a mysterious assignment, and the algorithm exhaustively enumerates the v's and picks the assignment satisfying the most constraints.

What have I learned? Basically nothing so far. Frustrating! If the result was not so good, I would give up on that paper. Instead, I may consider setting aside a few days solely for deciphering it...

1. As an undergraduate student, I once had a very difficult and long, complex paper to present. I resorted to drawing a digraph of the theorems and propositions in the paper (acyclic!) and reading them in a topological order.

2. Claire, as the intro says, the unique games algorithm relies upon the earlier algorithm for small set expansion. You should not have skipped ahead to Section 5 (unique games). Why is it unusual that Section 5 depends upon Section 4?

Sorry about not giving a succinct description of the algorithm though. To be rectified in future versions.

Sanjeev Arora

3. I don't find this unreasonable. The process behind a paper is having an idea, then slowly prove everything you need for that idea to work, then use it to prove the problem solved by the paper. So, one has to build upon what he has already proved.

You can always write down on a piece of paper (if you printed the paper) or in a text document what every lemma and document proves, a kind of an index.

4. Daveagp: I sometimes do that when I write the final version of a paper, and when I do it, it vastly improves the quality of the final product. The digraph often suggests a reordering of the lemmas, the elimination of some (when a node in the digraph has outdegree 1 or indegree 1, one should seriously consider eliminating it unless there's a good reason to keep it...), and it leads to re-organizing the paper.

Sanjeev: of course there's nothing unusual about Section 5 depending upon section 4, but it is the total number of redirections threw me off. No matter: I'll get around to reading it anyway!

Chazisop: the process you describe is perfectly fine for designing the first draft for your own consumption, but once you have dotted all the i's and crossed all the t's and are confident that your proof is complete, it is time to rewrite your paper from scratch with the reader in mind.

5. As Sanjeev mentions, I think you'll find it easier to read our paper in linear order.

Section 2 in our paper contains a description of the algorithm for the small set expansion problem, while Section 3 contains its analysis.

Section 2 in particular is only 2 pages, since it is a very simple algorithm. (The UG algorithm is also fairly simple, but I think its best understood as an extension of the small-set expansion algorithm.)

Some people may find these slides on our paper useful, see http://www.cs.princeton.edu/~boaz/Papers/subexpug.ppsx