Friday, November 25, 2011

Teaching deterministic linear time median

When I teach Algorithms, one challenging lecture is the one where I present the deterministic linear-time algorithm for median-finding. The big challenge is: How can we present the algorithmic design so that it is natural? I had an idea: I could simply watch online videos and see how others do it! So, I found and watched videos of lectures at Stanford (by Tim Roughgarden) and at MIT (by Erik Demaine) on that topic. For a teacher, it is fascinating to watch how others go about teaching the same material as you!

I have picked up a couple of ideas from Erik's lecture.

First, the runtime analysis involves some quantities with floors and ceilings. I always sweep the floors and ceilings under the carpet (with a vacuum cleaner). Erik, on the other hand, gave a short, relatively polished analysis, that managed to take those into account rigorously without making a mess of the analysis (well, except that he is assuming without saying it that T(n) is a monotone function of n, but none of the students noticed that gap in the proof). I may try to copy his approach when I get to that point.

Second, like me, in the design phase Erik highlights the use of transitivity of ordering: if a < b and b < c then a < c. I realized while watching him that I could highlight that already when I teach Quicksort: why does Quicksort give a correct output? Because of transitivity: if a < pivot and pivot < b, then a < b. That explains why we are doing the correct thing when in Quicksort we commit to placing a to the left of b. For Quicksort, that part of correctness is obvious and usually goes without saying. For linear-time medians, it comes up much less naturally and is actually a "creative" step. But if I have taken a minute to bring it out while teaching Quicksort, then it will seem more natural when I bring it out while teaching linear-time median.

So, watching those videos was mildly useful. Unfortunately I did not find what I was looking for: the challenge of presenting the algorithm in a natural manner is still open, as far as I know.

1. Another question is: why teach this topic at all?

Is it practically useful? No.
Are the ideas in its analysis useful for other problems in the course? No.
Is it an introduction to modern research? No.

It was a cool result, to be sure, but in the past 40 years, its coolness has faded.

2. It's only useful as an illustration of divide-and-conquer.

3. Why not teach the randomized divide and conquer version of the median instead? It's expected linear time, and the analysis is very simple and intuitive.

4. I do, and Erik does, and Tim does, and I believe pretty much everyone does. Somehow, teaching both the randomized and the deterministic versions of linear-time median side by side really drives home the relative simplicity of randomized algorithms.

I have not taught deterministic median every year, but on the years when I didn't, I think the students missed it a little bit; it's like a little bit of culture, a nice curiosity (and a good divide-and-conquer practice). At least, that's how I usually conclude that class: by saying that they should take it merely as a curiosity.

5. In other words, you might be right; there are probably better uses of my lecture time.

6. I found another video of a lecture on linear-time median, that one, at IIT Bombay:
http://freevideolectures.com/Course/2281/Design-and-Analysis-of-Algorithms/7
Superficially the lecture is not very good for silly reasons such as apparent nervousness of the instructor, and I have minor quibbles about lack of concrete examples, but the intuition is developed extremely well, and in particular there is a very nicely done reduction from median to order statistics, a reduction (that I also teach) and that was completely omitted in the other two videos I watched.

Unfortunately, at the 17 minute mark the description is about to go into the detailed algorithm, but that's preceded by a disclaimer apologizing for presenting a step that is not intuitive: we don't really understand how they came up with this, it is said.

So, my quest for an intuitive presentation of the full algorithm is still unfulfilled so far.

7. Another lecture on the same topic:
http://www.cs.ucdavis.edu/~gusfield/cs222f07/median.mov

Comparing styles is very instructive, pedagogically. (Still no intuition on the core of the algorithm, though.)

8. In Chazelle's discrepancy book (Ch 11) the algorithm is described as building a sort of epsilon net (the median of medians trick), computing the median on the epsilon net, and recursing to find the exact median. This kind of epsilon net trick is actually a very useful technique. But it might go over undergraduate students' heads.

I think this algorithm is worth teaching for its elegance.

9. First, I think it is a lovely algorithm to teach. I do it in my undergrad and grad algorithms classes. Teaching and learning are not always about the latest or the best or what ever. Inspiration and coolness counts too.

Your post made me think a bit more about the intuition for the algorithm. Assume that we can motivate the need for an approximately good pivot. Now comes the tradeoff between how good the pivot is and how much work to do in the recursion. For example we could pick the median of say the first n/2 elements and this will be an approximate pivot but the recursion won't work out in the worst case. One can sort of see that the right tradeoff won't happen if we do not make use of all the array in coming up with the pivot because the pivot quality won't be good enough. Grouping into subarrays and then filtering from them is perhaps possible to think about since we have to look at all elements. The question is what group size one should think about. Here I think one can try the game of using group size k and work out what happens given the constraints: linear work to process the input, recursion reducing the problem size by a constant factor etc and conclude that k = O(1) is more or less what we can get away with. Not sure this adds anything useful but after all it is a blog comment :).

10. Chandra: yes, approximately good pivot can be motivated. I have never thought of trying large groups: that might be a good route.

I was thinking about doing the opposite:trying to get students to think of groups of 3. Some of them already know the "median of 3" heuristic for quicksort. Then we could do the entire design and analysis using groups of 3, realize that it does not work, stare at the recurrence, and then someone would realize that if 3 was replaced by 5, it would work...