Sunday, December 25, 2011
In the US, sometimes at Mass I see people quietly playing Angry Birds or Scrabble on their iPhones during the homily. Until now, I thought France had been spared. But at midnight Mass here in France, I saw a man texting on his phone immediately after receiving communion, while others were still standing in line for communion.
Are there no limits to that need to be constantly connected?
When are people not connected?
As far as I know, the shower might be the last refuge against the invasion of technology. I have yet to see a gadget for sending email in the shower.
Friday, December 23, 2011
Learning Algorithms online
No, this is not about online algorithms, nor about learning theory.
I have signed up for Tim Roughgarden's online course on Algorithms, http://www.algo-class.org
One might wonder why I need to learn about divide-and-conquer, searching, quicksort and shortest paths. In fact, I'll be teaching Algorithms myself at the same time. What's wrong with me? Do I plan to steal Tim's lectures and repeat them verbatim to my students, thus sparing myself the need to understand the material? Have I been secretly incompetent during all those years when I pretended to teach the material, and is this a remedial course to finally patch my ignorance in my alleged area of expertise? Am I so stingy that I can't refuse something that is offered for free? Or have I so bought into the school system of grades and diplomas that I can't pass up the chance for an easy way to get one more certificate? In a few months I will be able to add it to my vita, and perhaps it'll help me get a raise. Or do I enjoy criticizing other people so much that I'm going to be watching those videos of Tim and snickering, and then posting nasty comments on my blog about every slip that might occur? Am I secretly in love with Tim, or do I secretly bear a grudge against him?
None of those reasons. I am curious to see someone else teach the same material that I teach. I look forward to fast-forwarding through the parts I have down pat and paying careful attention to the material that I have trouble conveying to my students. Perhaps I can use some of it to improve my own teaching.
Additionally, I am curious to see how the Stanford online courses work in practice, with the thought, in the back of my mind, that perhaps I will also engage in online teaching in the future, if the technology is user-friendly enough. The idea of higher education freely available to all in that manner is fascinating, and experiencing how it plays out on a familiar subject might help me understand the strong points and the weaknesses of the approach.
My other theory on grades
Assignments are either for helping the students learn, or for evaluating them. Grades are either to give them an incentive to do the work involved in learning, or to evaluate them. Two very different goals!
For evaluation purposes, grades give the maximum information if the distribution has maximum entropy, so ideally there ought to be about 1/3 A's, 1/3 B's, and 1/3 C's.
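As a sanity check on that entropy claim, here is a minimal Python sketch (the distributions are made-up examples of mine):

```python
from math import log2

def entropy(dist):
    """Shannon entropy, in bits, of a probability distribution over grades."""
    return -sum(p * log2(p) for p in dist if p > 0)

print(entropy([1/3, 1/3, 1/3]))      # uniform A/B/C: ~1.58 bits, the maximum
print(entropy([0.40, 0.55, 0.05]))   # a lopsided curve: ~1.22 bits
print(entropy([1.00, 0.00, 0.00]))   # everyone gets an A: ~0 bits, the grade tells us nothing
```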
For evaluation purposes, assignments are most useful if they help distinguish students from one another, so the ideal exercise is at just the right level of difficulty to be solvable by exactly half of the students. An exercise that is successfully done by all serves no purpose for evaluation. An exercise that almost no one can do may still serve a purpose: that of distinguishing the few students who stand out by their ability to solve a problem that the vast majority could not do. Exercises with a variety of difficulty levels help evaluate more finely. Since the primary goal of midterms and finals is to evaluate, that's how they should be designed, and the median grade for those would probably be around 50. In fact, that's more or less how things are done in France. The learning part happens in exercises that are not graded, and the sole purpose of grading is to evaluate.
In the US, my experience is that assignments are usually designed so that, with enough time, most students can do most of the assignment: their primary purpose is to help the student learn. A very different perspective!
Thursday, December 22, 2011
My theory on grades
I wish students typically got a B in my course.
C's would be reserved for the students who stand out because they're at the lower tail of the distribution (something of the order of 5 percent of the students), and
A's would be reserved for the students who stand out because they're at the upper tail of the distribution (also something of the order of 5 percent).
This way students would expect to get a B, the bulk of the students would get the grade which they expected to get, and the occasional A would be a pleasant surprise.
As it happens, this semester my class was so good and hard-working that I gave a very large number of A's (40 percent!!), which meant that the cutoff point fell in the dense part of the distribution of grades, which meant that it was rather arbitrary: between the 20 "A students" with lowest average grade, and the 20 "B students" with highest average grade, I bet that the difference in performance is statistically completely insignificant. That's a lot of students who will curse their bad luck or question my judgment!
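Out of curiosity, a small simulation sketch (with made-up numbers of mine: 100 students, Gaussian abilities, Gaussian grading noise) illustrating how thin the margin is at a cutoff placed in the dense part of the distribution:

```python
import random

random.seed(0)
abilities = [random.gauss(80, 8) for _ in range(100)]   # hypothetical true abilities
observed = [a + random.gauss(0, 4) for a in abilities]  # course averages, with noise

ranked = sorted(observed, reverse=True)
lowest_a, highest_b = ranked[39], ranked[40]  # top 40 percent get A's
print(f"lowest A: {lowest_a:.1f}, highest B: {highest_b:.1f}, "
      f"gap: {lowest_a - highest_b:.2f} points")
# The gap between the last A and the first B is typically a fraction of
# a point - far smaller than the grading noise - so the cutoff separates
# students whose performance is statistically indistinguishable.
```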
But even for next semester, I don't dare try my theory. B versus A grades don't really matter very much anyway for Brown CS students. The expectation is that they work hard, learn well, and get a good education. Whether that means giving A's to the majority (because we're so pleased with them), or giving B's to the majority (because they're doing as expected, and it's difficult for a student to stand out in an environment in which most are putting so much effort into their coursework), really does not matter much.
Wednesday, December 21, 2011
First impressions of Paris after a transatlantic red-eye.
As the airplane approaches, we see mostly darkness, punctuated by village and highway lights. In the distance, the City of Lights: beneath a light veil-like cloud, a large cluster of dense, very bright lights, whose reflection in the dark gives the cloud a yellowish, almost golden appearance. I am musing on how funny it is that the cloud picked precisely Paris to hang over, not the surrounding countryside; I note that the cloud is much lighter than the occasional black clouds between us and the land down below. Then it hits me: it's smog!
Paris is choking in air pollution.
In the airport hallway, large signs on the walls advertised Paris. The first sign: the Eiffel tower, the Champs-Elysees,... Parisian monuments. The second sign: opera, jazz,... (meanwhile the background music played Edith Piaf, then accordion music). The third sign: strolls along the river, romantic scenery, used book shops by Notre Dame,... That was it. This is Paris! Nothing about La Defense, about businesses, or about the people living and working there. It looked like a pure tourist destination and nothing else.
Paris markets itself as merely a more authentic version of Disneyland.
At 8am in Paris, the subways were full of people standing close to one another, some fraction of them sniffing, sneezing or coughing. Colds and minor epidemics plague Parisians all winter long, in ways unknown to Rhode Islanders.
Paris is the Kingdom of Kleenex.
I had the usual hassles with unhelpful staff and buggy official information, no big deal but annoying and a waste of time.
In Paris, when something works right and easily at the first attempt, you stop in wonder and give thanks to God for that small miracle.
Tuesday, December 20, 2011
The experience of reading a paper
When trying to read a proof in depth (say, to prepare a blog post on it), the reader may go through various feelings in sequence.
1. Stupidity: how come I don't understand it at all? Am I becoming stupid?
2. Anger and frustration: how come the authors left some definitions ambiguous, and why didn't the referee send the paper back for another round of polishing?
3. An idea: let me just look for slides of a talk on the subject! Slides usually put much more emphasis on intuition and might help me understand what the authors meant to say.
4. Obstinacy: I can do it. There are no slides available, but I'm interested in the result and just need to put my mind to it. Go through examples systematically, try out the various interpretations until one fits and the proof goes through.
5. Suspicion: is that result really correct? None of my interpretations seems to work.
6. Discouragement: I can't do it. I'll have to look up references of prior work, see which ideas are present in there, and see whether they are explained any better. This is turning into a major time sink.
As an intellectual activity, reading technical text is really not much different from doing research.
Friday, December 16, 2011
On applying for jobs
At this time of year, I often get contacted by potential job applicants. It is quite difficult to gauge how interested I might be in their application just by looking at their vita or trying to read one of their papers (and who has time to do more than that?). Arranging to meet with them is usually a big hassle for one or the other party, since they are seldom local.
What would help me would be to see them give a talk. For me, a talk is much more informative than a paper, and takes much less effort. But how can that happen without the bother of traveling?
Simple: job applicants could simply videotape themselves giving their job talk, and post it on their web page.
Then I could simply download and watch their talk, and in 50 minutes I would have a good idea of their research, of how interested I am in it, and of how good they are at communicating. That would be great for me!
Why not?
Wednesday, December 14, 2011
The in-between time
I am in a transition week. The lectures of the semester are over, but final grades have not yet been calculated and turned in. I have a few days of delicious freedom to catch up with the semester's unfinished business before heading home for Christmas. It's a time to wrap things up, archive the semester's work, and assign priorities to tasks for the break.
A similar transition week happens just before the first day of classes of each semester. People plan their lectures and their travel, and organize the semester.
Transition periods are wonderful. I think that they are key to being in control of one's professional life.
When I was a professor in France, there was no clean break between semesters. The Fall semester ran well into January, and the last meetings and unresolved questions related to the Fall semester overlapped with the beginning of the Spring semester. As for the summer, one of the worst things about it was that it ended, not merely with the preparation of the Fall, but also with make-up exams for students who had failed in the Spring and who had spent the summer studying. Because of those make-up exams, there was a continuum of teaching-related tasks, with never a chance for a reboot.
Transition periods happen at all time scales. For example, Google Calendar offers the option of planning hour-long meetings so that they're actually 50 minutes long. That leaves time for a bathroom or coffee break between meetings. Brilliant idea!
Tuesday, December 13, 2011
Organizing recommendation letters
A novelty this year: several of my students have created a shared Google spreadsheet listing the places where they have applied and how to send a recommendation letter, with a column for each letter writer to mark when they actually send the letter.
I find it convenient. I no longer have to worry about whether I accidentally missed one of the requests. They no longer have to worry about reminding us uselessly.
In addition this gives us new data to mine: when do the various schools send their requests for recommendation letters? How prompt are my various colleagues in responding to those requests? It's all there for us to see. Of course it's not clear what to do with that interesting data. Maybe it's useless knowledge, but I like it anyway. Plus, in our race to the bottom, it's making me realize that I am not the most disorganized: I still have room to spare!
Monday, December 12, 2011
On naming files
Last week I renamed the files containing the slides of my lectures for this past semester. I had started the semester by naming them "lecture1", "lecture2", "lecture3", etc. At some point I lost track and started using dates instead: "october25", "october28", etc. Now I wanted to give them names that would be helpful, should I look them up again in the future.
Should I call them by their number - lecture1 through lecture35? Not so good, because I may want on some future day to look up the lecture about, say, quick sort, and with that labeling I have no easy way to know where those slides are.
Should I call them by their topic - "Ocamlpatternmatching", "quick sort", etc.? Not so good, because there is a logical order that is lost when I do that.
Should I call them by a combination - "15_higherorderprocedures", "25_runtimeanalysis", etc.? Pretty good, except for one minor flaw: when I list the files in the directory, they do not appear in the correct order: 12blabla precedes 2blabla.
So I ended up calling them by a combination, but starting the numbering at 100: "115_higherorderprocedures", "125_runtimeanalysis", etc.
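A quick illustration of the problem and of the fix, in Python (zero-padding, e.g. "015_...", would work equally well; the file names below are invented examples):

```python
files = ["2_quicksort", "12_shortestpaths", "15_higherorderprocedures"]
print(sorted(files))
# ['12_shortestpaths', '15_higherorderprocedures', '2_quicksort'] -- 12 before 2!

def shift(name, offset=100):
    """Rename 'n_topic' to '(n+offset)_topic' so every number has three digits."""
    num, topic = name.split("_", 1)
    return f"{int(num) + offset}_{topic}"

print(sorted(shift(f) for f in files))
# ['102_quicksort', '112_shortestpaths', '115_higherorderprocedures']
```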
The result looks pretty good. It only took me 25 years of professional experience to figure that out. How come no one has ever taught me how to organize my files? Did I miss a class on "remedial organization" at some point?
Sunday, December 11, 2011
Outdegree, contractions, planar graphs
You can always direct the edges of a planar graph so that every vertex has outdegree at most 5. That fact is clear to anyone who has taken a discrete math course at some point in their life (by Euler's formula, a planar graph has a vertex of degree at most 5; remove it, orient its edges away from it, and repeat).
My question is:
Q: What happens if the edges of the graph get contracted one by one?
A: Then the outdegree risks growing. For example, if one contracts two adjacent vertices that each have outdegree 5, the resulting vertex will then have outdegree 5+5-1=9. The next contraction may bring its outdegree up to 9+5-1=13, then 17, etc.
My question then is:
Q: Is it possible to efficiently modify the orientation of some edges along the way and prevent the outdegree from blowing up?
A: Yup. Here's the algorithm, appealingly simple and natural. After each contraction, do the following: while there exists a vertex v of outdegree greater than 10, reverse the orientation of every outgoing edge of v.
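For the concretely minded, here is a minimal Python sketch of that maintenance loop (the data structures and helper names are mine, not the implementation from the paper; it assumes the graph stays planar, which is what guarantees that the reversal loop terminates quickly):

```python
from collections import defaultdict

CAP = 10  # tolerated outdegree; reverse any vertex that exceeds it

def orient(edges):
    """Orient each undirected edge arbitrarily; track out- and in-neighbours."""
    out, inn = defaultdict(set), defaultdict(set)
    for u, v in edges:
        out[u].add(v)
        inn[v].add(u)
    return out, inn

def flush(out, inn, dirty):
    """While some vertex has outdegree > CAP, reverse all its outgoing edges."""
    while dirty:
        v = dirty.pop()
        if len(out[v]) > CAP:
            for w in list(out[v]):        # reverse v -> w into w -> v
                out[v].remove(w); inn[w].remove(v)
                out[w].add(v);    inn[v].add(w)
                if len(out[w]) > CAP:
                    dirty.add(w)

def contract(out, inn, u, v):
    """Merge v into u (assumes edge uv exists), then restore the outdegree cap."""
    out[u].discard(v); inn[u].discard(v)
    out[v].discard(u); inn[v].discard(u)
    for w in list(out[v]):                # v's remaining out-edges become u's
        inn[w].remove(v)
        if w not in out[u] and w not in inn[u]:   # drop parallel edges
            out[u].add(w); inn[w].add(u)
    for w in list(inn[v]):                # v's remaining in-edges become u's
        out[w].remove(v)
        if w not in out[u] and w not in inn[u]:
            out[w].add(u); inn[u].add(w)
    out.pop(v, None); inn.pop(v, None)
    flush(out, inn, {u})                  # only u's outdegree can have grown

# Usage sketch:
# out, inn = orient(planar_edges)
# flush(out, inn, set(out))               # normalize the initial orientation
# contract(out, inn, u, v)                # contract edge uv; cap is restored
```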
At this point your question surely is or should be:
Q (Toto): Cute, maybe, if that's your taste, but why do you care?
A: Because it is used in an upcoming SODA paper by David Eisenstat, Philip Klein and myself, to reduce the runtime of a primal-dual region-growing phase.
Now that the motivation is out of the way, we can think about it together.
Q (Toto): Ok, I'm willing to temporarily accept that this might be interesting. Then, why does this algorithm work and how efficient is it?
A: By design, it obviously maintains an orientation in which each vertex has outdegree no more than 10. The neat observation is that it is an efficient algorithm: the total number of reversals, and hence the runtime (assuming you don't mess up the low-level implementation details), is O(n log n).
Q (Toto): Why is the total number of reversals O(n log n)? I don't see it.
A: The proof is magic.
Q (Toto): I don't like magic.
A: The magic would disappear if you did a few examples of amortized analysis, but for a single analysis of a single problem, you have to accept the magic. Let G(i) be the graph after i contractions. We consider two orientations of the edges of G(i): Alg(G(i)), the orientation obtained by executing the algorithm, and a certain orientation Ideal(G(i)) such that the maximum outdegree in Ideal(G(i)) is at most 5. To start the proof, magically define the potential of G(i) to be
Potential(G(i)) = distance(Alg(G(i)), Ideal(G(i))), that is, the number of edges that are oriented differently in the two orientations.
Now, here's my question.
Q: What happens to the potential when the algorithm reverses all the outgoing edges of a vertex v?
A: That's not hard to see. Let S denote that group of edges. Immediately before the reversal of S, in Alg the outdegree of v is |S|>=11, and in Ideal the outdegree is <=5, so S contributes at least |S|-5 to the distance; immediately after the reversal of S, S contributes at most 5 to the distance. So the reversal of S has caused the distance to decrease by at least (|S|-5-5)>=1. Overall the total number of edge reversals is within a constant factor of (sum of decreases in the potential function): instead of analyzing the number of edge reversals, we can just analyze the number of decreases of the potential function.
Q (Toto): Big deal. It's not clear to me that you have made any progress: how do you analyze the number of decreases of the potential function?
A: Let's see. Since the potential function is always non-negative, each decrease must be compensated by some previous increase (or by the initial value). So, the total number of decreases is at most the initial value of the potential (which is O(n)), plus the increases that may happen when going from Ideal(G(i)) to Ideal(G(i+1)).
Q (Toto): But you haven't even told me what that "certain orientation Ideal(G(i))" was! I am frustrated.
A: Patience! It is now time to define Ideal(G(i)). Let's do it in backwards order.
Q (Toto): Why? I am frustrated.
A: Because that's what works. Be quiet! Start from the final graph, give it an orientation such that the maximum outdegree is 5, and "uncontract" the edges one by one. To define Ideal(G(i)) from Ideal(G(i+1)), do the following. After the uncontraction of edge uv, in the orientation of G(i) that is inherited from G(i+1), vertex u and vertex v each have outdegree at most 5, and edge uv needs to be given an orientation: arbitrarily direct it from u to v; if u now has outdegree 6, find a shortest path p (in the directed graph) from u to a vertex with outdegree less than or equal to 4, and reverse the orientation of every edge on p. Now the outdegree of u is reduced to 5. That defines Ideal(G(i)).
Q (Toto): I'm tired. Are we there yet?
A: To finish the proof, it only remains to observe that p has length O(log n).
Q (Toto): Why does p have length O(log n)? Oh, whatever.
A: Easy. Do breadth-first search from u, following outgoing edges. How can it be that all vertices in the first i levels have outdegree >= 5? From Euler's formula for planar graphs, it is easy to see (it's an exercise) that this can only happen if the number of vertices grows exponentially from level to level, and such a growth rate can only be maintained for O(log n) levels.
Q (Toto): If you say so. Whatever.
A: Hang on! We're done, just about: p has length O(log n), so Ideal(G(i)) differs from Ideal(G(i+1)) in O(log n) edges, so the potential can only increase by O(log n) at each contraction, so it can only increase by O(n log n) overall, so it can only decrease by O(initial value + n log n), so the total number of edge reversals is O(n + n log n) = O(n log n) overall, so the algorithm is efficient. I'm done.
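Before the last question, here is the whole accounting in one place (my notation, merely restating the argument above):

```latex
\[
  \Phi_i \;=\; \mathrm{dist}\bigl(\mathrm{Alg}(G_i),\,\mathrm{Ideal}(G_i)\bigr)
  \qquad \text{(number of edges oriented differently).}
\]
\[
  \text{Reversing the } |S| \ge 11 \text{ out-edges of a vertex makes }
  \Phi \text{ drop by at least } |S| - 10 \;\ge\; 1 .
\]
\[
  \#\{\text{edge reversals}\}
  \;\le\; 11 \sum (\text{drops in } \Phi)
  \;\le\; 11 \Bigl( \Phi_0 + \sum (\text{rises in } \Phi) \Bigr)
  \;=\; O(n) + n \cdot O(\log n)
  \;=\; O(n \log n).
\]
```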
My last question:
Q: Isn't it elegant, simple, and clever?
A (Toto): I guess so. You certainly seem excited about it.
Your final question is:
Q (Toto): How do you know all that stuff?
A: The basic technique is from Brodal and Fagerberg, "Dynamic Representations of Sparse Graphs." The application to contractions is from Kowalik, "Fast 3-coloring triangle-free planar graphs". It's also explained in an upcoming textbook by Philip Klein.
Saturday, December 10, 2011
Ruth Simmons on interviewing and hiring
Ruth Simmons, our beloved president here at Brown University, has some advice about interviewing.
I look for people who are supremely self-confident, very secure, but also profoundly interested in other people. [...] How curious are they about other people, and about new things outside their own area of specialization? [...]
I keep going back to this fundamental idea of being able to respect other people, especially if you’re in a senior position. You can get a lot more done if people have a sense that you respect them, and that you listen to them. You would be surprised at the number of interviews I’ve done where the person never stops talking. If I’m interviewing someone and if they never stop talking, I will never hire them, no matter how qualified they are. If you cannot listen, you can’t be the site of welcoming, nurturing, facilitating new ideas, innovation, creativity, because it really is ultimately only about you. So I look for people who listen well and can respect the ideas of others.
I look for people who are strong enough to be critical of things that are not very good. And more than being critical of things that are not very good, they have to have the capacity to tell people that. [...].
(From the NYT)
Friday, December 9, 2011
ITCS: some innovations
Today I looked at the ITCS web page, to register (even though I'll miss the first day), and, lo and behold! They're going to have some innovations!
People looking for jobs (graduating students and postdocs completing their postdoc this year) will each be given a few minutes to present their work. They are asked to send email before December 15. I am looking forward to that event, to get a quick snapshot of our field, as represented by the most recent PhD theses.
Each session will be preceded by a short "chair rant". I am very curious to see how that will turn out. It all depends on the personality of the session chairs, of course. Will it be a waste of time, will it provide some actual insights into the papers, will it be fun? It is intriguing.
There's also going to be improvisational theater. I am more dubious about that and will probably simply sit in the back; it doesn't seem like my thing. But de gustibus non est disputandum.
Thursday, December 8, 2011
Things about my work that I enjoy
I enjoy interacting with students.
I enjoy helping someone learn something new.
I enjoy learning something new.
I enjoy discovering something new.
I enjoy understanding something new.
I enjoy teaching a well-prepared lecture.
I enjoy seeing students become researchers.
I enjoy good writing.
I enjoy writing well (but it takes lots of effort).
I enjoy having an idea.
I enjoy discussing research.
I enjoy brainstorming.
I enjoy going to Israel.
I enjoy hearing Hebrew.
I enjoy people laughing at my jokes.
I enjoy compliments.
I enjoy the feeling of work well done (but that's rare).
I enjoy having caught up with email (but that's rare).
I enjoy exploring Dagstuhl.
I enjoy sunsets in Bertinoro.
I enjoy Ivy League architecture.
I enjoy wood-paneled classrooms.
I enjoy the smell of white board pens.
I enjoy hiking with colleagues.
I enjoy meeting academics who are not scientists.
I enjoy free coffee.
I enjoy the view from my office.
I enjoy the money and status.
I enjoy the freedom to define my work.
I enjoy the freedom to, occasionally, wake up without an alarm and go to work when I feel like it.
I enjoy being part of a field that is changing the world.
I enjoy having a paper accepted.
Wednesday, December 7, 2011
Things about my work that I don't enjoy
I don't enjoy faculty meetings.
I don't enjoy giving final grades.
I don't enjoy failing students.
I don't enjoy reporting suspected cases of cheating.
I don't enjoy setting up my laptop for class.
I don't enjoy refereeing.
I don't enjoy being an editor.
I don't enjoy writing recommendation letters.
I don't enjoy revising papers.
I don't enjoy flying.
I don't enjoy chain hotels.
I don't enjoy committees.
I don't enjoy being hopelessly stuck on a research problem.
I don't enjoy failure.
I don't enjoy being stuck on some notation as I read a paper.
I don't enjoy being behind in my work.
I don't enjoy not doing what I'm supposed to be doing.
I don't enjoy working in the evening.
I don't enjoy most seminars.
I don't enjoy email.
I don't enjoy criticism.
I don't enjoy the lack of tangible impact of my work.
I don't enjoy working indoors.
I don't enjoy writing introductions.
I don't enjoy having a paper rejected.
I don't enjoy deadlines.
I don't enjoy having my research community scattered around the world.
I don't enjoy decisions by consensus.
Tuesday, December 6, 2011
Only one lecture left!
One more lecture, and I am free!
Much as I enjoy teaching, the end of the semester is always a relief.
Much as I enjoy hiking, arriving at the refuge is always a relief.
Much as I enjoy family reunions, I am always relieved when they end.
Everything that takes effort and concentration is always, to some degree, a relief to get over with, even if it is gratifying.
Much as I enjoy living, when death comes near, will I be relieved to welcome it?
Monday, December 5, 2011
A good deed
The other day, I was starting my car just as the light turned green, when a car honked behind me. Surprised, I thought: "this driver is really impatient!" Then he passed me on the right, and I got annoyed.
So, at the next red light, instead of sitting patiently behind him, I changed lanes and, with my car lined up next to his, I waited on high alert. When the light turned green, I took advantage of my stick shift to start the car sharply. Vroom!
The other driver reacted by speeding up, passed me and positioned himself in front of me, his car squarely straddling the center line in such a way that I would not be able to pass him on either side. I had to laugh!
By giving that stranger a chance to compete and win, I probably made his day. He must have felt really good about himself, all thanks to me!
And that was my good deed for the day.
Sunday, December 4, 2011
Yao and Preparata on US higher education
Andy Yao and Franco Preparata sat on a panel at Brown about globalization. Andy Yao said that the US system of higher education was the best in the world, for the following reasons.
- Indirect evidence from the results. US institutions of higher learning train the best scientists and the best engineers, he said. I would say that it is an indicator of the quality of the PhD and Masters programs, but not necessarily of the undergraduate programs.
- Merit-based decisions for admitting students, hiring faculty, and promotion to tenure. Andy Yao specifically praised the tenure-track system, which, by putting new hires on probation for 6 years, prevents hiring mistakes from having life-long consequences. As someone who has greatly benefited from the French tenured-upon-hiring system, as a friend of women who have avoided the stressful tenure-track phase by spending the early part of their career in industry labs, and as a witness of women who wait until tenure before having children (at 35, the age when fertility starts to decrease significantly and the risk of genetic defects in babies rises), I see the great personal cost of the tenure-track system and am reluctant to consider its advantages.
- Franco Preparata added the 9-month salary system: US universities pay faculty for only 9 months of work per year. That gives professors both the freedom and the incentive to go elsewhere for the remaining three months: to visit another university or an industry lab. The arrangement promotes the exchange of ideas and cross-fertilization between different places.
Saturday, December 3, 2011
Andy Yao: the atom whisperer
Andy Yao visited Brown this week.
He gave a talk about quantum computing, ending with a slide on "What makes great science". His answer was poetic; he apologized for being a romantic, then said: we are learning the language of atoms. Once we understand their language, we will whisper a computational problem into their ear (?), and watch them do an elegant dance to perform the computation. After a few minutes the answer will appear.
Friday, December 2, 2011
Quarter or semester?
What's the difference in workload between the semester and the quarter systems?
Universities that follow a semester-based calendar usually have 12 to 15 weeks of lecturing during the semester, for a total of 24 to 30 weeks per year; let's say 27 weeks, just for definiteness. Universities that follow a quarter-based calendar have about 10 weeks of lecturing in the quarter, I think, for a total of about 30 weeks per year. That's 3 more weeks of teaching.
In addition, a good part of the work teaching a course is in the initial setup and in the final wrapping-up. The week before the semester or quarter starts is usually occupied by various routine teaching-related tasks, and, after the course ends, it takes about two weeks before the final exam is done, graded, and final course grades are turned in: that's an additional 3 weeks per semester/quarter, so those who are using the quarter system have an extra 3 weeks of that stuff during the year.
3 more weeks of teaching and 3 more weeks of "stuff": that's an additional 6 weeks spent on teaching-related activities - 39 weeks per year instead of 33. For professors, going from the semester to the quarter system is an 18 percent increase in teaching load.
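The back-of-the-envelope arithmetic, spelled out in a few lines of Python:

```python
lecturing = {"semester": 2 * 13.5, "quarter": 3 * 10}  # 27 vs 30 weeks of lectures
overhead  = {"semester": 2 * 3,    "quarter": 3 * 3}   # 3 weeks of setup/wrap-up per term

totals = {s: lecturing[s] + overhead[s] for s in lecturing}
print(totals)  # {'semester': 33.0, 'quarter': 39.0}
print(f"increase: {(totals['quarter'] - totals['semester']) / totals['semester']:.0%}")
# increase: 18%
```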
Thursday, December 1, 2011
The great pendulum
Professors don't think in terms of calendar years but of academic years. The school year is partitioned into four great periods, corresponding to the two semesters and the long breaks between semesters.
During the breaks, we are primarily focused on our research.
During the semester, many of us are primarily focused on our teaching. To prevent excesses in that regard, it may be good to plan a trip to a conference some time during the semester: that is one purpose served by conferences that neither Arxiv nor journals can substitute for.
So, every three months or so, the great pendulum of our work-life swings from teaching to research and from research back to teaching.
Wednesday, November 30, 2011
Conferences and Arxiv. Utilitarian values.
Yesterday I took a look at a few abstracts of papers accepted at STACS 2012. I must have been in a contrarian mood, because two remarks came to mind, both going against my usual sense of what's desirable.
First, I was struck by the existence of prior publications on Arxiv. It is no longer exciting to look at lists of papers accepted at a conference: that's not where we learn about new results anymore; we learn about them on Arxiv. I knew that, but this brought it home for me: conferences no longer serve the purpose of quickly communicating the latest research results. That was a major service they were providing, perhaps the main one, and that is no longer the case.
Then, why bother looking at such a list at all? Because, in areas outside my own small subarea, I don't follow and can't judge what's appearing on Arxiv, so the conference committees provide a useful filter. Why do I care what happens outside my own small subarea? Truly, I do not care very much, but if a big result comes out, I would like to hear about it. Why? So that I don't sound stupid at parties when people ask me about it; and because of some guilty feeling that I ought to be broader, even if I don't really feel like it. In addition, I am always interested if some research comes out that might provide good teaching material. Finally, there is a little bit of curiosity: a good talk given by a good speaker on a good topic can be enjoyable even if it's not my area. (Then, it does not even have to be Theory, and in fact, not even Computer Science.)
Those reasons suggest that I do not need to look at lists of accepted papers other than STOC and FOCS, and that I would benefit more from those lists if they were even more selective. I do not want STOC and FOCS to be more selective, but I personally might appreciate it if they became two-tier conferences.
Second, I have recently criticized Oded Goldreich's latest essay on Luca Trevisan's blog, but looking at the abstracts of papers accepted at STACS, I could not but recognize the turns of phrase that he lamented. Consider, for example, the paper I mentioned yesterday. Here is the full abstract: "The Travelling Salesman Problem is one the most fundamental and most studied problems in approximation algorithms. For more than 30 years, the best algorithm known for general metrics has been Christofides's algorithm with approximation factor of 3/2, even though the so-called Held-Karp LP relaxation of the problem is conjectured to have the integrality gap of only 4/3. Very recently, significant progress has been made for the important special case of graphic metrics, first by Oveis Gharan et al., and then by Momke and Svensson. In this paper, we provide an improved analysis for the approach introduced by Momke and Svensson yielding a bound of 13/9 on the approximation factor, as well as a bound of 19/12+epsilon for any epsilon>0 for a more general Travelling Salesman Path Problem in graphic metrics."
Before I read his essay, I would not have noticed anything special about that abstract. But now, I can see the choice of arguments. Except for the first sentence, everything else reads like a list of Olympic records - an empty race for ever-better approximation factors, with no concern for the underlying structure. The approximation factor has become, not just a (key) indication of the understanding of the problem, but a goal in itself. As I read that, I was a little bit frustrated: I wanted a hint of how the author got that improvement to an important result! What did he learn about the problem that Moemke and Svensson did not know?
Then I remembered Goldreich's essay.
Strike one for instrumental values.
Update: Just to make it clearer. I like the result, I think it's worthwhile, I am interested, I would have been happy to do that research and be an author of that paper. My only quibble is with the style of the abstract.
Tuesday, November 29, 2011
STACS 2012 list of accepted papers
The STACS 2012 accepted papers are out. STACS is in Paris and starts on February 29.
The most noteworthy article for my taste, at first glance, is the 13/9-approximation for Graphic TSP by Marcin Mucha, available here. (In some sense this is old news: the paper will be presented at STACS, but was published on Arxiv in August.) It improves the analysis of the Moemke-Svensson paper from last spring (presented last month at FOCS; published on Arxiv in April).
Here's a quote: "We present an improved analysis of the cost of the circulation used in the construction of the TSP tour/path. [...] The circulation used [by Moemke and Svensson] consists of two parts: the 'core' part based on an extreme optimal solution to the Held-Karp relaxation of TSP, and the 'correction' part that adds enough flow to the core part to make it feasible. We improve bounds on the costs of both parts; in particular we show that the second part is in a sense free. As for the first part, similarly to the original proof of Moemke and Svensson, our proof exploits its knapsack-like structure. However, we use the 2-dimensional knapsack problem in our analysis, instead of the standard knapsack problem."
Monday, November 28, 2011
Who is in the computer science building at 11pm?
Today I came by the computer science building at Brown around 11pm. There were a dozen undergraduate students in the atrium, taking a short break from work or eating a late dinner, and each of the computer labs had about 10 undergraduate students working.
That's a lot of students.
They work there. They study there. They debug there. They go to class there. They eat there. Sometimes they take a nap on a couch there. That's the computer science subculture at its most extreme!
Maybe one of the classrooms should be converted into a dorm with bunk beds, Appalachian-hut style. That would save the students some money.
Sunday, November 27, 2011
The Khan academy on the foundations of computational complexity
The other day I watched some Khan academy videos about subtraction and about multiplication.
The ones about subtraction said something along the following lines:
"How do you compute 5-3? Here is one way. Say that you have 5 lemons. Here I am drawing 5 lemons. Minus 3 means that you remove 3 lemons. Here I cross them out. How many are left? Let me count: 1,2. There are two lemons left, so 5-3=2." After presenting some other methods based on the number line, he then moves on to the next step: "How do you compute 67 - 34? You could do the same method as before: draw 67 lemons, cross out 34 lemons, and count how many lemons are left. That would work. It would give you the correct answer, but it would take a really long time. So, now I am going to give you a way to do it faster." He then moves on to explaining how to substract the ones digits and the tens digits separately.
The one about multiplication went along the following lines:
"How do you compute 4*3? Here is one way. 3 is 3 cherries. Let me draw 3 cherries on a row. I do this 4 times, so I draw 4 rows that each have 3 cherries. Then I can count: 3+3+3+3 equals 12, so 4*3=12." After presenting equivalent methods, he then moves on: "But what happens if your numbers are really big? For example, if you have 100 lemons, drawing 100 lemons on a row would take a really long time. And if you wanted to compute 100*100 with this method, you'd have to draw 100 rows that each have 100 lemons, and then add them all up: that would take forever. Here, I am going to show you a way to do it fast, even with really big numbers."
One might say: what's the big deal? Of course, when I teach the multiplication algorithm in class, I remark that doing the computation in unary is inefficient, but I always thought: "Why bother mentioning that? Who in their right mind would work in unary?" It hadn't crossed my mind before that there was a natural reason to bring that up, and so my remark goes by unnoticed.
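To make the contrast concrete, here is a toy illustration of my own (not from the videos): the unary method touches a*b tokens, while the school algorithm does work only per digit.

def unary_multiply(a, b):
    # 4*3 as "4 rows of 3 cherries": drawing and counting one token at a time.
    count = 0
    for _ in range(a):
        for _ in range(b):
            count += 1  # one more cherry
    return count

def school_multiply(a, b):
    # The positional algorithm: one partial product per digit of b.
    result, shift = 0, 0
    for digit in map(int, reversed(str(b))):
        result += digit * a * 10 ** shift
        shift += 1
    return result

# 100*100: the unary method counts 10,000 tokens; the school method
# handles just three digits.
assert unary_multiply(100, 100) == school_multiply(100, 100) == 10000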
The narrative in those videos is amazing. I am so impressed with this man! He is laying the foundations of asymptotic analysis of algorithms, explained to first graders.
Friday, November 25, 2011
Teaching deterministic linear time median
When I teach Algorithms, one challenging lecture is the one where I present the deterministic linear-time algorithm for median-finding. The big challenge is: How can we present the algorithmic design so that it is natural? I had an idea: I could simply watch online videos and see how others do it! So, I found and watched videos of lectures at Stanford (by Tim Roughgarden) and at MIT (by Erik Demaine) on that topic. For a teacher, it is fascinating to watch how others go about teaching the same material as you!
I have picked up a couple of ideas from Erik's lecture.
First, the runtime analysis involves some quantities with floors and ceilings. I always sweep the floors and ceilings under the carpet (with a vacuum cleaner). Erik, on the other hand, gave a short, relatively polished analysis that managed to take them into account rigorously without making a mess (well, except that he assumes without saying it that T(n) is a monotone function of n, but none of the students noticed that gap in the proof). I may try to copy his approach when I get to that point.
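For concreteness, here is the shape the recurrence takes when the floors and ceilings are kept (the standard median-of-five formulation found in textbooks; Erik's constants may differ slightly):

T(n) \le T(\lceil n/5 \rceil) + T(\lfloor 7n/10 \rfloor + 6) + c\,n

One then proves T(n) \le c'\,n by induction, using \lceil n/5 \rceil + \lfloor 7n/10 \rfloor + 6 \le 9n/10 + 7 < n for n large enough. The step that silently uses monotonicity is bounding the cost of a recursive call: its true input size is only known to be at most \lfloor 7n/10 \rfloor + 6, so one needs T at the true size to be at most T at that bound.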
Second, like me, in the design phase Erik highlights the use of transitivity of ordering: if a < b and b < c then a < c. I realized while watching him that I could highlight that already when I teach Quicksort: why does Quicksort give a correct output? Because of transitivity: if a < pivot and pivot < b, then a < b. That explains why we are doing the correct thing when in Quicksort we commit to placing a to the left of b. For Quicksort, that part of correctness is obvious and usually goes without saying. For linear-time medians, it comes up much less naturally and is actually a "creative" step. But if I have taken a minute to bring it out while teaching Quicksort, then it will seem more natural when I bring it out while teaching linear-time median.
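Here is how I might annotate a bare-bones Quicksort to bring that step out (a sketch of my own, not from either video):

def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = xs[0]
    left = [a for a in xs[1:] if a < pivot]    # a < pivot
    right = [b for b in xs[1:] if b >= pivot]  # pivot <= b
    # Transitivity justifies committing now: for any a in left and b in
    # right, a < pivot <= b implies a < b, so a can safely be placed to
    # the left of b without ever comparing a and b directly.
    return quicksort(left) + [pivot] + quicksort(right)

assert quicksort([3, 1, 4, 1, 5]) == [1, 1, 3, 4, 5]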
So, watching those videos was mildly useful. Unfortunately I did not find what I was looking for: the challenge of presenting the algorithm in a natural manner is still open, as far as I know.
Thursday, November 24, 2011
The Khan academy on insertion sort
The Khan academy puts 8-minute YouTube video lectures (and a few exercises) on the internet to teach all sorts of topics in an online fashion. I just realized that there are some lectures on computer science, and I have watched a few with great interest. I was mesmerized by the lectures on insertion sort - I taught that in class just a few weeks ago. My thinking was: could I borrow some ideas from that renowned teaching style?
The instructor is careful, thorough, and incredibly patient. He follows a well-trodden path to communicate, but does not announce signposts explicitly as a general paradigm for problem-solving.
Signpost 1. Define the problem: he does that in a very concrete way, using an example. For sorting, there is only one possible ambiguity in the problem definition: what about duplicates? He deals with that without explicitly drawing the student's attention to it, just by checking his program with an example that has duplicates.
Although I would also use an example, I am more explicit and state the problem definition in terms of a pair (input, output). Why? Because I don't want students to only learn about sorting: I also want to inculcate the notion of computational problem, which they will see later in many more advanced Theory courses.
Signpost 2. Explain the algorithm by executing it slowly on an example.
I would also do that if it were the beginning of the semester, but in addition to what he wrote on his electronic blackboard, I would write down comparisons as I perform them: instead of just saying "we compare 3 to 7", I would write down "3<7" on the side. Why? Because I will at some later point want to analyze the runtime, and for that, I might want to focus on the number of comparisons: highlighting the comparisons is a good preliminary perspective that will come in handy later.
One minor thing I really did not like was that he kept calling elements "the zeroth element" for the first element, "the first element" for the second element, and so on. I am resolutely opposed to molding the English language after conventions used in programming languages.
One major thing is that the instructor explains how the algorithm works, but not how one would come up with that algorithm in the first place. In his place, I might try to ask a leading question to get my unknown audience to come up with the algorithm. Why? Because I try to teach, not so much specific algorithms, but how to design algorithms. For example: "One way to go about it could be called "sort as you go": scan the entire list, and, at every step of the scan, try to make sure that everything you have scanned so far is sorted." Then I can show intermediate steps in the execution - the desired state of the list after processing the first item, the first two items, the first three items, etc. And only then would I discuss how to fill in by processing each additional item scanned. That's a modular approach, and hopefully after I have stated the "sort as you go" idea, many students can figure out the rest of the algorithm. It would, maybe, stimulate the students' creativity a little bit more.
Signpost 3. Write code for insertion sort in Python.
Before that, in his place I would first have written some high-level pseudo-code. Why? As a preparation, I suppose, for more difficult problems, settings where it is imperative to lay out the algorithm before starting to code.
The way in which he explained his code was slow and very intuitive. However, I noticed that he wrote and explained the code in a very linear fashion, line by line. That's not the way we do it: we always go for a modular presentation, starting with incomplete code that contains some "..." in places, and filling in the details during a second pass. For insertion sort, the difference is not huge, but that bothered me slightly.
He then ran it on an example in the end, as a way to implicitly suggest the need for testing. I would have been more explicit about that need.
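For instance, the modular route in Python might posit a helper for the subproblem, leave its body as "..." on a first pass, and fill it in on a second pass; the finished version, with an explicit test, might look like this (my sketch, not the video's code):

def insert(sorted_prefix, x):
    # Subproblem: insert x into an already sorted list.
    i = len(sorted_prefix)
    while i > 0 and sorted_prefix[i - 1] > x:
        i -= 1
    return sorted_prefix[:i] + [x] + sorted_prefix[i:]

def insertion_sort(xs):
    # "Sort as you go": after processing k items, the prefix is sorted.
    result = []
    for x in xs:
        result = insert(result, x)
    return result

# Explicit test, including duplicates (the one ambiguity in the problem).
assert insertion_sort([3, 1, 2, 2]) == [1, 2, 2, 3]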
Signpost 4. Execute the code step-by-step on a small example.
I skipped much of that presentation, but he did exactly what I do in class after I have presented an algorithm, when I am not 100 percent sure that the students understood it. Fortunately it's automated in our Scheme (Racket) environment: one of the options when running a procedure is to do it not all at once but step by step, with highlights showing which parts of the code are being executed and what the current values of interest are.
Signpost 5. Do some small code optimization.
I really liked that short explanation. He replaced "While A: if B: ... else break" by "While (A and not B) ...". That was quite natural, and his last few words indicated that it was a general pattern. I had never thought it through quite so explicitly, so I was happy to learn that.
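Concretely, on a small helper the transformation looks like this (my reconstruction of the pattern, not his exact code):

def scan_with_break(lst, x):
    # Before: the exit test hides inside the loop body.
    i = 0
    while i < len(lst):          # "While A"
        if lst[i] >= x:          # "if B: break"
            break
        i += 1
    return i

def scan_with_condition(lst, x):
    # After: fold the test into the loop condition: "while A and not B".
    i = 0
    while i < len(lst) and not lst[i] >= x:
        i += 1
    return i

assert scan_with_break([1, 3, 5], 4) == scan_with_condition([1, 3, 5], 4) == 2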
No video about runtime, but maybe that'll come some day.
My main reservation was that he taught one particular algorithm, but did not take that opportunity to teach how to design algorithms. One could present insertion sort as an example of problem-solving by reducing to an easier subproblem (namely, inserting a single element in an already sorted list). Making that explicit introduces the idea of reductions.
Overall, if it is a way to teach insertion sort for the sake of teaching insertion sort, it is great. If it is a way to introduce students to algorithmic design through the example of insertion sort as a stepping stone, it is less impressive. Obviously, he is targeting a lower level than my students, but I would still have liked him to embed some deeper level of understanding in his presentation (even if it's done in an implicit manner), like sowing seeds of knowledge, just to prepare students, so that, when they meet more advanced material, that material will seem natural to them. They would be ready without even knowing that they have been prepared for it!
In short: very pedagogical, very engaging, superbly clear, but there is a little bit of a lack of vision.
Wednesday, November 23, 2011
Which textbook to use when teaching Algorithms?
Which textbook should one use when teaching Algorithms?
Let me start with the winner. For the past couple of years I have been using the textbook written by Dasgupta, Papadimitriou and Vazirani. Here is what I think of it.
-Its first quality is that it is softcover and therefore relatively inexpensive. Some preliminary version is available for free on the authors' web pages.
-Its second quality is that it is relatively small and therefore light to carry around.
-Its third quality is that it is written in good English. The style is crisp, and the authors occasionally inject emotions into their descriptions, giving more texture to the book: "Now the time has come to meet the quest's most embarrassing and persistent failures."
-Its fourth quality is that for each problem, the textbook goes straight to the heart of the difficulty. It does not waste space on secondary notes, avoids redundancy, and gives the gist of an argument in just a few lines. The proofs are often beautiful because they are direct and minimal, with no unnecessary detours. They are a pleasure to read.
-Its fifth quality is that it is short enough that a student could aim to read it in its entirety.
However, the book is not perfect.
-Its first fault is that it contains mistakes. The authors obviously did not implement their algorithms. Special cases may be overlooked in the problem specifications. Base cases may be done in a hurry, or barely mentioned when they contain nothing of interest. Calculations can be slightly off where the constants are concerned. The authors are keen on communicating the central ideas, but there is a certain carelessness about details. In that sense, the presentation is not up to the standards that we expect from our own students.
-Its second fault is a lack of examples. For me, designing algorithms starts with working through examples, and a student who wishes to study the course material has to supplement the book with more examples. Here, the tradeoff between the size of the book and its content should come out in favor of examples, in my opinion.
-Its third fault is the choice of topic with which to start the book: the first chapter talks about arithmetic operations. That's natural because students learned algorithms for those at a young age, but it immediately raises questions that should be swept under the carpet until much later: when we talk about an input of size n, what is n? Starting with numbers creates problems that spoil the beginning of the semester: when we sort by comparisons, why do we assume that comparing two numbers can be done in constant time? What if the numbers are really large? Those concerns are valid (see the note after this list), but they should come up at the end of the semester, when the student is comfortable with runtime analysis and ready to revisit his assumptions. This textbook's approach naturally leads students to raise those questions early on, and at that stage they only create confusion.
-Its fourth fault is a lack of exercises. There are not enough exercises for students to practice what they are learning.
-Its fifth fault is the choice of topics covered, which does not correspond one hundred percent to my own taste; it is more a reflection of the authors' tastes; but that is natural for any book.
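The note promised above, on why "what is n?" bites so early (my summary, not the book's): for arithmetic problems, the honest input size of an integer N is its number of bits,

n = \lfloor \log_2 N \rfloor + 1,

so comparing or adding two n-bit numbers takes \Theta(n) bit operations, and school multiplication takes \Theta(n^2). The "comparisons take constant time" assumption in sorting is really the word-RAM convention that every input number fits in one machine word; stating that convention up front is what defuses the students' questions.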
Overall, I like this book very much. It is the one that comes closest to my taste. A cleaned up and slightly expanded version of that book would be ideal for me.
Tuesday, November 22, 2011
Workers' rights, students' training
Thanksgiving is the most respected holiday in the US. It is the one day on which every non-vital service is closed in the afternoon. It is the main family holiday in this country. For many people it is part of a 4-day weekend from Thursday to Sunday. The next day is traditionally devoted to shopping at heavily discounted prices. Because of the crowds in the stores, it is called Black Friday.
Some retail store employees have complained that for Black Friday, they are required to start a full day's work beginning at 11pm on Thursday evening. Indeed, stores open as early as 1am on Friday morning.
I have been shocked by the public reaction to the employees' complaints. A typical reaction is: "If they don't like it, they can quit; there will be plenty of people ready to take their job!" In other words, the market economy rules. It dictates what's right. At a time of high unemployment, workers have no rights. This is one of the situations in which, in spite of having lived here for many years, I still feel that there is something alien about this country.
Sometimes I wonder if the students' training is a little bit similar. They take on a perhaps excessive workload, then they become highly marketable and get good jobs: not just because they know a lot and are well trained, but also because they have demonstrated their compliance with the demands made of them, even when those demands are unreasonable. They'll be good software engineers, always willing to put in long hours without counting them, and the occasional all-nighter when a deadline is coming up. They are being trained for a way of life in which, in exchange for a good salary, they will basically belong to their employer.
Monday, November 21, 2011
How to make students fail
Here is a simple recipe for making students fail: create assignments densely packed throughout the quarter, where each assignment relies completely on the previous one, so that it cannot be started until the previous assignments are complete.
Then consider the case of a student who has a bit of trouble with the course and who can only manage to do 90% of a week's assignment in any given week.
On the first week, he will turn in 90% of the work.
On the second week, he will first need to complete the remaining 10% of the first assignment (for no credit), so he will only be able to turn in 80% of the work.
On the third week, he will first need to complete the remaining 20% of the second assignment (for no credit), so he will only be able to turn in 70% of the work.
And so on until the 10th week, when he will first need to complete the remaining 90% of the previous assignment (for no credit) and will then have no time left to work on the 10th assignment.
On average, for a 10-week quarter, his grade will be 45%, or a bit less if end-of-quarter assignments are weighed more heavily; that, in spite of having done 90% of the work overall.
Note that giving the option of turning in assignments a couple of days late does not help. The student is doomed.
Now, consider a student who only just manages to do the entire assignment in the time allocated, and imagine that after three or four weeks, there is one week when he is sick, or tired, or something, and only does 50% of the work.
Then, from that point on, each week he'll spend half the time catching up on the previous assignment and half the time on the current assignment, so all of his grades will be 50% from then on, and at the end of the quarter his average grade will be 65% or so.
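A tiny simulation confirms the arithmetic for both scenarios (a sketch; "capacity" is the fraction of one assignment the student can complete in a week):

def grades(capacity, weeks=10, sick_week=None, sick_capacity=0.5):
    out, backlog = [], 0.0
    for week in range(1, weeks + 1):
        cap = sick_capacity if week == sick_week else capacity
        cap -= backlog               # finish last week's remainder first, for no credit
        done = max(0.0, min(1.0, cap))
        out.append(100 * done)
        backlog = 1.0 - done         # what must be caught up next week
    return out

print(grades(0.9))                   # 90, 80, ..., 10, 0: average 45
print(grades(1.0, sick_week=4))      # 100, 100, 100, then 50 every week: average 65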
Saturday, November 19, 2011
Two goals of teaching
There are two goals accomplished by teaching university-level students.
One is to educate the people who will, in a few years, be professors in academia, or entrepreneurs in innovative businesses, or CEOs of large companies whose decisions have significant impact on society. The idea of shaping the knowledge of tomorrow's leaders was very present when I taught at Ecole Polytechnique in France.
The other one is to educate people in general, forming the minds of the new adult citizens who will be voting and holding jobs everywhere in society. That idea was very present when I taught at Universite Paris-Sud in France.
For the first goal, the focus of attention is the best students in the class. The instructor tries to constantly challenge them, stretching their mental agility and pushing them to learn as much, as deeply, and as fast as possible. The students who are in difficulty are not on the radar. For the second goal, the focus of attention is the majority. The instructor tries to reach as many students as possible and is focused on the fringe students who are struggling to learn the course material. The students who are effortlessly cruising through the course are mostly ignored and are sometimes even considered to have an unfair advantage. So, to some extent the two goals are in tension.
It's one of the difficulties of teaching at a large public university. For example, if you teach a class of 100 undergraduates at UC Berkeley: are you going to spend effort on the two or three students who are shining in class and who might, with encouragement, continue on to a PhD, or on the thirty students who seem to have overreached slightly by taking the course?
At a selective institution such as Ecole Polytechnique or Stanford, the problem is largely solved by the filtering that happens beforehand.
Teaching with both goals in mind is taken to its limit this semester at Stanford, with an online course being taught on introduction to AI. The course material is taught to the Stanford students, a selective group if there ever was one, but also, in an online manner (including automatic online grading of homeworks and exams), to the entire world, with no filtering whatsoever. How can that be?
There must be some self-selection, so that each of the unknown crowd of students taking the course online is, individually, highly motivated, smart, hard-working, and might, in different circumstances, have ended up at Stanford anyway. It's an idea very much aligned with the American Dream: offering an opportunity for all to better their education, regardless of social class or circumstances of birth. That makes the project an idealistic and, if even half-way successful, exciting endeavor.
Friday, November 18, 2011
How to solve a system of linear equations
What could be more boring than solving a system of linear equations, Ax=b? We've all learnt in school how to do it: Gaussian elimination; and we've practiced it until we got sick of it. End of story!
But there are people who are never happy with what they have. Gaussian elimination is too slow, they whine. Polynomial time is not good enough for them. They are afraid that it will fail on massive data, and want something that is fast "for real". Near-linear time is the goal these days.
They currently focus on a special case: "SDD solvers", meaning that the matrix A is symmetric and diagonally dominant: the absolute value on the diagonal is at least the sum of the absolute values of all off-diagonal entries on its row. In fact, let us simplify things further and assume that A is the "Laplacian matrix" of a graph - the matrix whose rows and columns are indexed by the vertices, where the diagonal entry is the vertex degree, and where the off-diagonal entry is -1 if there is an edge between the two vertices and 0 otherwise.
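In symbols (standard definitions, nothing specific to these papers), with D the diagonal matrix of degrees and A_G the adjacency matrix of the graph:

L = D - A_G, \qquad x^\top L x = \sum_{uv \in E} (x_u - x_v)^2 \ge 0 \quad \text{for all } x,

so a graph Laplacian is symmetric, positive semidefinite, and diagonally dominant with equality on every row.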
So, for that special case, in their anxious search for faster algorithms, people have started playing a game that is the opposite of what we traditionally do in approximation algorithms: instead of giving highest priority to the accuracy of the output and treating runtime as an afterthought, they sacrifice some accuracy in the hope of getting a faster algorithm. Indeed, if we don't insist on finding the exact solution x* of Ax=b, then we can try some iterative method that constructs a sequence x1, x2, x3, ..., where each xi is an improvement over the previous one, and pray that it quickly gets to a value xT close to x*.
But scientists don't trust the power of prayer. Instead, we analyze the runtime, which is basically the cost of each iteration times the number of iterations. Unfortunately, if the matrix A is dense (many non-zero entries), each iteration is costly; and if it is not well-conditioned, one needs a large number of iterations to get close to x*.
To speed up the algorithm, instead of solving the system Ax=b, let us solve the equivalent system
B^{-1}Ax=B^{-1}b
for a well-chosen matrix B. Namely, B should be sparse, so that each iteration is cheap, and B should be similar to A, so that few iterations suffice to converge. (What's the measure of similarity? It's the smallest k such that for every vector x, x^TBx lies between x^TAx and k times x^TAx.)
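Spelled out in quadratic forms (my notation), the similarity condition reads

x^\top A x \ \le\ x^\top B x \ \le\ k \cdot x^\top A x \quad \text{for all } x,

and for an iterative method such as preconditioned conjugate gradient, the number of iterations grows roughly like \sqrt{k} (times a logarithmic factor in the desired accuracy), which is why a constant k is the prize.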
Fortunately, there is an algorithm to find a matrix B that is very sparse - only a linear number of non-zero entries - and very well conditioned - k is constant. Unfortunately, that algorithm is slow, which defeats the whole purpose. So, how can one go about quickly finding a good matrix B?
Here's an algorithm for that subproblem.
1. First, find a low-stretch spanning tree T of G, the graph corresponding to A: it's a spanning tree such that the sum of all interpoint distances in T is not much more than the sum of all interpoint distances in G. That's been studied before in a sequence of papers, so it's just a matter of recycling old work.
2. Second, for every edge uv of G, define P(uv) as the distance from u to v in T. Let t be the sum of P(e) over all edges e of G.
3. Third, build a multigraph H by repeating t log(n) times:
pick an edge e with probability P(e)/t, and put t/P(e) copies of e in H.
4. The desired matrix B is (1/(t log(n))) times the Laplacian matrix of H.
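Here is a rough, self-contained sketch of steps 2-4 in Python (my own illustration: it assumes the low-stretch tree of step 1 is handed to us, and it returns the weighted edges of H, from which B's Laplacian can be assembled as above):

import math
import random
from collections import defaultdict, deque

def tree_path_length(tree_adj, u, v):
    # Number of edges on the unique u-v path in the tree, by BFS
    # (fine for a sketch; real implementations do this much faster).
    parent = {u: None}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            break
        for y in tree_adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    length = 0
    while v != u:
        v = parent[v]
        length += 1
    return length

def sparsify(n, edges, tree_edges, seed=0):
    random.seed(seed)
    tree_adj = defaultdict(list)
    for u, v in tree_edges:
        tree_adj[u].append(v)
        tree_adj[v].append(u)
    # Step 2: P(uv) = distance from u to v in T; t = sum of P(e) over e in G.
    P = {(u, v): tree_path_length(tree_adj, u, v) for (u, v) in edges}
    t = sum(P.values())
    # Step 3: repeat t*log(n) times: pick e with probability P(e)/t
    # and put t/P(e) copies of e in the multigraph H.
    reps = max(1, round(t * math.log(n)))
    edge_list = list(edges)
    weights = [P[e] / t for e in edge_list]
    copies = defaultdict(float)
    for _ in range(reps):
        e = random.choices(edge_list, weights=weights)[0]
        copies[e] += t / P[e]
    # Step 4: B = (1/(t*log(n))) * Laplacian(H); scaling the edge
    # multiplicities by 1/reps achieves exactly that.
    return {e: c / reps for e, c in copies.items()}

# Toy usage: a 4-cycle with one chord, spanning tree = the path 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
tree = [(0, 1), (1, 2), (2, 3)]
print(sparsify(4, edges, tree))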
And that is what I learned thanks to a seminar given at Brown by Richard Peng, who presented some more advanced material in the same area (work he has done with Ioannis Koutis and Gary Miller).
Why should we care about this line of research? Getting an improved polynomial-time algorithm for a special case of linear system solvers, is that really such a big deal? Why is there so much interest in those papers in the Theory community? Pick your answer:
(a) We should care, because Dan Spielman (or insert the name of your favorite researcher in that area) cares, and he has good taste. [Trust of authorities]
(b) We should care, because the algorithmic perspective on those problems is pretty new. It's the development of a new technique. [Algorithmic design]
(c) We should care, because this has been implemented and tested and it is claimed to be competitive with existing solvers, so it leads, maybe, to actual real-life programs that run faster. This could be a big success story of Theory! [Applications to Applications]
(d) We should care, because it has implications for Theory: it leads to faster algorithms for graph problems such as generating a random spanning tree or maximum flow. [Applications to Theory]
(e) We should care, because it's sunny out, I am in a good mood, and today I am inclined to see things as exciting in general. [Random]
(a') We should ignore this, because it has too much heavy math, it's not fun, and it's not my taste. [Trust of own taste]
(b') We should ignore this, because the core of it is just low-stretch trees, sampling, etc.: it's just putting together basic algorithmic tools that have nothing new about them, and the claims of novelty are pure marketing. [Disgruntled]
(c') We should ignore this, because the claims of applications are merely what people always say in their introductions, and those claims never mean anything. [Skepticism]
(d') We should ignore this, because it's not going to help understand the P versus NP problem, even in the long term. [Lack of applications to Theory]
(e') We should ignore this, because I had too much to eat and now I have a stomachache. [Random]
Thursday, November 17, 2011
Confusing maps
Yesterday I went to the MFA in Boston, and found my way around the museum with the map. The architecture is a little bit complicated: for example, because of some atriums (atria?) that span several levels, sometimes the best way to go from one room to another room on the same floor is to go down several levels, across the atrium, and back up several levels. That makes the map design challenging. (In particular, it is not clear when what appears to be a "room" on the map actually corresponds to empty air floating above a floor at a lower level.)
As discussed in a previous thread, to help the visitors orient themselves quickly, they partition each floor into sections (clusters of rooms) that have different colors, depending on the theme of the art displayed. But to my dismay, the correspondence is not a bijection: although the colors correspond exactly to types of art, art of the same type may be displayed on several floors.
As a result, before I started my visit, I had to spend a good while poring over the description of the rooms on one page and the map on another page. Very inefficient! And it was not until after I had left that I noticed that they had highlighted a few particular pieces of art as particularly noteworthy: I was so busy trying to understand their visualization system that I did not even notice those marks.
All in all, a less than optimal example of data visualization.
Wednesday, November 16, 2011
Odds
The other day a friend was asking: "which carrot should I take, this one or that one?", showing us two carrots of different shapes.
The other person present and I simultaneously answered: "This one!", pointing to the same carrot. So my friend picked that carrot, commenting: "Wow, you guys made exactly the same choice. Amazing!", to which I replied ironically: "Yes, amazing! What are the odds?"
Compared to most people, it's hard to get me excited about chance events. In particular a 1-in-2 chance event doesn't get me to even raise an eyebrow. Here's my scale:
1 in 10: raised an eyebrow
1 in 100: quite surprised
1 in 1000: skeptical, then amazed
1 in 10000: skeptical, then still skeptical (in spite of evidence).
1 in 100000: impossible. Not even worth refuting.
Tuesday, November 15, 2011
Chance encounter: what are the odds?
Through Amazon, I bought a used copy of the book "Fear and Trembling" by Kierkegaard, in an edition that had been recommended to me by a friend (translator of that edition: Walter Lowrie). When we met, her eyes fell on the name on the first page of my copy of the book: "Lois!" she exclaimed. "Where do you know her from?" - but I had no clue who that Lois person was. I said: "I don't know. It's just a name. Some previous owner of my book, obviously." As it turned out, she was my friend's fellow student in college many years ago. They were both in a course in which they studied Kierkegaard.
My friend was amazed at the coincidence. But how amazing was it really? What are the odds?
One might imagine that reading Kierkegaard is so rare that all Kierkegaard readers are at distance 2 from one another, in a small-world phenomenon. But that's not true. A few weeks ago a random colleague of mine came to my house, saw that book and started discussing it: that suggests an abundance of Kierkegaard readers.
Imagine that there are 6000 copies of that edition floating around the US. About 30 students took that course with my friend. The odds that I would end up with that book would still be 30/6000 = 1/200. That seems about right: low enough to be surprised by the coincidence, but not so low as to be in disbelief. Now the only information I need, to check my calculation, is the number of print copies of that book.
And that's where search engines failed me, to my surprise. How many copies of a book have been printed or sold? That information is apparently not freely accessible. So I am left to my guesses about how amazed I should be.
Monday, November 14, 2011
Greetings
Greetings are one situation where cultural differences are most visible. Once I introduced myself to an Asian student, and he surprised me by bowing to me as we were shaking hands. Bowing! To me!! I was... rather flattered.
To learn proper greetings, I take my cue from my environment, learning the proper protocol by following the example of others. I try to do it right, but it's not always easy. Once in the jungle in the Golden Triangle of Thailand, some children in a ditch saw me walking by and did the namaste gesture, accompanied by a slight bow. Wanting to fit in, I reciprocated the gesture and slight bow. But, to my confusion, the kids then burst out laughing!
Having learned from that incident, when that Asian student greeted me with a bow I was careful not to bow back.
Now, next month, Andy Yao is coming to give a talk at Brown. Should I bow when I greet him? Would that be nice, or would it be misplaced? I have no idea.
Sunday, November 13, 2011
Journals, Conferences, Arxiv: my solution.
There has recently been a flurry of posts in the blog world discussing the publication system.
Journals: articles take a long time to publish. Reviewers take forever to review. They don't do a good job. They're picky, and they lack good taste. Not that it matters, because people don't read journal articles anyway. Selective journals are biased for or against one fashionable area or another. Less selective journals drown us with too many articles to choose from. They're also expensive. It's a rip-off.
Conferences: articles are half-baked. Reviewers don't read past the introduction. Committees make choices based on author reputation and fashionable keywords. The system is rigged. They accept too few papers, so researchers can't have access to that means of making their work known. They accept too many papers, so researchers drown in the excessive quantity of results output by the community. The quality is low. They lack depth, accepting flashy results with big words and no substance. They lack breadth, accepting results in areas that are currently the fad and ignoring everything else. They're also expensive. It's a rip-off.
Arxiv: an arXiv paper is not a publication. It's not properly vetted, it's not refereed, we don't know whether it'll still be around in one hundred years, there are way too many reports posted there, it doesn't cover everything since some researchers still resist putting their results on the arXiv, there is no guarantee that the results are correct or interesting, it's a way for people to claim a result with a poorly written manuscript, and it impedes research by preventing more serious researchers from doing a more thorough job.
So many problems! So many complaints! What a dark age we live in! Now, it is my turn to put in my two cents.
My solution: I propose a simple and effective solution, namely: do not publish your work.
What could be simpler to implement? There are many advantages to not publishing. Just consider: if you do not publish, you do not waste other people's valuable time by forcing them to referee your papers, and you do not add to the information overload by adding your own stuff to the arXiv. You are performing a public service by not publishing. It also frees your own time by giving you the liberty to turn down refereeing requests, so you'll have more time to do better research.
But then, one might ask: how are your results going to get disseminated? First, most results don't matter, so no one will notice the difference. Second, if a result does matter, the odds are that several other people will also find it at the same time, so the result emerges naturally from the research community, without you having to go through the pain of publishing. Third, if by some extraordinary chance a result matters and you're the only one who truly understands it, then, as you give talks about it, it will catch on. Eventually someone else will be interested enough to think and write about it - here I'm thinking, for example, of part of Thurston's paper entitled "Conway's tiling groups".
That leads me to the second part of my proposed solution: publish other people's work. By that, of course, I do not mean stealing other people's ideas or results. Rather, what I mean is: when you hear about something that gets you excited, something that you want to understand badly enough that you are willing to spend many hours studying it, something that makes you so happy that you want to share the joy by communicating what you learned to others around you, then it is time to write an article about it.
As to your own research, leave it for others to write about if anyone cares to do it.
Saturday, November 12, 2011
What is Theoretical Computer Science?
Some people study Number Theory. Others study the Geometry of Manifolds. In TCS, we study the Power of Computation. We prove theorems, just like in any other pure science. What can be computed, how well, how fast, with what techniques, and what are the limitations? In the same way that number theorists were long fixated on Fermat's conjecture, our eyes are fixed on the distant P versus NP problem. Instances of applied problems are a chance for us to test our understanding of the hidden structure. "Reality check" means an example against which to confront our ideas on computation. Whenever a problem is hard in theory but easy in practice, or vice-versa, it's an invitation to revisit our assumptions and refine our models.
The real world is useful, as a tool towards developing a better theory.
Friday, November 11, 2011
Automatic spelling correctors
My MacBook Air has an automatic spelling corrector. I haven't yet looked at how to turn it off, but what I can say is, how incredibly annoying!
At first I thought those typos were my fault. I started worrying about my mental state, watching more carefully as I typed, and realized what was going on. Now I keep wanting to proof-read my texts, just to check one more time that they didn't get modified behind my back - a mild case of paranoia. A minor typo on my part, "fixed" by the automatic corrector, can lead to something unrecognizable. For example, the word "poem", which I accidentally typed as "poen" - a typo for which the reader could easily substitute what I originally intended - found itself transformed into "open", and the reader will now be scratching their head (at least it didn't become "porn"!).
How anyone would be willing to suffer such intrusion on our writing, I can't imagine. I will have to poke around to find how to fix this intolerable feature of my new laptop.
Wednesday, November 9, 2011
On writing recommendation letters
We are entering the season of recommendation letters. If the applicant makes it to the final stage, the letters are scrutinized and every word said or unsaid gets interpreted. How does one go about writing a recommendation letter? I am not very good at it, but here is my usual template:
Initial draft
In the first sentence, I say where I know the person from and how long I have known them. Then I summarize our joint work, if there is any.
If it is about a student who took my course, in the second paragraph I recall his or her academic performance, highlighting if there was anything special, any interesting remark that came up in class for example.
If it is about someone with whom I did research, in the second and third paragraphs I talk about the technical content of our work, giving my own assessment of how deep or interesting the results are, and mentioning any particular role played by the collaborator. I make sure to insert a reservation of some kind, because no one is perfect, and a letter is not believable if it does not contain a negative. The compliments are more compelling if there is something to balance them. If necessary, I add a comparison to specific people who have applied for similar positions.
If it is an application for a position that has some teaching, I add a sentence or two about teaching.
Then I insert a few words about the person's personality.
In my final sentence, I give my overall opinion, which I try to make as multidimensional as I can, so that each person comes out best in his or her own dimension and will be sought after by the places that are looking for exactly that kind of person.
Polishing
After that initial draft, depending on how much time I have, I polish the letter: I reconsider the choice of words in the final sentence. I look up some research papers or documents to refresh my mind on the most interesting features of the work, and I develop the technical paragraphs a little bit. I check the adjectives that carry an evaluation, and look for errors of emphasis. I remove expressions such as "it seems that", "I believe that", etc. Writing "this work is deep" is much more forceful than "I believe that this work is deep". Obviously, if I write it, it's because I believe it, so why add "I believe"? Pointless. I look at the relative balance between the various parts. For example, it's not good to be too effusive about somebody's personality, because then in comparison it makes the rest of the letter look bad. It's better to put the reservation, whatever it may be, in the middle of the technical evaluation rather than first or last. I may ask a colleague to take a look and tell me the impression conveyed by my letter. Sometimes it is quite different from what I had intended! Sometimes I may add a comment about the place where the applicant is applying, and the suitability of the place for the person or vice versa (but that has to be tailored to each place, so mostly I do not bother). The one thing I never do is polish the style of the English to remove ambiguous antecedents, to correct mismatches in number, or to fix faulty co-references.
Good letters
The most important thing about writing a good recommendation letter has nothing to do with the applicant: it is that I have to be in a good mood. If I am in a good mood, good words will come to my mind, upbeat expressions, accurate but positive turns of phrase.
The next most important thing also has nothing to do with the applicant: it is that good letters take time to write. (At least an hour, but easily half a day for more senior applicants.) Usually, the first time I write a letter for someone, I am pressed for time and it is not that great; by the second or third iteration I have had some time to polish it, and it is a much better description of the person's strengths.
Spanning trees with few hops
Minimum spanning tree problems often become NP-hard when one introduces side constraints. One popular such constraint is to limit the depth of the tree - the maximum number of "hops" when traversing a path from the root to a leaf of the tree. That's the k-hop minimum spanning tree problem, supposedly motivated by network applications, because delays along a path are roughly proportional to the number of hops.
In this problem, unlike the minimum spanning tree problem, the input edge weights do not necessarily define a metric. Indeed, for the minimum spanning tree problem, we usually start by saying: "Without loss of generality we can assume that the triangle inequality holds; otherwise, replace costly edges by shortest paths." But in the k-hop minimum spanning tree problem, such replacements may be invalid because they increase the number of hops.
When the input weights do not define a metric, known algorithms are quite poor. If there exists a tree with k hops and weight w, then one can produce a tree with O(k log n) hops and weight O(w log n), but doing better is impossible.
However, the situation is much brighter with random inputs drawn from a nice distribution. In a 2008 arXiv paper by Angel, Flaxman and Wilson, one assumes that the input is a complete graph and that the edges are drawn independently from the same distribution, for example the uniform U[0,1] distribution, or an exponential, or some other reasonable distribution (i.e. with similar density function near zero). Then there is a simple algorithm to produce a near-optimal tree.
Here is the algorithm.
First, assume that the bound k on the number of hops is less than log log n. Then the algorithm guesses x1, x2, ..., xk, where xi is the number of nodes that are i hops away from the root. Then it performs a greedy Prim-like algorithm, going level by level, where, to build level i, you pick the xi nodes that can most cheaply be connected to the part of the tree already built. (The "guess" relies on probabilistic analysis to compute the optimal sequence xi; a code sketch of this case follows below.)
Second, assume that the bound k on the number of hops is more than log log n. Then the algorithm builds an (unconstrained) minimum spanning tree, greedily breaks it up into small-depth clusters, then reconnects the clusters using a straightforward extension of the first algorithm (where the goal is now not to connect the entire graph but to create a small-depth subtree intersecting all the clusters).
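To make the first case concrete, here is a minimal Python sketch of the level-by-level greedy step (my own illustration, not the authors' code). It assumes the level sizes x1, ..., xk are simply given as input, whereas the actual algorithm computes them by probabilistic analysis; W is a symmetric cost matrix and node 0 is the root.

    import numpy as np

    def khop_greedy_tree(W, levels):
        """levels = [x1, ..., xk], with x1 + ... + xk == n - 1."""
        n = W.shape[0]
        in_tree = {0}
        outside = set(range(1, n))
        edges = []
        for x_i in levels:
            # cheapest connection of each outside node to the tree built so far
            candidates = sorted(
                min((W[u, v], u) for u in in_tree) + (v,) for v in outside)
            # greedily take the x_i cheapest nodes as the next level
            for cost, u, v in candidates[:x_i]:
                edges.append((u, v, cost))
                in_tree.add(v)
                outside.discard(v)
        return edges

A node chosen at level i attaches to a node of depth less than i, so the resulting tree has at most k hops, as required.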
Although the independent weight assumption is quite restrictive, the algorithm itself is natural. In fact, the unexpected bonus is that when k > log log n, not only is the output tree near-optimal, but its cost is almost the same as the cost of the unconstrained MST: in other words, one gets a tree of depth O(log log n) at no extra cost. So one is left to wonder: how good is this algorithm in practice?
Tuesday, November 8, 2011
Models for discrete metric spaces
It is good to have random models to test algorithms.
For graph algorithms, the G(n,p) model is like an old friend. For network algorithms, the analog is the complete graph with i.i.d. exponentially distributed edge weights. For algorithms on discrete metric spaces, what's a good model for a random discrete metric space?
Is there a model such that the marginal distribution of the distance between any pair of vertices has an exponential distribution? If so, then what would be a maximum entropy distribution that would have those marginals?
Monday, November 7, 2011
Hierarchically well-separated trees, bottom-up
Recently I taught the FRT algorithm to approximate any metric by a hierarchically well-separated tree metric (HST) with edge lengths that are powers of 2. The construction is top-down. Initially all vertices are together in one big cluster, and later the clusters are progressively subdivided using smaller and smaller balls. A question came up: is it possible to construct the tree metric bottom-up?
Here is an easy way to simulate FRT going roughly bottom-up, described in the papers "Network Warehouses: Efficient Information Distribution to Mobile Users" (Motskin, Downes, Kusy, Gnawali, Guibas) and "Distributed Resource Management and Matching in Sensor Networks" (Gao, Guibas, Milosavljevic, Zhou).
Pick a random priority ordering over the vertices, and a random scalar r in [1,2]. For each vertex v in parallel and for each distance d which is a power of 2 times r, find the highest priority vertex u such that B(u,d) contains v. As the power of 2 increases, this defines a sequence of balls containing v, of larger and larger radius, and with centers that have higher and higher priority.
To compute the distance between v and w in the tree metric, find the maximum d such that, in their respective sequences, v and w are in the same ball of radius d but in different balls of radius d/2. That determines the place in the tree where v and w have their lowest common ancestor x, corresponding to the ball of radius d that contains both v and w. Then the x-to-v and the x-to-w paths in the tree both have edges of lengths d/2, d/4, d/8, ..., so the distance from v to w in the tree metric is 2d.
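In code, the ball sequences and the tree-distance rule look roughly like this (a toy sketch of my own; it assumes the metric is given as a function dist on a finite vertex list V, and that enough radius levels are used for every pair to eventually share a ball):

    import random

    def ball_sequences(V, dist, levels):
        order = list(V)
        random.shuffle(order)
        priority = {v: i for i, v in enumerate(order)}  # 0 = highest priority
        r = random.uniform(1.0, 2.0)
        radii = [r * 2 ** i for i in range(levels)]
        # for each v and each radius d, record the highest-priority center u
        # whose ball of radius d contains v
        seq = {v: [min((u for u in V if dist(u, v) <= d), key=priority.get)
                   for d in radii]
               for v in V}
        return seq, radii

    def tree_distance(v, w, seq, radii):
        # first (smallest) radius d at which v and w share a center; since the
        # sequences only merge as d grows, this is the radius of their lowest
        # common ancestor, and the two paths to it each sum to about d
        for i, d in enumerate(radii):
            if seq[v][i] == seq[w][i]:
                return 2 * d
        return float("inf")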
Here is a question: is it possible to construct a hierarchical clustering that has the properties of HSTs, but that is truly constructed in a bottom-up manner? More precisely, is it possible to commit to a partition of the points into clusters of radius < d (that is, commit to the bottom levels of the HST) without looking at the distances that are greater than, say, 16d?
Here is a possible answer: put v and w in the same cluster of radius d iff their sequences of balls have the same balls for radius d, 2d, 4d, 8d and 16d. From that point on, in later, coarser clustering, v and w are committed to staying together. What happens if the FRT algorithm wanted to separate them? Perhaps there is a high priority vertex x such that B(x,32d) contains v but not w. Well - tough. If v is the highest priority vertex in the cluster, then w yields to v and puts B(x,32d) in its sequence, even though technically it wasn't supposed to do it.
How does that affect the HST properties? It still creates a hierarchically well-separated tree, and the distance in the tree metric is still not much less than the distance in the input metric. But is it still upper bounded by O(log n) times the distance in the input metric, in expectation?
Sunday, November 6, 2011
Students sleeping in class
As the semester goes on, good resolutions fade. Among the students who still make the effort to come to class, I see some who desperately try to stay awake, but their eyes close in spite of themselves. (That happens to me too, especially if I listen to a talk right after lunch.) But what I teach is sometimes demanding. It requires an effort of the intellect.
When students do sports, at practice they have to put in the effort, day in and day out, even if sometimes it feels tiresome. But in class, even if they diligently put in the required effort to come to class, their brain might sometimes be reluctant to engage. They could just be too tired.
To awaken the brain from its slumber, the instructor can try to entertain. What if we had drills, like in sports? What if I asked questions and the students had to push some button to answer? Would that wake them up?
When I was in elementary school the teacher started the morning with operations. He or she would say: "3 plus 1" or "6 times 9", we would each write the answer on our small individual chalkboard, and raise the board high above our heads, for the teacher to see all the answers simultaneously. I always thought the purpose of that exercise was to memorize elementary arithmetic operations, but maybe there was something more to it: it woke up our brain, like drills waking up our body in sports. After a few minutes of such exercises, we, perhaps, felt more alert, more ready to listen to and learn new material.
Friday, November 4, 2011
On applying for jobs
Here is how I applied for jobs the first couple of times: I put together my vita, rushed to finish papers as soon as possible, tried to come up with the best research proposal I could think of, and sent copies of the whole application to universities and colleges, hoping some of them would show interest. That was unsuccessful.
Then Lynne Butler looked at my application materials and pointed out:
- that I ought to get the name of the university right. Sometimes I addressed my application to, say, "Williams University" instead of "Williams College". I answered that it was a detail, but she said that such details show indifference on my part;
- that my cover letter, identical for all places where I applied, would be a turnoff and that in many universities they would not even turn the page and look beyond it.
That year, in the early 1990s, the job market in TCS was very tight. Lynne told me that in Math it was even worse, with 900 applications received for one temporary 1-year instructor position!
My discussion with her at that time shifted my perspective entirely.
Here is how I apply for jobs now: I look at the web page of the place I am considering. I try to picture myself being there. What would I like about it? Could I imagine being there in the long term? How could my being there help develop my research or my teaching? How could that place make a difference for me, and how could I make a difference for that place? What would be its most attractive features for me? After I've spent some time making up a story of my hypothetical future there, after I have, perhaps, visited, or at least explored connections, once the picture is a bit more clear, I can easily write an interesting cover letter, inserting a couple of personal specific sentences.
Thursday, November 3, 2011
Walking in a city: the shortest expected path
Say that you are in a US city with streets arranged in a grid, and your destination is i blocks to the North and j blocks to the East of your current location: which path will you pick to go there? There are (i+j choose i) different routes that all have the same length, but you may or may not have to wait at intersections before crossing. The best path is, among those paths of minimum total length, the one that minimizes expected waiting time.
Whenever you reach an intersection, you are able to cross without waiting in at least one of the two directions, North or East, as long as i and j are both positive. But if i is zero, there is only one shortest path and you are stuck just going East for j blocks, forced to wait at intersections whenever the light is red for pedestrians. So, the best strategy is to avoid reaching the point where i or j is zero.
This means that, if you have just finished crossing at one intersection, finding yourself at the North-East corner, and now have a choice between walking East for one block or North for one block, you should always choose the direction that decreases max(i,j).
Say that you are at the North-West corner of an intersection, and that the light is red for pedestrians to go East. Should you wait for the light to change so that you can cross to the East side, or walk North for one block, hoping to be luckier at the next intersection? It depends on whether you got there by walking East for one block, or by walking North for one block and crossing the intersection from South to North: in the first case your waiting time is completely random and there is no point in waiting at this spot rather than walking North for one block and hoping to get lucky at the next intersection. In the second case, the answer is not completely clear. Since it has already been a while since the light last changed, on average you will wait less at the current intersection than at the next intersection. However, you should also take into account how close min(i,j) is to zero, or perhaps how small |i-j| is: maybe you are in no hurry to go East, because i is large compared to j and you will have many other opportunities to cross going East.
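As a sanity check, the simple part of this policy is easy to simulate. Here is a toy Python sketch under a made-up model where, while i and j are both positive, you can always cross in one of the two directions without waiting, and once you are stuck on a single street each red light (probability p, a parameter invented for the sketch) costs one wait:

    import random

    def walk(i, j, p=0.5):
        """Number of waits incurred following the 'decrease max(i, j)' rule."""
        waits = 0
        while i > 0 and j > 0:
            # one of the two crossings is open: walk in the larger direction
            if i >= j:
                i -= 1   # go North
            else:
                j -= 1   # go East
        # boxed in: only one direction remains; wait whenever the light is red
        for _ in range(i + j):
            if random.random() < p:
                waits += 1
        return waits

Averaging walk(i, j) over many runs, and comparing with a rule that picks directions at random, would quantify how much the max(i,j) heuristic actually saves.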
I realized on my last trip to the Boston financial district that I use that algorithm without even thinking about it, relying on some gut feeling for the last case, and automatically applying the above rules for the other cases. I wonder: does everybody do that? Or at least, every computer scientist? Do people do it consciously or deliberately? Is this actually the best path, or is there a better way?
Wednesday, November 2, 2011
Lynne Butler on academic life
Lynne Butler is a Math professor at Haverford College and a long-time friend. Here is what she has to say about the academic life.
Claire: How would your career have been different if you had been a man instead of a woman?
Lynne: My field would have been algebraic topology. As a graduate student at MIT, that's what I wanted to do. My advisor was skeptical of the role of women in Mathematics, and thought advising them was a wasted investment because they would not be truly committed to their career. I agreed with him that marriage for a female mathematician was not a good idea, and promised to prove my commitment to Mathematics, and he finally agreed to give it a try. However after some time the situation deteriorated. Once at a meeting I saw a stack of identical papers on his desk, asked if they were copies of his latest work and if I could borrow one to read it; as a result, he complained to the chair that I was intruding on his private space. Things got gradually worse, until one day he told me that my lack of command of a paper he had given me to read confirmed his bad opinion of female mathematicians. The department chair then advised me to find an advisor in a different field, and that's how I ended up getting a PhD in combinatorics with Richard Stanley and - sweet revenge - got a job in Princeton (whereas algebraic topologists had and still have the greatest trouble finding jobs in top places), which later led to my going to Haverford, in a job that I love.
Claire: You were associate provost for a year. Is there anything interesting about that job?
Lynne: Yes, although there were tasks that I hated and did very poorly, there were also some things that I really liked and was able to do really well. In particular, negotiating the hiring of visiting professors: from the time when a department recommends recruiting a new faculty member, to the time when they start, there are many questions that need to be resolved to decide whether the candidate is a good match with Haverford and how to make it work. For example, for faculty with interests that span several departments, arrangements have to be made, and I loved doing that.
Claire: Do you do a lot of mentoring students?
Lynne: Yes, it's a big part of my job. Not just while students are working on their major, but when they start, when they choose their major, when they graduate, and after they graduate. I have continued advising former students through their time in graduate school, when they apply for jobs, and even advised them about the tenure process.
Claire: I remember applying for jobs from France. I was clueless, you took a look at my failed applications and explained to me how I was going about it the wrong way, I followed your advice, and it worked.
Tuesday, November 1, 2011
Fukushima 8 months later
News from NHK, Japan's government broadcaster:
“Xenon has been detected in the Containment Vessel of Reactor 2 at Fukushima I Nuclear Power Plant. Xenon is created when there is nuclear fission.” [...]
“The analysis done on November 1 found xenon-133 and xenon-135, which are created when uranium-235 undergoes nuclear fission.”
“Xenon-133's half-life is 5 days. TEPCO says it could not rule out the possibility of nuclear chain reaction happening again, so the company poured water with boric acid into the reactor to suppress the nuclear chain reaction for one hour starting 3AM [on November 2].” [...]
“[TEPCO says that] if the nuclear chain reaction is happening, it is small-scale.”
Reminder: news, rumors, scientific analysis and fear-mongering on Fukushima can be found at http://enenews.com/ and at http://ex-skf.blogspot.com/ while raw measurements are at http://atmc.jp/plant/rad/?n=1 (and the other graphs reachable from there by clicking on the links)
In the present case, the first US media outlet to report this, amazingly, was Fox Business News. http://www.foxbusiness.com/industries/2011/11/01/japan-government-possibility-xenon-133-135-release-at-tepco-fukushima-daiichi/
Path to Professorship at MIT: with or without alcohol?
Recently I sat on a panel at the "Path to Professorship" event at MIT, for graduate students and post-docs. After the afternoon panels, it was announced that there would be a reception with wine. However, during the reception, a photographer who came to take a picture of the group with whom I was talking asked the woman holding a glass of wine to put it down while he took the picture: apparently, it's ok to drink wine, but not to post pictures of people drinking wine!
Monday, October 31, 2011
Types of research collaborators
There is the flea, which jumps from question to question in a seemingly haphazard manner, looking for food. Or the butterfly: we admire its speed, its brilliance, and its ability to connect seemingly disconnected ideas. It's like exploring the world of possibilities by breadth-first search.
There is the mole, slowly and stubbornly digging forward, going in one direction and never giving up. We admire its persistence and its courage to push through even horrible calculations or case analyses. It's like exploring the world of possibilities by depth-first search.
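To make the analogy concrete, here is a minimal sketch in OCaml (my own illustration, not part of the original post): the butterfly and the mole run the very same exploration, and the only difference is whether newly discovered ideas go to the back of the frontier (a FIFO queue, breadth-first) or on top of it (a LIFO stack, depth-first).

module IntSet = Set.Make (Int)

(* explore ~breadth_first ~neighbors start returns the order in which
   ideas are visited. The butterfly (breadth_first = true) treats the
   frontier as a FIFO queue; the mole (breadth_first = false) treats
   it as a LIFO stack. Everything else is identical. *)
let explore ~breadth_first ~neighbors start =
  let rec go frontier visited order =
    match frontier with
    | [] -> List.rev order
    | x :: rest ->
        if IntSet.mem x visited then go rest visited order
        else
          let fresh = neighbors x in
          let frontier' =
            if breadth_first then rest @ fresh (* enqueue at the back *)
            else fresh @ rest (* push on top *)
          in
          go frontier' (IntSet.add x visited) (x :: order)
  in
  go [ start ] IntSet.empty []

(* A small tree of ideas rooted at 0: *)
let neighbors = function 0 -> [ 1; 2 ] | 1 -> [ 3; 4 ] | 2 -> [ 5; 6 ] | _ -> []

On this example the butterfly visits 0 1 2 3 4 5 6, skimming every branch before going deep, while the mole visits 0 1 3 4 2 5 6, exhausting a branch before moving on.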
Then there is the rest of us. We do some kind of mixture. When we collaborate with a mole, we get frustrated at times by its doggedness, when it is clear to us that the way forward is blocked and that those heavy-handed attempts will most likely be fruitless. When we collaborate with a butterfly, we get frustrated at times because it keeps giving up and switching directions before we've had time to see whether the previous idea had any promise.
The best collaborators are those who are in a narrow range around our own style of research: when we switch questions, they are willing to give up because they're getting tired of getting nowhere. When we keep going, they follow happily because they think there might still be something to that idea, and, although they personally would give up, are willing to give it one more try for our sake.
Sunday, October 30, 2011
Microphone education
You have been raised to put your hand over your mouth when you cough and to bend your head slightly, so that possible viruses are projected downwards instead of towards your neighbors. It also muffles the noise of coughing by redirecting it down instead of straight ahead.
Do not apply this etiquette if you have a microphone attached to your shirt collar with the volume amplification set on "high"!
Saturday, October 29, 2011
The importance of being well connected
In politics, it is well known that connections are critical. When a well-connected person needs something, they simply pick up their phone, and give a quick call to a friend in the right place.
In research, being well connected is also an important asset. I have a short list of "oracles" who can help me with thorny questions, each in their respective area of specialty.
For example, the other day I was working with a colleague on a question related to expanders and combinatorics on graphs. After we thought about it for a bit, and searched the internet without success, I dropped an email to Noga Alon. Less than an hour later, I got an answer! Informative, precise, and with a reference to a directly relevant paper. For efficient research, how precious it is to have the right connections at the right time!
Friday, October 28, 2011
How to get an academic job?
How should students go about getting a job? The question came up in a recent comment. It's a tough one, but it brought back a memory.
Once, a relative of mine asked her uncle: "You are so successful. How did you do it? How should I go about it? What job should I have, and what career should I pursue?" I waited with interest to hear what he might have to say. Here is what the successful young man answered: "No matter what you do, you will not be successful if you are not really into it. You've got to enjoy your work. So, why don't you figure out the things that interest you the most and make a list of the four or five most exciting jobs for you? Then, within that short list, you can look at other criteria such as opportunities, salary and career paths. But you've got to first give priority to your personal interests."
I loved that answer.
So, by analogy, I would say: I believe that there is no way for people to become leading researchers unless they love what they do. A student thinking about going into academia should first think of the research areas they are most interested in and make a short list of the fields that are most exciting to them personally. Then they can take the job market into consideration, trying to estimate which subjects, five years down the line, will be most important and will have the most openings.
That's my little piece of wisdom for the day! Not particularly surprising, not particularly unorthodox, but it might be good advice.
Thursday, October 27, 2011
Teaching a second programming language
If a Chinese person who has been learning French for a year now wants to learn Italian, how do you go about teaching them Italian? You don't start from scratch and painfully relate Italian to Chinese: instead, you leverage their knowledge of French and teach Italian as, roughly, a variant of French, highlighting the delta between those two relatively close languages.
Similarly, after 6 weeks of teaching Racket I have now switched to OCaml. Last year I taught OCaml from scratch, only going much more quickly than when I had started Racket 6 weeks earlier. As I was doing it, I realized that it was largely redundant and boring. So this year I simply show OCaml code and the corresponding Racket code side by side, so that the students see that many of the differences are just syntactic details that could be translated automatically. That gives me more time to focus on the deeper differences and to try to figure out the reasons for them.
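As an illustration, here is the kind of side-by-side pair I have in mind (a made-up example, not taken from my actual lecture notes): the Racket version appears in a comment above its OCaml translation, and the recursion is structurally identical; only the surface syntax changes.

(* Racket:
   (define (sum lst)
     (cond [(empty? lst) 0]
           [else (+ (first lst) (sum (rest lst)))]))
*)

(* OCaml: the same structural recursion, written with pattern matching. *)
let rec sum lst =
  match lst with
  | [] -> 0
  | x :: rest -> x + sum rest

Here sum [1; 2; 3] evaluates to 6, just as (sum (list 1 2 3)) does in Racket; once students see that the translation is mechanical, class time can go to the genuine differences, such as static typing and pattern matching.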
In hindsight, it is obvious.
I wonder if there are any web resources to serve that kind of request: "I want to learn computer language X; I already know computer language Y".