But there are people who are never happy with what they have. Gaussian elimination is too slow, they whine. Polynomial time is not good enough for them. They are afraid that it will fail on massive data, and want something that is fast "for real". Near-linear time is the goal these days.
They currently focus on a special case: "SDD solvers", meaning that the matrix A is symmetric and diagonally dominant, meaning that the value on the diagonal is larger than the sum of absolute values of all off-diagonal values on its row. In fact, let us simplify things further and assume that A is the "Laplacian matrix" of a graph - the matrix where rows and columns are indexed by V, where the diagonal entry is the vertex degree, and where the off-diagonal entry is -1 if and only if there is an edge between the two vertices.
So, for that special case, in their anxious search for faster algorithms, people have started playing a game that is the opposite of what we traditionally do in approximation algorithms: instead of giving highest priority to the accuracy of the output, and treating runtime as a second thought, they sacrifice some accuracy in the hope if getting a faster algorithm. Indeed, if we don't insist on finding the exact solution x* of Ax=b, then we can try some iterative method that constructs a sequence x1,x2,x3,., where each xi is an improvement over the previous one, and pray that it gets quickly to a value xT close to x*.
But scientists don't trust the power of prayer. Instead, we analyze the runtime, which is basically the cost of each iteration times the number of iterations. Unfortunately, if the matrix A is dense (many non-zero entries), each iteration is costly; and if it's not well-rounded, one needs a large number of iterations to get close to x*.
To speedup the algorithm, instead of solving the system Ax=b, let us solve the equivalent system
for a well-chosen matrix B. Namely, B should be sparse, so that each iteration is cheap, and B should be similar to A, so that few iterations suffice to converge. (What's the measure of similarity? It's k such that for any vector x, x^TBx is between x^TAx and k times that quantity.)
Fortunately, there is an algorithm to find a matrix B that is very sparse- only a linear number of non-zero entries - and very well conditioned - k is constant. Unfortunately, that algorithm is slow, which defeats the whole purpose. So, how can one go about quickly finding a good matrix B?
Here's an algorithm for that subproblem.
1. First, find a low-stretch spanning tree T of G, the graph corresponding to A: it's a spanning tree such that the sum of all interpoint distances in T is not much more than the sum of all interpoint distances in G. That's been studied before in a sequence of papers, so it's just a matter of recycling old work.
2. Second, for every edge uv of G, define P(uv) as the distance from u to v in T. Let t be the sum of P(e) over all edges e of G.
3. Third, build a multigraph H by repeating t log(n) times:
pick an edge e with probability P(e)/t, and put t/P(e) copies of e in H.
4. The desired matrix B is (1/t log(n)) times the Laplacian matrix of H.
And that is what I learned because of a seminar given at Brown by Richard Peng, who presented some more advanced stuff in the same area (some work he's done with Ioannis Koutis and Gary Miller).
Why should we care about this line of research? Getting an improved polynomial-time algorithm for a special case of linear system solvers, is that really such a big deal? Why is there so much interest for those papers in the Theory community? Pick your answer:
( a ) . we should care, because Dan Spielman (or insert the name of your favorite researcher in that area) cares, and he has good taste. [Trust of authorities]
( b ) . we should care, because the algorithmic perspective on those problems is pretty new. It's the development of a new technique. [Algorithmic design]
( c ) . we should care, because this has been implemented and tested and it is claimed to be competitive with existing solvers, so it leads, maybe, to actual real life programs that run faster. This could be a big success story of Theory! [Applications to Applications]
( d ) . we should care, because it has implications for Theory: it leads to faster algorithms for graph problems such as generating a random spanning tree or maximum flow. [Applications to Theory]
( e ) . we should care, because it's sunny out, I am in a good mood, and today I am inclined to see things as exciting in general. [Random]
( a' ) . we should ignore this, because it has too much heavy math, it's not fun, and it's not my taste. [Trust of own taste]
( b' ) . we should ignore this, because the core of it is just low-stretch algorithms, sampling, etc: it's just putting together basic algorithmic tools that have nothing new about them, and the claim of novelty are pure marketing [Disgruntled]
( c' ) . we should ignore this, because the claims of applications are merely what people always say in their introductions, and those claims never mean anything. [Skepticism]
( d' ) . we should ignore this, because it's not going to help understand the P versus NP problem, even in the long term. [Lack of applications to Theory]
( e' ) . we should ignore this, because I had too much to eat and now I have a stomachache. [Random]