Skiena's book The Algorithm Design Manual contains the following explanation of the Floyd-Warshall algorithm:
void floyd(adjacency_matrix *g)
{
    int i, j;          /* dimension counters */
    int k;             /* intermediate vertex counter */
    int through_k;     /* distance through vertex k */

    for (k = 1; k <= g->nvertices; k++)
        for (i = 1; i <= g->nvertices; i++)
            for (j = 1; j <= g->nvertices; j++) {
                through_k = g->weight[i][k] + g->weight[k][j];
                if (through_k < g->weight[i][j])
                    g->weight[i][j] = through_k;
            }
}
The Floyd-Warshall all-pairs shortest path runs in O(n³) time, which is asymptotically no better than n calls to Dijkstra’s algorithm. However, the loops are so tight and the program so short that it runs better in practice. It is notable as one of the rare graph algorithms that work better on adjacency matrices than adjacency lists.
Can someone please elaborate on why the part in bold (that it runs better in practice, and works better on adjacency matrices) is true?
Let's break it down:
The Floyd-Warshall all-pairs shortest path runs in O(n³) time ...
This is because the algorithm consists of a triply nested for loop, each loop running over all n vertices, so the innermost body executes n × n × n = n³ times.
... which is asymptotically no better than n calls to Dijkstra’s algorithm. ...
Recall that a single call to Dijkstra's algorithm will tell us all the shortest paths from one particular node, x1, to all nodes in the graph. We can therefore make n calls to Dijkstra's algorithm, one per node in the graph: x1, x2, ..., xn, to find the shortest paths from x1 to all nodes, from x2 to all nodes, ..., from xn to all nodes. In other words, this gives us all-pairs shortest paths – just like Floyd-Warshall does!
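To make this concrete, here is a minimal C sketch (my own, not Skiena's) of the "n calls" approach. For brevity it uses the simple array-based Dijkstra, which is O(n²) per call rather than the binary-heap O((n + e) lg n) version analysed below, and it assumes the same conventions as the quoted code: a 1-indexed weight matrix with 0 on the diagonal and a large MAXINT sentinel for missing edges (the MAXV and MAXINT values here are my assumptions).

#include <stdbool.h>

#define MAXV   100          /* maximum number of vertices (assumption) */
#define MAXINT 100007       /* "infinity" sentinel for missing edges (assumption) */

typedef struct {
    int weight[MAXV + 1][MAXV + 1];   /* 1-indexed edge weights */
    int nvertices;                    /* number of vertices */
} adjacency_matrix;

/* single-source shortest paths from start, written into dist[] */
void dijkstra(adjacency_matrix *g, int start, int dist[])
{
    bool intree[MAXV + 1];            /* is this vertex finalized yet? */
    int i, u, v, best;

    for (i = 1; i <= g->nvertices; i++) {
        intree[i] = false;
        dist[i] = MAXINT;
    }
    dist[start] = 0;

    for (i = 1; i <= g->nvertices; i++) {
        /* select the closest unfinalized vertex (linear scan, O(n)) */
        u = -1;
        best = MAXINT;
        for (v = 1; v <= g->nvertices; v++)
            if (!intree[v] && dist[v] < best) {
                best = dist[v];
                u = v;
            }
        if (u == -1) break;           /* remaining vertices are unreachable */
        intree[u] = true;

        /* relax every edge leaving u */
        for (v = 1; v <= g->nvertices; v++)
            if (dist[u] + g->weight[u][v] < dist[v])
                dist[v] = dist[u] + g->weight[u][v];
    }
}

/* n calls to Dijkstra: afterwards dist[i][j] = shortest distance i -> j */
void all_pairs_dijkstra(adjacency_matrix *g, int dist[][MAXV + 1])
{
    int i;
    for (i = 1; i <= g->nvertices; i++)
        dijkstra(g, i, dist[i]);
}

Row i of dist then holds the shortest distances from vertex i to every other vertex – the same all-pairs answer Floyd-Warshall computes in place.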
Running Dijkstra n times:

time = number of nodes × O((n + e) lg n)    [assuming a binary heap is used for Dijkstra]
     = n × O((n + e) lg n)
     = O(n(n + e) lg n)
And so it is indeed the case that the O(n³) time of Floyd-Warshall is not better than the O(n(n + e) lg n) time of making n calls to Dijkstra. For a sparse graph, where e ≈ n, the n Dijkstra calls take only O(n² lg n), which actually beats O(n³); even for a dense graph, where e ≈ n², they take O(n³ lg n), matching Floyd-Warshall up to the log factor.
... However, the loops are so tight and the program so short that it runs better in practice.
The key words here are "in practice". Remember, asymptotic analysis isn't perfect. It's a mathematical abstraction / approximation for performance. When we actually come to run the code, there are many practical factors that it does not take into account. The processor has a complex low-level architecture for fetching and running instructions. It pipelines instructions, prefetches data and instructions, predicts branches, caches instructions and data, ... it's quite unpredictable! And yet, all of these low-level optimizations can have a massive impact on the overall performance of an algorithm. Algorithms that are theoretically slow can receive a boost, and algorithms that are theoretically quick might not receive the same boost. This is sometimes called the hidden constant factor of big-O notation.
It turns out that processors love tight nested for loops over multi-dimensional arrays! This is what Skiena is alluding to here. Floyd-Warshall's for loops over the adjacency matrix make the best use of temporal and spatial locality and work well with low-level processor optimizations. Dijkstra's algorithm, on the other hand, chases pointers through a heap and an adjacency list, so its memory accesses are far less regular and the processor optimizations help it much less. As a result, Dijkstra could indeed be slower in practice.
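Here is a toy C example (mine, not from the book) that isolates the locality effect. Both functions add up the same N² integers, but the first walks memory in the order C lays out a 2-D array (row by row), while the second jumps a whole row ahead on every access:

#define N 2048

static int m[N][N];                   /* 2-D array, stored row by row */

/* sequential access: each cache line is fully used before moving on */
long sum_row_major(void)
{
    long s = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i][j];
    return s;
}

/* strided access: every read lands N ints away, defeating the cache */
long sum_col_major(void)
{
    long s = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i][j];
    return s;
}

Despite doing identical arithmetic, the row-major version typically runs several times faster on ordinary hardware. Floyd-Warshall's innermost loop over j scans weight[i][...] and weight[k][...] in exactly this sequential, row-major fashion, which is why it benefits so much.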
Floyd-Warshall is a "short program" in the sense that it isn't using any sophisticated data structures and the number of instructions to be repeated is small. These things, along with the processor optimizations, contribute to Floyd-Warshall having a small hidden constant factor. That is to say, the k in O(k · n³) is small.
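If you want to see the small constant factor for yourself, here is a self-contained benchmark sketch of the quoted routine (the void return type, the 500-vertex size, and the random complete graph are my additions, not Skiena's setup):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAXV 500                      /* benchmark size (assumption) */

typedef struct {
    int weight[MAXV + 1][MAXV + 1];   /* 1-indexed edge weights */
    int nvertices;
} adjacency_matrix;

void floyd(adjacency_matrix *g)
{
    int i, j, k, through_k;
    for (k = 1; k <= g->nvertices; k++)
        for (i = 1; i <= g->nvertices; i++)
            for (j = 1; j <= g->nvertices; j++) {
                through_k = g->weight[i][k] + g->weight[k][j];
                if (through_k < g->weight[i][j])
                    g->weight[i][j] = through_k;
            }
}

static adjacency_matrix g;            /* static: ~1 MB, too large for some stacks */

int main(void)
{
    int i, j;

    g.nvertices = MAXV;
    srand(42);
    for (i = 1; i <= g.nvertices; i++)            /* random complete graph */
        for (j = 1; j <= g.nvertices; j++)
            g.weight[i][j] = (i == j) ? 0 : 1 + rand() % 100;

    clock_t t0 = clock();
    floyd(&g);
    printf("floyd on %d vertices: %.3f s\n",
           g.nvertices, (double)(clock() - t0) / CLOCKS_PER_SEC);
    return 0;
}

That is 500³ ≈ 1.25 × 10⁸ relaxation steps, and on a typical modern machine it completes in a fraction of a second – the kind of throughput those tight loops make possible.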