Topological sort to find the number of paths to t

Stratford picture Stratford · Aug 6, 2013 · Viewed 10.6k times · Source

I have to develop an O(|V|+|E|) algorithm related to topological sort which, in a directed acyclic graph (DAG), determines the number of paths from each vertex of the graph to t (t is a node with out-degree 0). I have developed a modification of DFS as follow:

DFS(G,t):
    for each vertex u ∈ V do
        color(u) = WHITE
        paths_to_t(u) = 0
    for each vertex u ∈ V do
        if color(u) == WHITE then
            DFS-Visit(u,t)

DFS-Visit(u,t):
    color(u) = GREY
    for each v ∈ neighbors(u) do
        if v == t then
            paths_to_t(u) = paths_to_t(u) + 1
        else then
            if color(v) == WHITE then
                DFS-Visit(v)
            paths_to_t(u) = paths_to_t(u) + paths_to_t(v)
    color(u) = BLACK

But I am not sure if this algorithm is related to topological sort or if should I restructure my work with another point of view.

Answer

amit picture amit · Aug 6, 2013

It can be done using Dynamic Programming and topological sort as follows:

Topological sort the vertices, let the ordered vertices be v1,v2,...,vn
create new array of size t, let it be arr
init: arr[t] = 1
for i from t-1 to 1 (descending, inclusive):
    arr[i] = 0  
    for each edge (v_i,v_j) such that i < j <= t:
         arr[i] += arr[j]

When you are done, for each i in [1,t], arr[i] indicates the number of paths from vi to vt

Now, proving the above claim is easy (comparing to your algorithm, which I have no idea if its correct and how to prove it), it is done by induction:

Base: arr[t] == 1, and indeed there is a single path from t to t, the empty one.
Hypothesis: The claim is true for each k in range m < k <= t
Proof: We need to show the claim is correct for m.
Let's look at each out edge from vm: (v_m,v_i).
Thus, the number of paths to vt starting from v_m that use this edge (v_m,v_i). is exactly arr[i] (induction hypothesis). Summing all possibilities of out edges from v_m, gives us the total number of paths from v_m to v_t - and this is exactly what the algorithm do.
Thus, arr[m] = #paths from v_m to v_t

QED

Time complexity:
The first step (topological sort) takes O(V+E).
The loop iterate all edges once, and all vertices once, so it is O(V+E) as well.
This gives us total complexity of O(V+E)