How does a Breadth-First Search work when looking for Shortest Path?

Jake picture Jake · Dec 5, 2011 · Viewed 152k times · Source

I've done some research, and I seem to be missing one small part of this algorithm. I understand how a Breadth-First Search works, but I don't understand how exactly it will get me to a specific path, as opposed to just telling me where each individual node can go. I guess the easiest way to explain my confusion is to provide an example:

So for instance, let's say I have a graph like this:

enter image description here

And my goal is to get from A to E (all edges are unweighted).

I start at A, because that's my origin. I queue A, followed by immediately dequeueing A and exploring it. This yields B and D, because A is connected to B and D. I thus queue both B and D.

I dequeue B and explore it, and find that it leads to A (already explored), and C, so I queue C. I then dequeue D, and find that it leads to E, my goal. I then dequeue C, and find that it also leads to E, my goal.

I know logically that the fastest path is A->D->E, but I'm not sure how exactly the breadth-first search helps - how should I be recording paths such that when I finish, I can analyze the results and see that the shortest path is A->D->E?

Also, note that I'm not actually using a tree, so there are no "parent" nodes, only children.

Answer

Sergey Kalinichenko picture Sergey Kalinichenko · Dec 5, 2011

Technically, Breadth-first search (BFS) by itself does not let you find the shortest path, simply because BFS is not looking for a shortest path: BFS describes a strategy for searching a graph, but it does not say that you must search for anything in particular.

Dijkstra's algorithm adapts BFS to let you find single-source shortest paths.

In order to retrieve the shortest path from the origin to a node, you need to maintain two items for each node in the graph: its current shortest distance, and the preceding node in the shortest path. Initially all distances are set to infinity, and all predecessors are set to empty. In your example, you set A's distance to zero, and then proceed with the BFS. On each step you check if you can improve the distance of a descendant, i.e. the distance from the origin to the predecessor plus the length of the edge that you are exploring is less than the current best distance for the node in question. If you can improve the distance, set the new shortest path, and remember the predecessor through which that path has been acquired. When the BFS queue is empty, pick a node (in your example, it's E) and traverse its predecessors back to the origin. This would give you the shortest path.

If this sounds a bit confusing, wikipedia has a nice pseudocode section on the topic.