Implementing Depth First Search into C# using List and Stack

Dumpen picture Dumpen · Apr 27, 2011 · Viewed 45.3k times · Source

I want to create a depth first search which I have been somewhat successful in.

Here is my code so far (Except my constructor, note the Vertex and Edge classes only contain properties, nothing important to post here):

private Stack<Vertex> workerStack = new Stack<Vertex>();
private List<Vertex> vertices = new List<Vertex>();
private List<Edge> edges = new List<Edge>();

private int numberOfVertices;
private int numberOfClosedVertices;
private int visitNumber = 1;

private void StartSearch()
{
    // Make sure to visit all vertices
    while (numberOfClosedVertices < numberOfVertices && workerStack.Count > 0)
    {
        // Get top element in stack and mark it as visited
        Vertex workingVertex = workerStack.Pop();
        workingVertex.State = State.Visited;

        workingVertex.VisitNumber = visitNumber;
        visitNumber++;

        numberOfClosedVertices++;

        // Get all edges connected to the working vertex
        foreach (Vertex vertex in GetConnectedVertices(workingVertex))
        {
            vertex.Parent = workingVertex;
            workerStack.Push(vertex);
        }
    }
}

private List<Vertex> GetConnectedVertices(Vertex vertex)
{
    List<Vertex> vertices = new List<Vertex>();

    // Get all vertices connected to vertex and is unvisited, then add them to the vertices list
    edges.FindAll(edge => edge.VertexSource == vertex && edge.VertexTarget.State == State.Unvisited).ForEach(edge => vertices.Add(edge.VertexTarget));

    return vertices;
}

Its working in the way that all vertices get visited, but not in the right order.

Here is a comparison of how mine gets visited compared to wikipedia: Comparison

Its seems mine is turned around and is beginning from right to left.

Do you know what causes it? (Also any advice on my implementation would be greatly appreciated)

EDIT: I got my answer, but still wanted to show the end result for the GetConnectedVertices method:

private List<Vertex> GetConnectedVertices(Vertex vertex)
{
    List<Vertex> connectingVertices = new List<Vertex>();

    (from edge in edges
     where edge.VertexSource == vertex && edge.VertexTarget.State == State.Unvisited
     select edge).
     Reverse().
     ToList().
     ForEach(edge => connectingVertices.Add(edge.VertexTarget));

    return connectingVertices;
}

Answer

Eric Lippert picture Eric Lippert · Apr 27, 2011

It seems mine is turned around and is beginning from right to left. Do you know what causes it?

As others have noted, you are pushing the nodes-to-visit-next on the stack in order from left to right. That means they get popped off right-to-left, since a stack reverses the order. Stacks are last-in-first-out.

You can fix the problem by making GetConnectedVertices build a stack, not a list. That way the connected vertices are reversed twice, once when they go on the returned stack and once when they go on the real stack.

Also any advice on my implementation would be greatly appreciated

The implementation works, I suppose, but it has a great many fundamental problems. If I were presented that code for review, here's what I'd say:

First off, suppose you wanted to do two depth-first searches of this data structure at the same time. Either because you were doing it on multiple threads, or because you have a nested loop in which the inner loop does a DFS for a different element than the outer loop. What happens? They interfere with each other because both try to mutate the "State" and "VisitNumber" fields. It is a really bad idea to have what should be a "clean" operation like searching actually make your data structure "dirty".

Doing so also makes it impossible for you to use persistent immutable data to represent redundant portions of your graph.

Also, I notice that you omit the code that cleans up. When is "State" ever set back to its original value? What if you did a second DFS? It would immediately fail since the root is already visited.

A better choice for all these reasons is to keep the "visited" state in its own object, not in each vertex.

Next, why are all the state objects private variables of a class? This is a simple algorithm; there's no need to build an entire class for it. A depth first search algorithm should take the graph to search as a formal parameter, not as object state, and it should maintain its own local state as necessary in local variables, not fields.

Next, the abstraction of the graph is... well, its not an abstraction. It's two lists, one of vertices and one of edges. How do we know that those two lists are even consistent? Suppose there are vertices that are not in the vertex list but are on the edge list. How do you prevent that? What you want is a graph abstraction. Let the graph abstraction implementation worry about how to represent edges and find neighbours.

Next, your use of ForEach is both legal and common, but it makes my head hurt. It is hard to read your code and reason about it with all the lambdas. We have a perfectly good "foreach" statement. Use it.

Next, you are mutating a "parent" property but it is not at all clear what this property is for or why it is being mutated during a traversal. Vertices in an arbitrary graph do not have "parents" unless the graph is a tree, and if the graph is a tree then there is no need to keep track of the "visited" state; there are no loops in a tree. What is going on here? This code is just bizarre, and it is not necessary to perform a DFS.

Next, your helper method named GetConnectedVertices is a lie. It does not get connected vertices, it gets connected not-already-visited vertices. Methods whose names lie are very confusing.

Finally, this claims to be a depth first search but it doesn't search for anything! Where is the thing being searched for? Where is the result returned? This isn't a search at all, it's a traversal.

Start over. What do you want? A depth-first traversal of a graph given a starting vertex. Then implement that. Start by defining what you are traversing. A graph. What service do you need from a graph? A way of getting the set of neighbouring vertices:

interface IGraph
{
    IEnumerable<Vertex> GetNeighbours(Vertex v);
}

What is your method returning? A sequence of Vertices in depth-first order. What does it take? A starting vertex. OK:

static class Extensions
{
    public static IEnumerable<Vertex> DepthFirstTraversal(
        this IGraph graph, 
        Vertex start) 
    { ... }
}

We now have a trivial implementation of depth first search; you can now use the Where clause:

IGraph myGraph = whatever;
Vertex start = whatever;
Vertex result = myGraph.DepthFirstTraversal(start)
                       .Where(v=>something)
                       .FirstOrDefault();

OK, so how are we going to implement that method so it does a traversal without wrecking the graph's state? Maintain your own external state:

public static IEnumerable<Vertex> DepthFirstTraversal(
    this IGraph graph, 
    Vertex start) 
{
    var visited = new HashSet<Vertex>();
    var stack = new Stack<Vertex>();

    stack.Push(start);

    while(stack.Count != 0)
    {
        var current = stack.Pop();

        if(!visited.Add(current))
            continue;

        yield return current;

        var neighbours = graph.GetNeighbours(current)
                              .Where(n=>!visited.Contains(n));

        // If you don't care about the left-to-right order, remove the Reverse
        foreach(var neighbour in neighbours.Reverse()) 
            stack.Push(neighbour);
    }
}

See how much cleaner and shorter that is? No mutation of state. No mucking around with edge lists. No badly-named helper functions. And the code actually does what it says it does: traverses a graph.

We also get the benefits of iterator blocks; namely, if someone is using this for a DF search, then the iteration is abandoned when the search criteria are met. We don't have to do a full traversal if we find the result early.