Topological sort with support for cyclic dependencies

Peter · Jan 17, 2014 · Viewed 8.9k times

Consider the following dependencies (where A --> B means B depends on A, so effectively A is the 'parent'):

A --> B
A --> C
C --> D
C --> E

More graphically:

    A
    |
----------
|        |
B        C
         |
    -----------
    |         |
    D         E

A topological sort algorithm would return something like:

ABCDE

I have found code for this (exhibit A and exhibit B), but neither supports cyclic dependencies. I am in a situation where this could happen:

A --> B
B --> C
B --> D
C --> B
C --> E

Graphically:

A
|
B <--> C
|      |
D      E

This could return ABCDE or ACBDE. Because B and C are on the same 'level', the order between them is not important (likewise for D and E).

How could I accomplish such a thing? I realize this isn't exactly a topological sort, but I'm no expert mathematician, so I don't really know where to start looking, let alone how to implement this.

Personally, I'm working in C#, but if you know how to do it in any other language, I'd be happy to study your code and translate it to C#.

update

I can also have the following situation:

A <--------
|         |
--> B --> C
    |     |
    D     E

Importantly, this doesn't have to be a tree. I can have any arbitrary graph. In fact, not all nodes have to be connected to one another.

Answer

Eric Lippert · Jan 17, 2014

First off, it is conceptually easier if you have a graph such that you can ask "what do you depend on"? I'm going to assume that we have a graph where a directed edge from A to B means "A depends on B", which is the opposite of your statement.
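If your data is stored as "parent --> child" pairs, as in the question, flipping it into the "what do you depend on" form is a small step. Here is a minimal Python sketch of that conversion; the helper name `dependencies_from_edges` and the `edges` list are made up for illustration and are not part of the question's code.

```python
from collections import defaultdict


def dependencies_from_edges(edges):
    """Turn (parent, child) pairs into a map: node -> nodes it depends on."""
    depends_on = defaultdict(list)
    for parent, child in edges:
        depends_on[child].append(parent)  # the child depends on its parent
        depends_on[parent]  # touch the key so nodes without dependencies still appear
    return dict(depends_on)


# The question's first example: A --> B, A --> C, C --> D, C --> E
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E")]
graph = dependencies_from_edges(edges)
# graph now maps B -> [A], C -> [A], D -> [C], E -> [C], and A -> []
```

With the graph in this shape, "the neighbours of a node" simply means "the things that must come before it".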

I am somewhat confused by your question since a topo sort that ignores cycles is virtually the same as a regular topo sort. I'll develop the algorithm so that you can handle cycles as you see fit; perhaps that will help.

The idea of the sort is:

  • A graph is a collection of nodes such that each node has a collection of neighbours. As I said, if a node A has a neighbour B then A depends on B, so B must happen before A.

  • The sort takes a graph and produces a sorted list of nodes.

  • During the operation of the sort a dictionary is maintained which maps every node onto one of three values: alive, dead and undead. An alive node has yet to be processed. A dead node is already processed. An undead node is being processed; it's no longer alive but not yet dead.

  • If you encounter a dead node you can skip it; it's already in the output list.

  • If you encounter an alive node then you process it recursively.

  • If you encounter an undead node then it is part of a cycle. Do what you like. (Produce an error if cycles are illegal, treat it as dead if cycles are legal, etc.)


function topoSort(graph)
  state = {} // Dictionary from node to alive, undead or dead.
  list = []
  for each node in graph
    state[node] = alive
  for each node in graph
    visit(graph, node, list, state)
  return list

function visit(graph, node, list, state)
  if state[node] == dead
    return // We've done this one already.
  if state[node] == undead
    return // We have a cycle; if you have special cycle handling code do it here.
  // It's alive. Mark it as undead.
  state[node] = undead
  for each neighbour in getNeighbours(graph, node)
    visit(graph, neighbour, list, state)
  // All of its dependencies are in the list now, so the node itself can follow.
  state[node] = dead
  append(list, node)
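
For something you can actually run, here is one possible Python translation of the pseudocode above; it should carry over to C# without much trouble. It assumes the graph is a dictionary mapping each node to the nodes it depends on. The names `State`, `CyclicDependencyError` and the `allow_cycles` flag are my own choices for this sketch, and the flag only illustrates two of the "do what you like" options for undead nodes.

```python
from enum import Enum


class State(Enum):
    ALIVE = 1   # not processed yet
    UNDEAD = 2  # being processed right now (on the current recursion path)
    DEAD = 3    # fully processed and already in the output


class CyclicDependencyError(Exception):
    """Raised when a cycle is found and cycles are not allowed."""


def topo_sort(graph, allow_cycles=True):
    """graph maps each node to the list of nodes it depends on."""
    state = {node: State.ALIVE for node in graph}
    # Nodes that only ever appear as a dependency are part of the graph too.
    for dependencies in graph.values():
        for dependency in dependencies:
            state.setdefault(dependency, State.ALIVE)
    order = []
    for node in list(state):
        visit(graph, node, order, state, allow_cycles)
    return order


def visit(graph, node, order, state, allow_cycles):
    if state[node] is State.DEAD:
        return  # We've done this one already.
    if state[node] is State.UNDEAD:
        # We walked back into a node we are still processing: a cycle.
        if not allow_cycles:
            raise CyclicDependencyError(f"cycle through {node!r}")
        return  # Treat it as dead and carry on.
    state[node] = State.UNDEAD
    for neighbour in graph.get(node, ()):
        visit(graph, neighbour, order, state, allow_cycles)
    state[node] = State.DEAD
    order.append(node)


# The question's cyclic example, rewritten as "node -> what it depends on":
# A --> B, B --> C, B --> D, C --> B, C --> E
graph = {"A": [], "B": ["A", "C"], "C": ["B"], "D": ["B"], "E": ["C"]}
print(topo_sort(graph))  # ['A', 'C', 'B', 'D', 'E']
```

That is the ACBDE ordering from the question. Which of B and C comes first depends only on which one the traversal happens to enter first. Note that the recursion can hit Python's recursion limit on very deep dependency chains; an explicit stack would avoid that.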

Make sense?