Pseudocode to compare two trees

ianmayo picture ianmayo · Oct 11, 2013 · Viewed 10.7k times · Source

This is a problem I've encountered a few times, and haven't been convinced that I've used the most efficient logic.

As an example, presume I have two trees: one is a folder structure, the other is an in-memory 'model' of that folder structure. I wish to compare the two trees, and produce a list of nodes that are present in one tree and not the other - and vice versa.

Is there an accepted algorithm to handle this?

Answer

Jeremy West picture Jeremy West · Oct 11, 2013

Seems like you just want to do a pre-order traversal, essentially. Where "visiting" a node means checking for children that are in one version but not the other.

More precisely: start at the root. At each node, get a set of items in each of the two versions of the node. The symmetric difference of the two sets contains the items in one but not the other. Print/output those. The intersection contains the items that are common to both. For each item in the intersection (I assume you aren't going to look further into the items that are missing from one tree), call "visit" recursively on that node to check its contents. It's a O(n) operation, with a little recursion overhead.