Optimized SQL for tree structures

Seb Nilsson picture Seb Nilsson · Nov 25, 2008 · Viewed 30.8k times · Source

How would you get tree-structured data from a database with the best performance? For example, say you have a folder-hierarchy in a database. Where the folder-database-row has ID, Name and ParentID columns.

Would you use a special algorithm to get all the data at once, minimizing the amount of database-calls and process it in code?

Or would you use do many calls to the database and sort of get the structure done from the database directly?

Maybe there are different answers based on x amount of database-rows, hierarchy-depth or whatever?

Edit: I use Microsoft SQL Server, but answers out of other perspectives are interesting too.

Answer

Ned Batchelder picture Ned Batchelder · Nov 25, 2008

It really depends on how you are going to access the tree.

One clever technique is to give every node a string id, where the parent's id is a predictable substring of the child. For example, the parent could be '01', and the children would be '0100', '0101', '0102', etc. This way you can select an entire subtree from the database at once with:

SELECT * FROM treedata WHERE id LIKE '0101%';

Because the criterion is an initial substring, an index on the ID column would speed the query.