Flatten Adjacency List Hierarchy To A List Of All Paths

Aaron Hoffman picture Aaron Hoffman · Apr 19, 2009 · Viewed 10.3k times · Source

I have a Table that stores Hierarchical information using the Adjacency List model. (uses a self referential key - example below. This Table may look familiar):

category_id name                 parent
----------- -------------------- -----------
1           ELECTRONICS          NULL
2           TELEVISIONS          1
3           TUBE                 2
4           LCD                  2
5           PLASMA               2
6           PORTABLE ELECTRONICS 1
7           MP3 PLAYERS          6
8           FLASH                7
9           CD PLAYERS           6
10          2 WAY RADIOS         6

What is the best method to "flatten" the above data into something like this?

category_id lvl1        lvl2        lvl3        lvl4
----------- ----------- ----------- ----------- -----------
1           1           NULL        NULL        NULL
2           1           2           NULL        NULL
6           1           6           NULL        NULL
3           1           2           3           NULL
4           1           2           4           NULL
5           1           2           5           NULL
7           1           6           7           NULL
9           1           6           9           NULL
10          1           6           10          NULL
8           1           6           7           8

Each row is one "Path" through the Hierarchy, except there is a row for each node (not just each leaf node). The category_id column represents the current node and the "lvl" columns are its ancestors. The value for the current node must also be in the farthest right lvl column. The value in the lvl1 column will always represent the root node, values in lvl2 will always represent direct descendants of lvl1, and so on.

If possible the method to generate this output would be in SQL, and would work for n-tier hierarchies.

Answer

bobince picture bobince · Apr 19, 2009

To do multi-level queries across a simple adjacency-list invariably involves self-left-joins. It's easy to make a right-aligned table:

SELECT category.category_id,
    ancestor4.category_id AS lvl4,
    ancestor3.category_id AS lvl3,
    ancestor2.category_id AS lvl2,
    ancestor1.category_id AS lvl1
FROM categories AS category
    LEFT JOIN categories AS ancestor1 ON ancestor1.category_id=category.category_id
    LEFT JOIN categories AS ancestor2 ON ancestor2.category_id=ancestor1.parent
    LEFT JOIN categories AS ancestor3 ON ancestor3.category_id=ancestor2.parent
    LEFT JOIN categories AS ancestor4 ON ancestor4.category_id=ancestor3.parent;

To left-align it like your example is a bit more tricky. This comes to mind:

SELECT category.category_id,
    ancestor1.category_id AS lvl1,
    ancestor2.category_id AS lvl2,
    ancestor3.category_id AS lvl3,
    ancestor4.category_id AS lvl4
FROM categories AS category
    LEFT JOIN categories AS ancestor1 ON ancestor1.parent IS NULL
    LEFT JOIN categories AS ancestor2 ON ancestor1.category_id<>category.category_id AND ancestor2.parent=ancestor1.category_id
    LEFT JOIN categories AS ancestor3 ON ancestor2.category_id<>category.category_id AND ancestor3.parent=ancestor2.category_id
    LEFT JOIN categories AS ancestor4 ON ancestor3.category_id<>category.category_id AND ancestor4.parent=ancestor3.category_id
WHERE
    ancestor1.category_id=category.category_id OR
    ancestor2.category_id=category.category_id OR
    ancestor3.category_id=category.category_id OR
    ancestor4.category_id=category.category_id;

would work for n-tier hierarchies.

Sorry, arbitrary-depth queries are not possible in the adjacency-list model. If you are doing this kind of query a lot, you should change your schema to one of the other models of storing hierarchical information: full adjacency relation (storing all ancestor-descendent relationships), materialised path, or nested sets.

If the categories don't move around a lot (which is usually the case for a store like your example), I would tend towards nested sets.