Hierarchical Data in MySQL

kidcapt picture kidcapt · Jul 6, 2009 · Viewed 12k times · Source

I've got a sort of tree like thing going on in my MySQL database.

I have a database that has categories, and each category has a subcat. I'm keeping all the categories in one table, so the columns are like this:

*categories table*
id | name  | parent_id
1  | Toys  | 0
2  | Dolls | 1
3  | Bikes | 1

Each item in my database is assigned to one of those categories:

*items table*
item   | category_id
barbie | 2
schwinn| 3

The problem is if someone wants to see all TOYS (the parent category) what is the best way to fetch the info from the items database? The only way I know how is to do something like

SELECT * 
FROM items 
WHERE category_id = 2 
JOIN SELECT * 
     FROM items 
     WHERE category_id = 3
     etc... 

But if I had like 10 categories under Toys, then I'd have to do this join and query 10 times.

Is there a better way to handle this?

Answer

Jeff Meatball Yang picture Jeff Meatball Yang · Jul 6, 2009

You want to be given the parent ID:

So assume you are given

set @parentId = 1 /*toys*/

select 
  *
from
  Items i
inner join Categories c on c.id = i.categoryId
where
  c.parentId = @parentId

This will give you the items you want - with one major design flaw: it doesn't handle multiple levels of hierarchical categories.

Let's say you had this Categories table:

*Categories table*
id | name    | parentId
1  | Toys    | 0
2  | Dolls   | 1
3  | Bikes   | 1
4  | Models  | 2
5  | Act.Fig.| 2
6  | Mountain| 3
7  | BMX     | 3

And Items:

*items table*
item   | category_id
Barbie | 4
GIJoe  | 5
Schwinn| 6
Huffy  | 7

The only way to get all the relevant Items is do a self join:

select 
  *
from
  Items i 
inner join Categories c on c.id = i.categoryId
inner join Categories c2 on c.parentId = c2.id
where
  c2.parentId = @parentId

This pattern is not scalable - since you can have MULTIPLE levels of hierarchy.

One common way to deal with hierarchies is to build a "flattened" table: a row that links each node to ALL it's descendants.

In addition to a Categories table, you build a second table:

*CategoriesFlat table*  The Name column is here only for readability
id | name    | parentId
1  | Toys    | 1
-----------------
2  | Dolls   | 1
2  | Dolls   | 2
-----------------
4  | Models  | 1
4  | Models  | 2
4  | Models  | 4
5  | Act.Fig.| 1
5  | Act.Fig.| 2
5  | Act.Fig.| 5
-----------------
3  | Bikes   | 1
3  | Bikes   | 3
-----------------
6  | Mountain| 1
6  | Mountain| 3
6  | Mountain| 6
7  | BMX     | 1
7  | BMX     | 3
7  | BMX     | 7

So you can write:

select 
  *
from
  Items i
inner join CategoriesFlat c on c.id = i.categoryId
where
  c.parentId = @parentId

And get ALL the relevant Categories and Items.

Here's a great slideshow about SQL anti-patterns and solutions to them. (Hierarchical data in SQL is an anti-pattern, but don't be disheartened - we all run into this one)