Storing Directory Hierarchy in a Key-Value Data store

The Unknown picture The Unknown · Oct 24, 2009 · Viewed 13.5k times · Source

What is a clean/efficient method for storing the directory Hierarchy/tree in a Key-Value database (in my case MongoDB but any of them)?

For example a tree structure

- Cars 
   + Audi 
   + BMW
      - M5
   + Ford
- Color
   + Red
      - Apple
      - Cherry
   + Purple
- Funny

The method I am using now, each object links to it's parent

{ 
  dir: "red"
  parent-dir: "color"
}

This makes it very efficient/fast to insert and reorder any aspect of the tree (for example if I want to move Red and all it's children to the Cars directory).

But this method sucks when I want to all subdirectories and their children for a given directory recursively. To make it efficient to parse I can have a structure for example

{ 
  dir: "red"
  children: "audi, bmw, ford"
}

{ 
  dir: "bmw"
  children: "m5"
}

But if I want to modify the tree, a whole bunch of objects need to touched and modified.

Are there any other methods to storing a directory structure in a KV store?

Answer

Frunsi picture Frunsi · Dec 14, 2009

The method you currently use now is called adjacency list model.

Another model to store hierarchical data in a (relational) database is the nested set model. Its implementation in SQL databases is well known. Also see this article for the modified preorder tree traversal algorithm.

A very simple method: you could store a path per object - with those it should be easy to query trees in NOSQL databases:

{ path: "Color", ... }
{ path: "Color.Red", ... }
{ path: "Color.Red.Apple", ... }
{ path: "Color.Red.Cherry", ... }

When nodes will be removed or renamed some paths must be updated. But in general, this method looks promising. You just have to reserve a special character as separator. The storage space overhead should be negligible.

edit: this method is called materialized path

Finally, here is a comparison of different methods for hierarchical data in NOSQL databases.