I have a non-technical client who has some hierarchical product data that I'll be loading into a tree structure with Python. The tree has a variable number of levels, and a variable number of nodes and leaf nodes at each level.
The client already knows the hierarchy of products and would like to put everything into an Excel spreadsheet for me to parse.
What format can we use that allows the client to easily input and maintain data, and that I can easily parse into a tree with Python's csv module? Going with a column for each level isn't without its hiccups (especially if we introduce multiple node types).
For future readers, I ended up using a column-based hierarchy where each row is the complete traversal from the root to a leaf. So you end up with as many rows as there are leaves.
Electronics | Computers | Laptops
Electronics | Computers | Desktop
Electronics | Game Systems | Xbox
Electronics | Game Systems | PS3
Electronics | Game Systems | Wii
Electronics | MP3 Players | iPod Shuffle
Clothing | Menswear | Pants | Shorts
Clothing | Menswear | Pants | Pajamas
In the script, Python traverses row-by-row, cell-by-cell, keeping track of both the current row and the previous row. Since you traverse from left to right, you go from root to leaf. If a cell in the current row is ever different from the cell in the same column of the previous row, then we must have gone down a new branch, and we'll add a new node to our tree. Every cell to the right of that point belongs to the new branch too, even if its value happens to match the previous row.
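For anyone who wants something concrete, here is a minimal sketch of that traversal, not my exact script. It assumes the sheet has been saved as CSV with one level per column, and it stores the tree as nested dicts, which is just one possible representation. The names `build_tree` and `SAMPLE` are made up for illustration.

```python
import csv
import io

# Hypothetical sample data in the row-per-leaf layout, as saved from Excel to CSV.
SAMPLE = """\
Electronics,Computers,Laptops
Electronics,Computers,Desktop
Electronics,Game Systems,Xbox
Clothing,Menswear,Pants,Shorts
Clothing,Menswear,Pants,Pajamas
"""


def build_tree(rows):
    """Build a nested-dict tree from rows that each hold one root-to-leaf path."""
    tree = {}
    prev_row = []    # cell values of the previous row
    prev_nodes = []  # node dicts reached by the previous row, one per column
    for row in rows:
        # Drop blank cells, e.g. the padding Excel adds to shorter rows.
        row = [cell.strip() for cell in row if cell.strip()]
        if not row:
            continue
        parent = tree
        nodes = []
        diverged = False
        for col, name in enumerate(row):
            same = col < len(prev_row) and prev_row[col] == name
            if same and not diverged:
                # Still on the previous row's path: reuse its node.
                node = prev_nodes[col]
            else:
                # New branch: this cell and everything to its right hang off it.
                diverged = True
                node = parent.setdefault(name, {})
            nodes.append(node)
            parent = node
        prev_row, prev_nodes = row, nodes
    return tree


tree = build_tree(csv.reader(io.StringIO(SAMPLE)))
print(tree["Electronics"]["Computers"])
```

Leaves end up as empty dicts. Because new nodes are added with `setdefault`, a path that reappears later in the sheet is merged into the existing branch instead of being duplicated, so the rows don't strictly have to be sorted.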