I am working with d3.js to visualise families of animals (organisms), up to 4000 at a time, as a tree graph, though the data source could just as well be a directory listing or a list of namespaced objects. My data looks like:
json = {
    organisms: [
        {name: 'Hemiptera.Miridae.Kanakamiris'},
        {name: 'Hemiptera.Miridae.Neophloeobia.incisa'},
        {name: 'Lepidoptera.Nymphalidae.Ephinephile.rawnsleyi'},
        ... etc ...
    ]
}
My question is: what is the best way to convert the above data to the hierarchical parent/children data structure used by a number of the d3 visualisations, such as treemap (for a data example, see flare.json in the d3/examples/data/ directory)? Here is an example of the desired data structure:
{"name": "ROOT",
 "children": [
    {"name": "Hemiptera",
     "children": [
        {"name": "Miridae",
         "children": [
            {"name": "Kanakamiris", "children": []},
            {"name": "Neophloeobia",
             "children": [
                {"name": "incisa", "children": []}
             ]}
         ]}
     ]},
    {"name": "Lepidoptera",
     "children": [
        {"name": "Nymphalidae",
         "children": [
            {"name": "Ephinephile",
             "children": [
                {"name": "rawnsleyi", "children": []}
             ]}
         ]}
     ]}
 ]}
EDIT: I have enclosed the original desired data structure inside a ROOT node, so as to conform with the structure of the d3 examples, which have only one master parent node.
I am looking to understand a general design pattern, and as a bonus I would love to see some solutions in JavaScript, PHP, or even Python. JavaScript is my preference. Regarding PHP: the data I am actually using comes from a database call made by a PHP script that encodes the results as JSON. The database results in the PHP script are an ordered array (see below), in case that is of any use for PHP-based answers.
Array
(
[0] => Array
(
['Rank_Order'] => 'Hemiptera'
['Rank_Family'] => 'Miridae'
['Rank_Genus'] => 'Kanakamiris'
['Rank_Species'] => ''
) ........
where:
'Rank_Order'
isParentOf 'Rank_Family'
isParentOf 'Rank_Genus'
isParentOf 'Rank_Species'
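To illustrate how those ranks relate to the dotted names above, here is a minimal sketch in JavaScript. It assumes each row reaches the browser as a JSON object with the four Rank_* keys shown; RANKS and rowToPath are hypothetical names used only for this illustration.

```javascript
// Rank columns in parent-to-child order, as listed above.
var RANKS = ['Rank_Order', 'Rank_Family', 'Rank_Genus', 'Rank_Species'];

// Hypothetical helper: turn one database row into an array of path
// segments, stopping at the first empty rank (e.g. no species recorded).
function rowToPath(row) {
    var path = [];
    for (var i = 0; i < RANKS.length; i++) {
        var value = row[RANKS[i]];
        if (!value) {
            break;
        }
        path.push(value);
    }
    return path;
}

var row = {
    Rank_Order: 'Hemiptera',
    Rank_Family: 'Miridae',
    Rank_Genus: 'Kanakamiris',
    Rank_Species: ''
};
console.log(rowToPath(row).join('.')); // Hemiptera.Miridae.Kanakamiris
```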
I asked a similar question focussed on a PHP solution here, but the only answer does not work on my server and I don't quite understand what is going on. So I want to ask this question from a design pattern perspective, and to include reference to my actual use, which is in JavaScript and d3.js.
The following is specific to the structure you've provided, but it could be made more generic fairly easily. I'm sure the addChild function can be simplified. Hopefully the comments are helpful.
function toHeirarchy(obj) {

    // Get the organisms array; declare children here so it is
    // not created as an implicit global
    var orgName, children, orgNames = obj.organisms;

    // Make root object
    var root = {name: 'ROOT', children: []};

    // For each organism, get the name parts
    for (var i = 0, iLen = orgNames.length; i < iLen; i++) {
        orgName = orgNames[i].name.split('.');

        // Start from root.children
        children = root.children;

        // For each part of name, get child if already have it
        // or add new object and child if not
        for (var j = 0, jLen = orgName.length; j < jLen; j++) {
            children = addChild(children, orgName[j]);
        }
    }
    return root;

    // Helper function, iterates over children looking for
    // name. If found, returns its child array, otherwise adds a new
    // child object and child array and returns it.
    // (Function declarations are hoisted, so it can follow the return.)
    function addChild(children, name) {

        // Look for name in children
        for (var i = 0, iLen = children.length; i < iLen; i++) {

            // If find name, return its child array
            if (children[i].name === name) {
                return children[i].children;
            }
        }
        // If didn't find name, add a new object and
        // return its child array
        children.push({'name': name, 'children': []});
        return children[children.length - 1].children;
    }
}
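As a rough sketch of what a more generic version might look like, the function below takes a plain array of path strings and a separator, so it could also handle directory listings or namespaced objects. It swaps the linear scan in addChild for a lookup table keyed by path prefix. The names pathsToTree, separator, rootName and index are hypothetical, chosen for this illustration only.

```javascript
// Hypothetical generic variant: build a {name, children} tree from
// separator-delimited path strings.
function pathsToTree(paths, separator, rootName) {
    var root = {name: rootName || 'ROOT', children: []};

    // Maps each path prefix (e.g. 'Hemiptera.Miridae') to the node
    // already created for it. Object.create(null) avoids clashes with
    // inherited keys such as 'constructor'.
    var index = Object.create(null);

    for (var i = 0; i < paths.length; i++) {
        var parts = paths[i].split(separator);
        var parent = root;
        var key = '';

        for (var j = 0; j < parts.length; j++) {
            // Extend the prefix by one segment
            key = j === 0 ? parts[j] : key + separator + parts[j];

            var node = index[key];
            if (!node) {
                // First time this prefix is seen: create and attach it
                node = {name: parts[j], children: []};
                index[key] = node;
                parent.children.push(node);
            }
            parent = node;
        }
    }
    return root;
}

var names = [
    'Hemiptera.Miridae.Kanakamiris',
    'Hemiptera.Miridae.Neophloeobia.incisa',
    'Lepidoptera.Nymphalidae.Ephinephile.rawnsleyi'
];
var tree = pathsToTree(names, '.');
console.log(JSON.stringify(tree, null, 2));
```

For the structure in the question, the paths would be obtained with something like json.organisms.map(function (o) { return o.name; }). With a few thousand names either approach should be fast enough; the lookup table mainly helps when a single parent has many children.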