Alok
Alok

Reputation: 10594

Best approach to convert flat nested data to hierarchical data in python

I am representing category hierarchy in flat manner.
Category Hierarchy is

category1  
    category4
        category6  
    category5  
        category7
category2
category3

I am storing this as list using dictionary

d = [{'id': 1, 'name': 'category1', 'parent_category_id': None, 'level': 1},
     {'id': 2, 'name': 'category2', 'parent_category_id': None, 'level': 1},
     {'id': 3, 'name': 'category3', 'parent_category_id': None, 'level': 1},
     {'id': 4, 'name': 'category4', 'parent_category_id': 1, 'level': 2},
     {'id': 5, 'name': 'category5', 'parent_category_id': 1, 'level': 2},
     {'id': 7, 'name': 'category6', 'parent_category_id': 4, 'level': 3},
     {'id': 7, 'name': 'category7', 'parent_category_id': 5, 'level': 3}]

What can be best approach to convert this category list to hierarchical list like

[{'name': 'category1',
  'subcategory': [{'name': 'category4',
                   'subcategory': [{'name': 'category6', 'subcategory': []}]},
                  {'name': 'category5',
                   'subcategory': [{'name': 'category7', 'subcategory': []}]}]},
 {'name': 'category2', 'subcategory': []},
 {'name': 'category3', 'subcategory': []}]

Upvotes: 1

Views: 3547

Answers (3)

Mike Robins
Mike Robins

Reputation: 1773

Your problem is very similar to that I answered at: Calculating the Path from Parent Child Relationships

I note that you seem to have a lot of superfluous fields in your data structures. Essentially you could represent the information in the post by:

d = {1: {4: {6: None}, 5: {7: None}}, 2: None, 3: None}

Reworking the code for you.

ds = [{'id': 1, 'name': 'category1', 'parent_category_id': None, 'level': 1},
     {'id': 2, 'name': 'category2', 'parent_category_id': None, 'level': 1},
     {'id': 3, 'name': 'category3', 'parent_category_id': None, 'level': 1},
     {'id': 4, 'name': 'category4', 'parent_category_id': 1, 'level': 2},
     {'id': 5, 'name': 'category5', 'parent_category_id': 1, 'level': 2},
     {'id': 6, 'name': 'category6', 'parent_category_id': 4, 'level': 3},
     {'id': 7, 'name': 'category7', 'parent_category_id': 5, 'level': 3}]

e = {1: {4: {6: None}, 5: {7: None}}, 2: None, 3: None}

parents = set()
children = {}
for d in ds:
    c = str(d['id'])
    p = str(d['parent_category_id'])
    if p is not None:
        parents.add(p)
        children[c] = p

# recursively determine parents until child has no parent
def ancestors(p):
    return (ancestors(children[p]) if p in children else []) + [p]

# for each child that has no children print the geneology
for k in (set(children.keys()) - parents):
    print ' '.join(ancestors(k)[1:])

outputs:

3
2
1 5 7
1 4 6

To turn this into a nested dictionary I refer you to What is the best way to implement nested dictionaries?

Upvotes: 3

Mulan
Mulan

Reputation: 135357

We start with a way to make_tree given an index and a root node identity

def make_tree (index, root):
  if not root in index:
    return []
  else:
    return [ make_node (index, child) for child in index[root] ]

Now we need a way to make_node - this is where we convert to an element in your input data to an element of our output tree

def make_node (index, child):
  return \
    { 'name': child['name']
    , 'children': make_tree (index, child['id'])
    }

Now of course we need a way to make_index based on your input data. We use itertools groupby so that we can perform efficient lookup of all child nodes

from itertools import groupby

def make_index (nodes):
  return \
    { k: list (v)
        for (k,v) in
          groupby (nodes, lambda n: n['parent_category_id']) }

Lastly we write main to tie it all together. Note the data is not re-indexed or filtered for each iteration

def main (nodes, root = None):
  return make_tree (make_index (nodes), root)

Full program demonstration

from itertools import groupby

def make_tree (index, root):
  if not root in index:
    return []
  else:
    return [ make_node (index, child) for child in index[root] ]

def make_node (index, child):
  return \
    { 'name': child['name']
    , 'children': make_tree (index, child['id'])
    }

def make_index (nodes):
  return \
    { k: list (v)
        for (k,v) in
          groupby (nodes, lambda n: n['parent_category_id']) }

def main (nodes, root = None):
  return make_tree (make_index (nodes), root)

d = \
  [ {'id': 1, 'name': 'category1', 'parent_category_id': None, 'level': 1}
  , {'id': 2, 'name': 'category2', 'parent_category_id': None, 'level': 1}
  , {'id': 3, 'name': 'category3', 'parent_category_id': None, 'level': 1}
  , {'id': 4, 'name': 'category4', 'parent_category_id': 1, 'level': 2}
  , {'id': 5, 'name': 'category5', 'parent_category_id': 1, 'level': 2}
  , {'id': 7, 'name': 'category6', 'parent_category_id': 4, 'level': 3}
  , {'id': 7, 'name': 'category7', 'parent_category_id': 5, 'level': 3}
  ]

# get sub-tree of [None] from dataset [d]
print (main (d, None))

Program output

[ { 'name': 'category1'
  , 'children': [ { 'name': 'category4'
                  , 'children': [  { 'name': 'category6'
                                  , 'children': []
                                  }
                                ]
                  }
                  , { 'name': 'category5'
                    , 'children': [ { 'name': 'category7'
                                    , 'children': []
                                    }
                                  ]
                    }
                ]
  }
  , { 'name': 'category2', 'children': [] } 
  , { 'name': 'category3', 'children': [] }
]

Upvotes: 1

jakun
jakun

Reputation: 674

def flat_to_hierarchical(d, category_id=None):
    out = list()
    for item in filter(lambda item: item['parent_category_id']==category_id, d):
        out.append(dict(
            name = item['name'],
            subcategories = flat_to_hierarchical(d, item['id'])
        ))
    return out


print(flat_to_hierarchical(d))

Upvotes: 1

Related Questions