Reputation: 165
I am struggling to form a tree like dataframe with children rows right under their parents. What I am trying to do is something between merging on object_id x parent_id and concatenating along axis 0. So what I am looking for is implementation of the interlace function in the snippet below.
In[1]: parents = pd.DataFrame({'object_id':[1,2],
'parent_id':[0,0],
'position': [1,2]})
In[2]: parents
Out[2] object_id parent_id position
0 1 0 1
1 2 0 2
In[3]: children = pd.DataFrame({'object_id':[3,4,5],
'parent_id':[1,1,2],
'position': [1,2,1]})
In[4]: children
Out[4]: object_id parent_id position
0 3 1 1
1 4 1 2
2 5 2 1
In[5]: interlace(parent, children, on=('object_id', 'parent_id'))
Out[5]: object_id parent_id position
0 1 0 1
1 3 1 1
2 4 1 2
3 2 0 1
4 5 2 1
Is there an efficient way to do this in pandas? I think one could do something like
parents_with_children = []
for i, parentrow in parents.iteritems():
childrenrows = children[children.parent_id == parentrow.object_id]
parents_with_children.append(pd.concat([parentrow, childrenrows])
result = pd.concat(parents_with_children)
But I feel there should be and easier and probably more efficient way to do this.
edit: Rows with the same level and same parent need to stay sorted by their position.
Upvotes: 1
Views: 353
Reputation: 924
A possible solution:
children['sort_id']=children.parent_id
parents['sort_id']=parents.object_id
pd.concat([parents,children]).sort_values(['sort_id', 'parent_id']).drop('sort_id', 1)
Upvotes: 1