Reputation: 405
I'm trying to cycle through each row of a pandas dataframe to build a dictionary of member to parent items.
Every value in the dataframe appears as a member exactly once. If a member has no parent, its parent becomes 'none'.
As an example:
df = pd.DataFrame({'level 5': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i'},
'level 4': {0: 'g1', 1: 'g1', 2: 'g1', 3: 'g1', 4: 'g1', 5: 'g1', 6: 'g2', 7: 'g2', 8: 'g3'},
'level 3': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g6'},
'level 2': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g4'},
'level 1': {0: 'g5', 1: 'g5', 2: 'g5', 3: 'g5', 4: 'g5', 5: 'g5', 6: 'g5', 7: 'g5', 8: 'g5'}})
Which looks like:
level 5 level 4 level 3 level 2 level 1
0 a g1 g4 g4 g5
1 b g1 g4 g4 g5
2 c g1 g4 g4 g5
3 d g1 g4 g4 g5
4 e g1 g4 g4 g5
5 f g1 g4 g4 g5
6 g g2 g4 g4 g5
7 h g2 g4 g4 g5
8 i g3 g6 g4 g5
Note that all but the last row have two consecutive g4's for level 3 and level 2.
I would like to build a dictionary that looks like this:
output = {'a': 'g1', 'g1': 'g4', 'g4': 'g5', 'g5': 'none', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1', 'f': 'g1', 'g': 'g2', 'g2': 'g4', 'h': 'g2', 'i': 'g3', 'g3': 'g6', 'g6': 'g4'}
I've come close by applying a function to each row of df. But I can't accommodate the ragged hierarchy.
Upvotes: 0
Views: 56
Reputation: 30920
One approach
cols = df.columns
my_dict = {}
for key, value in zip(cols[:-1], cols[1:]):
    my_dict.update(dict(zip(df[key], df[value])))
print(my_dict)
{'a': 'g1',
'b': 'g1',
'c': 'g1',
'd': 'g1',
'e': 'g1',
'f': 'g1',
'g': 'g2',
'h': 'g2',
'i': 'g3',
'g1': 'g4',
'g2': 'g4',
'g3': 'g6',
'g4': 'g5',
'g6': 'g4'}
If you want the 'none' values, you can add this at the end:
# cols[-1] is the top-level column; its members have no parent
my_dict.update(dict.fromkeys(df[cols[-1]], 'none'))
print(my_dict)
{'a': 'g1', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1',
'f': 'g1', 'g': 'g2', 'h': 'g2', 'i': 'g3', 'g1': 'g4', 'g2': 'g4',
'g3': 'g6', 'g4': 'g5', 'g6': 'g4', 'g5': 'none'}
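Note that the loop above only handles the repeated g4 because the later level 2 to level 1 update overwrites the temporary 'g4': 'g4' entry. If you would rather not depend on that ordering, here is a minimal sketch that skips self-parent pairs explicitly. The names build_parent_map and root_parent are made up for illustration, and it assumes the columns are ordered from the lowest level (left) to the highest (right), as in the question:

```python
import pandas as pd


def build_parent_map(df, root_parent="none"):
    """Map each member to its parent, reading columns child (left) to parent (right)."""
    parents = {}
    cols = list(df.columns)
    for child_col, parent_col in zip(cols[:-1], cols[1:]):
        for child, parent in zip(df[child_col], df[parent_col]):
            # Skip repeated levels in a ragged hierarchy, e.g. g4 -> g4
            if child != parent:
                parents[child] = parent
    # Members of the last column that never received a parent are roots
    for member in df[cols[-1]]:
        parents.setdefault(member, root_parent)
    return parents


# Same data as in the question, written more compactly
df = pd.DataFrame({
    'level 5': list('abcdefghi'),
    'level 4': ['g1'] * 6 + ['g2'] * 2 + ['g3'],
    'level 3': ['g4'] * 8 + ['g6'],
    'level 2': ['g4'] * 9,
    'level 1': ['g5'] * 9,
})

result = build_parent_map(df)
print(result)
```

On the sample data this should give the same 15 member-to-parent pairs as the output dictionary in the question, although the key order may differ.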
Upvotes: 2