Reputation: 1602
I am trying to do a Multi level Grouping with end result in List and Dictionary building. So for example if my Dataframe looks like this blow
X Y Z
AAAA BBBB CCCCC
AAAA BBBB DDDDD
AAAA BBBB EEEEE
FFFF GGGG HHHHH
FFFF GGGG IIIII
JJJJ KKKK LLLLL
I am trying to merge Y and Z Columns into Dictionary based on grouping from X and Y. hence the end result be like
P Q
AAAA BBBB:[{CCCCC},{DDDDD},{EEEEE}]
FFFF GGGG:[{HHHHH},{IIIII}]
JJJJ KKKK:[{LLLLL}]
Is this something that is purely achievable by pandas Functions or do i have to loop and then then process ?
Upvotes: 2
Views: 187
Reputation: 23217
You can do it in 2 steps:
SeriesGroupBy.agg()
of Series Z
grouped by X
and Y
.Y
as key and list of sets created in step 1 as value by using df.apply()
with pd.Series()
Codes:
df2 = (df.groupby(['X', 'Y'], as_index=False)['Z']
.agg(lambda x: [{y} for y in x.values]))
df2 = df2.apply(lambda x: pd.Series({'P': x['X'], 'Q': {x['Y']: x['Z']}}), axis=1)
Result:
P Q
0 AAAA {'BBBB': [{'CCCCC'}, {'DDDDD'}, {'EEEEE'}]}
1 FFFF {'GGGG': [{'HHHHH'}, {'IIIII'}]}
2 JJJJ {'KKKK': [{'LLLLL'}]}
Alternatively, if you want to keep all columns in one place, you can also modify the 2nd step to use result_type='expand'
instead of using pd.Series()
, as follows:
df2 = (df.groupby(['X', 'Y'], as_index=False)['Z']
.agg(lambda x: [{y} for y in x.values]))
df2[['P', 'Q']] = df2.apply(lambda x: [x['X'], {x['Y']: x['Z']}], result_type='expand', axis=1)
Result:
X Y Z P Q
0 AAAA BBBB [{CCCCC}, {DDDDD}, {EEEEE}] AAAA {'BBBB': [{'CCCCC'}, {'DDDDD'}, {'EEEEE'}]}
1 FFFF GGGG [{HHHHH}, {IIIII}] FFFF {'GGGG': [{'HHHHH'}, {'IIIII'}]}
2 JJJJ KKKK [{LLLLL}] JJJJ {'KKKK': [{'LLLLL'}]}
Upvotes: 1
Reputation: 195543
You can group by X
and Y
column and the aggregate to list. Then create the dictionaries with .apply
:
df = df.groupby(["X", "Y"], as_index=False).agg(list).rename(columns={"X": "P"})
df["Q"] = df[["Y", "Z"]].apply(lambda x: {x["Y"]: x["Z"]}, axis=1)
print(df[["P", "Q"]])
Prints:
P Q
0 AAAA {'BBBB': ['CCCCC', 'DDDDD', 'EEEEE']}
1 FFFF {'GGGG': ['HHHHH', 'IIIII']}
2 JJJJ {'KKKK': ['LLLLL']}
Upvotes: 1