Matt

Reputation: 813

Getting back data in a Pandas DataFrame after a groupby

I have a DataFrame df with information about elements, for instance:

import pandas as pd

df = pd.DataFrame([[1, 45, 12], [1, 8, 13], [1, 3, 4], [2, 5, 1], [2, 7, 3]],
                  columns=['group', 'value1', 'value2'])

I have used something like dfGroup = df.groupby('group').apply(my_agg).reset_index(), so now I have dfGroup, say

dfGroup = pd.DataFrame([[1,4],[2,27]], 
                      columns=['group', 'valuegroup'])

Now I need to bring the group information back to the elements, so that I can build new columns that take both element data and group data into account. To keep it simple, let's say I need to build a valuegroup column in df that is identical to the dfGroup data. So I would obtain

    group   value1  value2  valuegroup
0   1       45      12      4
1   1       8       13      4
2   1       3       4       4
3   2       5       1       27
4   2       7       3       27

What's the best way to do that? (If possible, something that would work with Python 2 and 3)

Upvotes: 1

Views: 80

Answers (2)

jezrael

Reputation: 862511

I think you need transform:

df['new'] = df.groupby('group')['value1'].transform(my_agg)
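
To make the transform option concrete, here is a minimal sketch. It assumes my_agg reduces a single column to one value per group; the max of value1 used below is only a hypothetical stand-in, since the question does not show the real function.

```python
import pandas as pd

df = pd.DataFrame([[1, 45, 12], [1, 8, 13], [1, 3, 4], [2, 5, 1], [2, 7, 3]],
                  columns=['group', 'value1', 'value2'])


def my_agg(values):
    # Hypothetical aggregation (not the asker's): largest value in the group.
    return values.max()


# transform calls my_agg once per group and broadcasts the scalar result
# back to every row of that group, so the output aligns with df's index.
df['valuegroup'] = df.groupby('group')['value1'].transform(my_agg)
print(df)
```

The result has the same length as df, so no merge or join is needed afterwards. This only fits if the aggregation can be computed from one column at a time.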

Or merge:

df = pd.merge(df, dfGroup, on='group', how='left')
print (df)
   group  value1  value2  valuegroup
0      1      45      12           4
1      1       8      13           4
2      1       3       4           4
3      2       5       1          27
4      2       7       3          27

Or, if it is possible to omit reset_index, use join:

dfGroups = df.groupby('group').apply(my_agg)
df = df.join(dfGroups, on='group')
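
A runnable sketch of the same idea, assuming my_agg receives each group's rows and returns a single number (the function below is hypothetical, not the asker's). In that case apply returns an unnamed Series, so it is renamed before the join to give the new column its name:

```python
import pandas as pd

df = pd.DataFrame([[1, 45, 12], [1, 8, 13], [1, 3, 4], [2, 5, 1], [2, 7, 3]],
                  columns=['group', 'value1', 'value2'])


def my_agg(g):
    # Hypothetical aggregation using two columns of the group at once.
    return (g['value1'] - g['value2']).sum()


# Select the value columns explicitly so only they are passed to my_agg;
# the result is a Series indexed by the group key.
dfGroups = (df.groupby('group')[['value1', 'value2']]
              .apply(my_agg)
              .rename('valuegroup'))

# join matches df's 'group' column against the index of dfGroups.
df = df.join(dfGroups, on='group')
print(df)
```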

Sample:

dfGroup = pd.DataFrame([4,27], 
                      columns=['valuegroup'], index=[1,2])

print (dfGroup)
   valuegroup
1           4
2          27

df = df.join(dfGroup, on='group')
print (df)
   group  value1  value2  valuegroup
0      1      45      12           4
1      1       8      13           4
2      1       3       4           4
3      2       5       1          27
4      2       7       3          27

Upvotes: 1

cstainbrook

Reputation: 95

df.set_index('group', inplace=True)
dfGroup.set_index('group', inplace=True)
df['valuegroup'] = dfGroup['valuegroup']
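
Note that the snippet above moves group into the index of both frames. If df should keep its original index, a possible variant is Series.map, which is a substitution on my part and not part of the code above. A sketch using the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame([[1, 45, 12], [1, 8, 13], [1, 3, 4], [2, 5, 1], [2, 7, 3]],
                  columns=['group', 'value1', 'value2'])
dfGroup = pd.DataFrame([[1, 4], [2, 27]],
                       columns=['group', 'valuegroup'])

# Build a lookup Series (group -> valuegroup) and map each row's group through it.
lookup = dfGroup.set_index('group')['valuegroup']
df['valuegroup'] = df['group'].map(lookup)
print(df)
```

Neither df nor dfGroup is modified in place here, apart from the new column added to df.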

Upvotes: 1
