Reputation: 25
I have the following dataframe:
      name style  val1  val2  val3
23    sher     D     2     5     6
56    sher     C     3     2     4
34   David     A     1     1     1
47   iamgo     B     4     4     3
77    para     A     6     4     2
120   moli     A     7     2     5
86    para     A     5     4     1
I want to create a new dataframe by grouping on "name" that returns the following:
      style  val1  val2  val3
name
sher    D,C     3     5     4
David     A     1     1     1
iamgo     B     4     4     3
para      A     6     4     1
moli      A     7     2     5
For "style" I want to join the values, adding a value only when it is not already present (so "para" stays "A" while "sher" becomes "D,C"); for "val1" and "val2" I want the maximum value; for "val3" the minimum value; and I want to reset the index. Here's my code:
df.groupby('name').agg({
    'style': sum,
    'val1': max,
    'val2': max,
    'val3': min
})
Output:
      style  val1  val2  val3
name
sher     DC     3     5     4
David     A     1     1     1
iamgo     B     4     4     3
para     AA     6     4     1
moli      A     7     2     5
What am I missing here?
Thanks,
Upvotes: 1
Views: 39
Reputation: 862406
Use ','.join instead of sum:
df1 = df.groupby('name').agg({
    'style': ','.join,
    'val1': max,
    'val2': max,
    'val3': min
})
print(df1)
      style  val1  val2  val3
name
David     A     1     1     1
iamgo     B     4     4     3
moli      A     7     2     5
para    A,A     6     4     1
sher    D,C     3     5     4
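To see why sum produced "DC" in the question: for string values, summing amounts to repeated + concatenation with no separator, whereas ','.join places a comma between the items. A minimal pure-Python sketch of that difference (the list name styles is only illustrative, and it mimics the behaviour rather than calling pandas):

```python
from functools import reduce
import operator

# The "style" values of the "sher" group from the question
styles = ["D", "C"]

# Adding strings with + concatenates them with no separator
summed = reduce(operator.add, styles)

# str.join places the separator between the items
joined = ",".join(styles)

print(summed)  # DC
print(joined)  # D,C
```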
If you need unique values, convert the values to sets:
df2 = df.groupby('name').agg({
    'style': lambda x: ','.join(set(x)),
    'val1': max,
    'val2': max,
    'val3': min
})
print(df2)
      style  val1  val2  val3
name
David     A     1     1     1
iamgo     B     4     4     3
moli      A     7     2     5
para      A     6     4     1
sher    D,C     3     5     4
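One caveat: a set does not preserve order, so "D,C" may also come out as "C,D". The sketch below is one possible variant, not the only way to do it: it removes duplicates in first-seen order with dict.fromkeys, keeps the groups in their original order with sort=False, and resets the index as the question asks. The helper name join_unique is an assumption made for illustration, and the string names 'max' and 'min' are used in place of the builtins.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "name": ["sher", "sher", "David", "iamgo", "para", "moli", "para"],
        "style": ["D", "C", "A", "B", "A", "A", "A"],
        "val1": [2, 3, 1, 4, 6, 7, 5],
        "val2": [5, 2, 1, 4, 4, 2, 4],
        "val3": [6, 4, 1, 3, 2, 5, 1],
    },
    index=[23, 56, 34, 47, 77, 120, 86],
)


def join_unique(values):
    # Hypothetical helper: dict.fromkeys drops duplicates
    # while keeping the order in which values first appear
    return ",".join(dict.fromkeys(values))


df3 = (
    df.groupby("name", sort=False)  # sort=False keeps first-appearance order
    .agg({"style": join_unique, "val1": "max", "val2": "max", "val3": "min"})
    .reset_index()  # turn "name" back into a regular column
)
print(df3)
```

With sort=False the rows follow the order in which each name first appears in df, which matches the order of the desired output in the question.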
Upvotes: 1