Reputation: 2243
I have a dataframe that looks like:
>>> df = pd.DataFrame({'id':[1,1,1,1,1,2,2,2],'month':[1,1,2,2,2,1,2,2],'value1':[1,1,3,3,5,6,7,7], 'value2': [9,10,11,12,12,14,15,15], 'others': range(8)})
>>> df
   id  month  value1  value2  others
0   1      1       1       9       0
1   1      1       1      10       1
2   1      2       3      11       2
3   1      2       3      12       3
4   1      2       5      12       4
5   2      1       6      14       5
6   2      2       7      15       6
7   2      2       7      15       7
I want to apply a custom function, whose input is a Series, to value1 and value2:
def get_most_common(srs):
    """
    Returns the most common value in a list. For ties, it returns whatever
    value collections.Counter.most_common(1) gives.
    """
    from collections import Counter
    x = list(srs)
    my_counter = Counter(x)
    most_common_value = my_counter.most_common(1)[0][0]
    return most_common_value
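For reference, here is a minimal sketch of how the function behaves on a single Series. The name `sample` is made up for illustration and is not part of my real data:

```python
from collections import Counter

import pandas as pd


def get_most_common(srs):
    """Return the most common value; ties go to Counter.most_common(1)."""
    return Counter(list(srs)).most_common(1)[0][0]


# Hypothetical Series, mirroring value1 for id=1, month=2.
sample = pd.Series([3, 3, 5])
print(get_most_common(sample))  # 3
```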
Expected result:
          value1  value2
id month
1  1           1       9
   2           3      12
2  1           6      14
   2           7      15
The function was written like that because initially I only had to apply it to a single column (value1), so df = df.groupby(['id','month'])['value1'].apply(get_most_common) worked. Now I have to apply it to two columns simultaneously.
Attempts:
df = df.groupby(['id','month'])[['value1','value2']].apply(get_most_common)
gave:
id  month
1   1        value1
    2        value1
2   1        value1
    2        value1
df = df.groupby(['id','month'])[['value1','value2']].transform(get_most_common)
gave this:
   value1  value2
0       1       9
1       1       9
2       3      12
3       3      12
4       3      12
5       6      14
6       7      15
7       7      15
applymap doesn't work. What am I missing here?
Upvotes: 2
Views: 367
Reputation: 862511
Use GroupBy.agg - it runs the function for each column separately:
df = df.groupby(['id','month'])[['value1','value2']].agg(get_most_common)
print (df)
          value1  value2
id month
1  1           1       9
   2           3      12
2  1           6      14
   2           7      15
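As for why apply returned 'value1' everywhere: apply passes each group to the function as a whole DataFrame, and iterating over a DataFrame yields its column labels rather than its values. A self-contained sketch using the data from the question (the names `column_labels` and `result` are only for illustration):

```python
from collections import Counter

import pandas as pd


def get_most_common(srs):
    """Return the most common value; ties go to Counter.most_common(1)."""
    return Counter(list(srs)).most_common(1)[0][0]


df = pd.DataFrame({
    'id': [1, 1, 1, 1, 1, 2, 2, 2],
    'month': [1, 1, 2, 2, 2, 1, 2, 2],
    'value1': [1, 1, 3, 3, 5, 6, 7, 7],
    'value2': [9, 10, 11, 12, 12, 14, 15, 15],
    'others': range(8),
})

# list() on a DataFrame gives the column labels, so inside apply the
# Counter only ever sees ['value1', 'value2'] and picks 'value1'.
column_labels = list(df[['value1', 'value2']])
print(column_labels)

# agg calls the function once per selected column within each group,
# so each call receives a Series of values.
result = df.groupby(['id', 'month'])[['value1', 'value2']].agg(get_most_common)
print(result)
```

Note the double brackets when selecting the two columns: indexing a groupby with a bare tuple of names is no longer accepted by recent pandas versions.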
Upvotes: 1