Reputation: 7842
I have a dataframe:
df = pd.DataFrame({'col0':[[0,1],[1,0,0],[1,0],[1,0],[2,0]],
'col1':[5,4,3,2,1]})
That is:
col0 col1
0 [0, 1] 5
1 [1, 0, 0] 4
2 [1, 0] 3
3 [1, 0] 2
4 [2, 0] 1
I would like to group by the values in col0 and sum the col1 values within each group. I do:
df.groupby('col0').col1.sum()
but this gives TypeError: unhashable type: 'list'. I then do:
df.groupby(df.col0.apply(frozenset)).col1.sum()
which gives:
col0
(0, 1) 14
(0, 2) 1
Name: col1, dtype: int64
That is, the lists were converted into sets (frozensets, to be exact) and then grouped. Neither the number of elements nor their order mattered: [1,0] and [0,1] belong to the same group, as do [1,0] and [1,0,0].
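The two observations above (lists cannot be hashed, and frozenset merges lists that differ only in order or repetition) can be checked in plain Python. This is only a small sketch; the variable names are made up for illustration.

```python
# Three of the list values from col0.
keys = [[0, 1], [1, 0, 0], [1, 0]]

# Lists are mutable, so Python refuses to hash them.
try:
    hash(keys[0])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# frozenset discards order and duplicates, so all three lists
# collapse to the same key, frozenset({0, 1}).
frozen_keys = {frozenset(k) for k in keys}

print(list_is_hashable)  # False
print(len(frozen_keys))  # 1
```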
If the order and number of elements also matter, how do I group by col0 then?
Desired output of grouping by col0 and summing col1 for the dataframe above:
col0
[0, 1]       5
[1, 0, 0]    4
[1, 0]       5
[2, 0]       1
Name: col1, dtype: int64
Upvotes: 1
Views: 81
Reputation: 20669
A tuple is immutable (and therefore hashable), can contain duplicates, and maintains order, so it works as a group key:
df['col0'] = df['col0'].apply(tuple)
df.groupby('col0', sort=False).sum() # sort=False for original order of col0
#            col1
# col0
# (0, 1)        5
# (1, 0, 0)     4
# (1, 0)        5
# (2, 0)        1
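If overwriting col0 is undesirable, one possible variation (a sketch, not part of the answer as posted) is to build the tuple key on the fly and pass it to groupby, selecting col1 explicitly. The names key and result are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col0': [[0, 1], [1, 0, 0], [1, 0], [1, 0], [2, 0]],
                   'col1': [5, 4, 3, 2, 1]})

# Build a hashable key without modifying df; col0 keeps its original lists.
key = df['col0'].apply(tuple)

# sort=False keeps groups in order of first appearance.
result = df.groupby(key, sort=False)['col1'].sum()

print(result.tolist())  # [5, 4, 5, 1]
```

Selecting ['col1'] before summing returns a Series, which is closer to the output shown in the question.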
Upvotes: 2
Reputation: 897
You can convert to string just for grouping:
import pandas as pd
df = pd.DataFrame({'col0':[[0,1],[1,0,0],[1,0],[1,0],[2,0]],
'col1':[5,4,3,2,1]})
# Select col1 explicitly so the list column itself is not aggregated.
df.groupby(df['col0'].astype(str))['col1'].sum()
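One thing to keep in mind with string keys is that the resulting index labels are strings such as '[1, 0]', not lists. If real lists are needed afterwards, a possible approach (an assumption on my part, not something the question asks for) is to parse the labels back with ast.literal_eval from the standard library:

```python
import ast

import pandas as pd

df = pd.DataFrame({'col0': [[0, 1], [1, 0, 0], [1, 0], [1, 0], [2, 0]],
                   'col1': [5, 4, 3, 2, 1]})

# String keys preserve both order and duplicates of the original lists.
summed = df.groupby(df['col0'].astype(str), sort=False)['col1'].sum()

# The index now holds strings; parse them back into Python lists.
labels_as_lists = [ast.literal_eval(label) for label in summed.index]

print(summed.tolist())   # [5, 4, 5, 1]
print(labels_as_lists)   # [[0, 1], [1, 0, 0], [1, 0], [2, 0]]
```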
Upvotes: 1