Reputation: 81
I have my data in pandas data frame as follows:
df = pd.DataFrame({'a':[1,2,3,3,4,4,4], 'b':[2,3,4,4,5,5,5]})
So the dataframe looks like this:
a b
0 1 2
1 2 3
2 3 4
3 3 4
4 4 5
5 4 5
6 4 5
The column 'a','b' combination here are: 12(1), 23(1), 34(2), 45(3). I am trying to select 4 and 5 and print them out because their combination has most occurrences (3 times).
My code is:
counts = df.groupby(['a','b']).size().sort_values(ascending=False)
print(counts)
Output:
a b
4 5 3
3 4 2
2 3 1
1 2 1
dtype: int64
But this only gives me a column [3,2,1,1]. This are the numbers combination counts. How can I access elements 4 and 5 individually so I can print them out?
Thanks in advance!
Upvotes: 7
Views: 7365
Reputation: 8816
Simplest one to use mode in pandas DataFrame. It'll give a most frequent values across the rows or columns:
>>> df
a b
0 1 2
1 2 3
2 3 4
3 3 4
4 4 5
5 4 5
6 4 5
>>> df.mode()
a b
0 4 5
Upvotes: 1
Reputation: 323226
Using idxmax
, even the result is inorder, you still can find the index of max value
df.groupby(['a','b']).size().idxmax()
Out[15]: (4, 5)
Upvotes: 3
Reputation: 164623
Pandas GroupBy
objects are indexed by grouper key(s). In the case of multiple keys, this will mean a MultiIndex
. You can just extract the first index of your result to give a tuple
representing the most common combination:
counts.index[0] # (4, 5)
Upvotes: 2