Binay
Binay

Reputation: 19

Doing groupby and finding the most common group

here is a sample of my dataset:

column1 column2
A apple
A banana
A pineapple
B apple
B banana
C grape
C banana
C apple
C mandarine
D apple
D banana

i want to apply the group by method and list out the most common groups that is present in column 1. For example, my output should produce (apple, banana) as the most common group.

Thank you!

Upvotes: 1

Views: 84

Answers (3)

Carmoreno
Carmoreno

Reputation: 1319

You can use DataFrame.GroupBy method and then, you can use count() method:

df = pd.DataFrame(columns=['column1', 'column2'], data =[['A',  'apple'],
['A',   'banana'],
['A',   'pineapple'],
['B',   'apple'],
['B',   'banana'],
['C',   'grape'],
['C',   'banana'],
['C',   'apple'],
['C',   'mandarine'],
['D',   'apple'],
['D',   'banana']])

df_result = df.groupby('column2').count()
df_result = df_result[df_result['column1'] == max(df_result['column1'])]

Output:

   column2  column1
0   apple      4
1   banana     4

Edit: According to the question, you want as output ('apple', 'banana'). So, continuing with the data above, you can do:

result_tuple = tuple((value for _, value in df_result['column2'].items()))

Output: ('apple', 'banana')

Upvotes: 1

BENY
BENY

Reputation: 323226

Something like mode

df.drop_duplicates().column2.mode()
Out[156]: 
0     apple
1    banana
dtype: object

Upvotes: 1

Scott Boston
Scott Boston

Reputation: 153460

Try this:

df.groupby('column2')['column1'].nunique()

Output:

column2
apple        4
banana       4
grape        1
mandarine    1
pineapple    1
Name: column1, dtype: int64

Then,

s = df.groupby('column2')['column1'].nunique()
s[s == s.max()].reset_index()

Output:

  column2  column1
0   apple        4
1  banana        4

Upvotes: 1

Related Questions