bartblons
bartblons

Reputation: 125

Python/Pandas - How to group by two columns and count rows with value from third column between two numbers

I need to group my data frame by two columns and after that count the occurrences of values from third column, which are between 1 and 20.

Data frame:

col1  col2  value
  a     b     1
  a     b     3
  a     b     22
  a     c     0
  a     c     3
  a     c     19

Result:

col1  col2  counter
 a     b      2
 a     c      2

My code:

counter = data_frame.groupby(['column1', 'column2'])[((data_frame['value'] >= 1) & (data_frame['value'] < 20))].sum()

Any ideas?

Upvotes: 4

Views: 2829

Answers (2)

jezrael
jezrael

Reputation: 862581

You need filter first by boolean indexing or query and then groupby with aggregating size:

df = data_frame[(data_frame['value'] >= 1) & (data_frame['value'] < 20)]
df = df.groupby(['col1', 'col2']).size().reset_index(name='counter')
print (df)
  col1 col2  counter
0    a    b        2
1    a    c        2

Or:

df = data_frame.query('value >= 1 & value < 20')
df = df.groupby(['col1', 'col2']).size().reset_index(name='counter')
print (df)
  col1 col2  counter
0    a    b        2
1    a    c        2

What is the difference between size and count in pandas?

Upvotes: 4

Tbaki
Tbaki

Reputation: 1003

You need to filter those value first, then you can use the groupby and the count functions like this :

df[(df.value<=20) & (df.value >= 1)].groupby(['col1','col2']).count().reset_index()

output :

    col1    col2    value
0   a       b       2
1   a       c       2


100 loops, best of 3: 2.5 ms per loop

Upvotes: 0

Related Questions