sjd
sjd

Reputation: 1401

Incremental count in pandas

I have the dataset as below. and want to get count of C2 group by C1 as shown below.

C1  C2
a   1
a   2
a   3
a   4
a   5
b   1
b   2
b   3
b   4
b   5
b   6
c   6
c   7

Expected result below. Count of b should be the items not present in a. Same for c ie not present in a or b

a   5
b   1
c   1

is there any logic i can use in pandas to get this. other than iterating through the records

Upvotes: 1

Views: 168

Answers (1)

DeepSpace
DeepSpace

Reputation: 81604

You can remove duplicates on 'C2' column and count the values that left on 'C1':

print(df.drop_duplicates('C2')['C1'].value_counts())
#  a    5
#  b    1
#  c    1
#  Name: C1, dtype: int64

The output will be a series. You can use to_frame if you want a dataframe:

df = df.drop_duplicates('C2')['C1'].value_counts().to_frame()
print(df)
#     C1
#  a   5
#  b   1
#  c   1

Caveat

  • value_counts sorts the output by default. In this case it doesn't make a difference, but you can pass sort=False so the lines (indexes) will stay in the same order as in the original dataframe.

Upvotes: 2

Related Questions