Tenryuubito
Tenryuubito

Reputation: 13

Python Pandas count function on condition and subset

i have a dataframe like this

F_Class  Product    Packages
Apple    Apple_A          1           
Apple    Apple_A          2           
Apple    Apple_A          1           
Apple    Apple_B          2           
Bananas  Banana_A      n.a.           
Bananas  Banana_A      n.a.           

I want to build the following count function to count the items in my dataframe like shown below.

The result should look like this:

F_Class  Product    Packages    Counter
Apple    Apple_A          1           1
Apple    Apple_A          2           3
Apple    Apple_A          1           4
Apple    Apple_B          2           2
Bananas  Banana_A      n.a.           1
Bananas  Banana_A      n.a.           2

Upvotes: 1

Views: 206

Answers (2)

SeaBean
SeaBean

Reputation: 23217

Use df.groupby() together with df.transform() as follows:

df['Counter'] = (df.groupby(['F_Class','Product'])['Packages']
                   .transform(lambda x: x.eq('2').add(1).cumsum()))


print(df)

   F_Class   Product Packages  Counter
0    Apple   Apple_A        1        1
1    Apple   Apple_A        2        3
2    Apple   Apple_A        1        4
3    Apple   Apple_B        2        2
4  Bananas  Banana_A     n.a.        1
5  Bananas  Banana_A     n.a.        2

If your values in column Packages are integer rather than string, modify '2' to 2:

df['Counter'] = (df.groupby(['F_Class','Product'])['Packages']
                   .transform(lambda x: x.eq(2).add(1).cumsum()))

Upvotes: 0

jezrael
jezrael

Reputation: 862691

If need sum by Packages numbers use DataFrameGroupBy.cumsum with replace missing values to 1:

df['Packages'] = pd.to_numeric(df['Packages'], errors='coerce')

df['Counter'] = (df.assign(Packages = df['Packages'].fillna(1).astype(int))
                   .groupby(['F_Class','Product'])['Packages'].cumsum())
print (df)
   F_Class   Product  Packages  Counter
0    Apple   Apple_A       1.0        1
1    Apple   Apple_A       2.0        3
2    Apple   Apple_A       1.0        4
3    Apple   Apple_B       2.0        2
4  Bananas  Banana_A       NaN        1
5  Bananas  Banana_A       NaN        2

Detail:

print (df.assign(Packages = df['Packages'].fillna(1).astype(int)))
   F_Class   Product  Packages
0    Apple   Apple_A         1
1    Apple   Apple_A         2
2    Apple   Apple_A         1
3    Apple   Apple_B         2
4  Bananas  Banana_A         1
5  Bananas  Banana_A         1

Upvotes: 1

Related Questions