dingaro

Reputation: 2342

Sum values in one column based on specific values in other column

I have a pandas DataFrame, for example:

df = pd.DataFrame({"a":[0,0,1,1,0], "penalty":["12", "15","13","100", "22"]})

How can I sum the values in column "penalty", but only for the rows that have the value 0 in column "a"?

Upvotes: 1

Views: 76

Answers (2)

Natheer Alabsi

Reputation: 2870

Filter the rows that have 0 in column a, then calculate the sum of the penalty column.

import pandas as pd
data ={'a':[0,0,1,1,0],'penalty':[12, 15,13,100, 22]}
df = pd.DataFrame(data)
df[df.a == 0].penalty.sum()

Upvotes: 0

Celius Stingher

Reputation: 18377

You can filter your dataframe with this:

import pandas as pd
data ={'a':[0,0,1,1,0],'penalty':[12, 15,13,100, 22]}
df = pd.DataFrame(data)
print(df.loc[df['a'].eq(0), 'penalty'].sum())

This selects the column penalty from your dataframe for the rows where column a is equal to 0, and then applies .sum(), which returns your expected output (49). The only change I made was to remove the quotation marks so that the values in the penalty column are interpreted as integers and not strings. If the input values have to be strings, you can convert them first with df['penalty'] = df['penalty'].astype(int)
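
For the string case, a minimal sketch could look like the following. It assumes the data from the question (penalty values stored as strings) and that every string is a valid integer; the variable name total is only illustrative.

```python
import pandas as pd

# Data as given in the question: penalty values are strings
df = pd.DataFrame({"a": [0, 0, 1, 1, 0],
                   "penalty": ["12", "15", "13", "100", "22"]})

# Convert the string column to integers before summing
# (assumes every value parses as an integer)
df["penalty"] = df["penalty"].astype(int)

# Sum penalty only for the rows where a equals 0
total = df.loc[df["a"].eq(0), "penalty"].sum()
print(total)
```

Without the conversion, .sum() on a column of strings concatenates them instead of adding them numerically, so the astype(int) step matters here.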

Upvotes: 1
