Shawn M

Reputation: 55

Python - Pandas DF - sum values in a column that match a condition in another column

I would like to sum the values in one column based on a condition in another column. This works when the condition matches at least one row, but when it matches none I get an error. I need the code to accept that the value doesn't exist and move on to the next step.

Example df:

import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "Hadoop", "Spark", "Python"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 25000, 25000, 22000],
    'Duration': ['30days', '50days', '55days', '40days', '60days', '35days', '55days', '50days'],
}
df = pd.DataFrame(technologies, columns=['Courses','Fee','Duration'])
print(df)
   Courses    Fee Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2   Hadoop  23000   55days
3   Python  24000   40days
4   Pandas  26000   60days
5   Hadoop  25000   35days
6    Spark  25000   55days
7   Python  22000   50days

For this example, I would like to sum the fee for all rows that have "55days":

duration = df.groupby('Duration')['Fee'].sum()["55days"]
print(duration)
48000

# But if I choose a value that does not appear under Duration, like "22days", I get a KeyError

duration22 = df.groupby('Duration')['Fee'].sum()["22days"]

Can you please advise how I can code this so that, if the value "22days" happens not to exist on a given run, it does not fail, or just returns 0 instead?

Upvotes: 0

Views: 465

Answers (1)

sitting_duck

Reputation: 3720

You could check whether the key is in the grouped result's index before looking it up, and return 0 if it is not.

gd_sum = df.groupby('Duration')['Fee'].sum()

def dur_sum(k):
    return gd_sum[k] if k in gd_sum.index else 0


print(dur_sum('55days'))
48000

print(dur_sum('22days'))
0
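
Two shorter variants may also work here; this is a sketch that rebuilds the question's df so it runs on its own. Series.get returns a default instead of raising KeyError when the label is missing, and a boolean mask avoids the groupby entirely, since summing an empty selection gives 0. The helper name fee_for is hypothetical, not something from the question.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "Hadoop", "Spark", "Python"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 25000, 25000, 22000],
    'Duration': ['30days', '50days', '55days', '40days', '60days', '35days', '55days', '50days'],
})

gd_sum = df.groupby('Duration')['Fee'].sum()

# Variant 1: Series.get falls back to the default (0) for a missing label.
print(gd_sum.get('55days', 0))
print(gd_sum.get('22days', 0))

# Variant 2: filter with a boolean mask, then sum.
# fee_for is a hypothetical helper name used only for illustration.
def fee_for(duration):
    return df.loc[df['Duration'] == duration, 'Fee'].sum()

print(fee_for('55days'))
print(fee_for('22days'))
```

The groupby version is worth keeping if you look up many durations, because the sums are computed once. The mask version rescans the column on every call but needs no intermediate Series.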

Upvotes: 1
