Conditional Sum of a column python pandas

Question

I'm very new to python and pandas and was looking for some help. I'm working off of a CSV, and trying to use pandas to calculate totals for each name based on the value of the middle column. I want the sum of 'count' for each name for 'GEN' and 'NPR' added together. This is my dataset as a CSV:

StartingCSV.csv:

Name, Specialty, Count
Smith, GEN, 1
Smith, INT, 2
Smith, NPR, 5
Smith, PSC, 4
Zane, GEN, 3
Zane, PSC, 4
Zane, NPR, 4
Charles, NPR, 4
Charles, AUD, 4

Desired output:

Smith: 6
Zane: 7
Charles: 4

This is what I have so far:

import csv
import pandas as pd
df = pd.read_csv("StartingCSV.csv")
newdf = df.groupby("Name")
newdf.apply(lambda x: x[x['Specialty'] == 'NPR']['Count'].sum())

This is the output I get:

Smith: 5
Zane: 4
Charles: 4

This returns the number of NPR for each name, but I can't figure out a way to ADD the NPR values to the GEN values for each name, to create the desired output as listed above. Trying to add an "or" after 'NPR' in the final line returns an error like this:

newdf.apply(lambda x: x[x['Specialty'] == 'NPR' or 'GEN']['Count'].sum())

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Any help is appreciated! (sorry if I'm way off base or if this is unclear)

Conditional Sum of a column python pandas

Answers (1)

Related Questions