PCMitchell
PCMitchell

Reputation: 91

Count and sum non-zero values by group Pandas

I'm trying to group items by ID then count the number of non-zero values by ID and assign that value to a new column. I also need to sum those PMN values by ID then drop duplicate ID values. For example:

d = {'ID': [156, 156, 156, 156, 157, 157, 157, 157],'PMN': [8.6, 8.4, 0, 10, 20, 0, 0, 5]}
df = pd.DataFrame(data=d)

I thought this might work for count and sum:

df['NodeDegree'] = df.groupby(['ID'])['PMN'].count()
df['UpdateFlow'] = df.groupby(['ID'])['PMN'].sum()
df.drop_duplicates(subset=['ID'], keep='last', inplace=True)

But it just return NaN for the NodeDegree values, and i'm not incorporating the !=0. Also, is it possible to make these changes in place?

Pointers please :)

Thanks!

Upvotes: 1

Views: 506

Answers (1)

Andrej Kesely
Andrej Kesely

Reputation: 195438

I hope I've understood you correctly:

x = df.groupby("ID")["PMN"].agg(
    NodeDegree=lambda x: x[x != 0].count(), UpdateFlow="sum"
)
print(x)

Prints:

     NodeDegree  UpdateFlow
ID                         
156           3        27.0
157           2        25.0

Upvotes: 1

Related Questions