Lukas Tomek

Reputation: 96

Pandas sum and agg(sum) yield different values

I have a pandas DataFrame that I want to group by two columns and aggregate with the sum function.

When I use just df['col1'].sum(), I get a different result than the total after the groupby aggregation:

data_grouped = data[['col2','col3','col1']].groupby(['col2','col3'])['col1'].sum()

I have already tried using dropna=False, but I get the same result. The total after the aggregation is lower than the plain sum over the dataset.

Where could the mistake be?

Upvotes: 0

Views: 242

Answers (1)

lareb

Reputation: 92

import numpy as np

# Check for missing values in 'col1'
missing_values = data['col1'].isnull().sum()
print("Number of missing values in 'col1':", missing_values)

# Option 1: drop rows where 'col1' is missing, then group and sum
data_cleaned = data.dropna(subset=['col1'])
data_grouped = data_cleaned.groupby(['col2', 'col3'])['col1'].sum()

# Option 2: replace missing 'col1' values with 0, then group and sum
data_filled = data.fillna({'col1': 0})
data_grouped = data_filled.groupby(['col2', 'col3'])['col1'].sum()

# Option 3: aggregate with a NaN-aware sum
data_grouped = data.groupby(['col2', 'col3'])['col1'].agg(np.nansum)
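
It may also be worth checking the grouping columns themselves, not only 'col1'. By default, groupby drops every row whose key is NaN, so those rows never reach the sum, while df['col1'].sum() still counts them. The sketch below uses made-up toy data (the values are an assumption, not the asker's dataset) and passes dropna=False to groupby() itself, which requires pandas 1.1 or later. Where dropna=False was used in the question is not shown, so this is only one possible explanation.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one row has a missing grouping key in 'col2'
data = pd.DataFrame({
    "col2": ["a", "a", np.nan, "b"],
    "col3": ["x", "x", "y", "y"],
    "col1": [1.0, 2.0, 10.0, 4.0],
})

# Plain column sum: counts every row
plain_total = data["col1"].sum()

# Default groupby (dropna=True) silently skips rows whose key is NaN
grouped_default = data.groupby(["col2", "col3"])["col1"].sum()

# dropna=False keeps NaN keys as their own group (pandas >= 1.1)
grouped_keep = data.groupby(["col2", "col3"], dropna=False)["col1"].sum()

# Rows that the default groupby would leave out
rows_with_missing_keys = data[data[["col2", "col3"]].isna().any(axis=1)]

print("plain sum:          ", plain_total)
print("grouped (default):  ", grouped_default.sum())
print("grouped (keep NaN): ", grouped_keep.sum())
print(rows_with_missing_keys)
```

If rows_with_missing_keys is non-empty on the real data, the difference between the two totals should equal the sum of 'col1' over those rows.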

Upvotes: 1
