Reputation: 5117
I am using the following source code:
import numpy as np
import pandas as pd
# Load data
data = pd.read_csv('C:/Users/user/Desktop/Daily_to_weekly.csv', keep_default_na=True)
print(data.shape[1])
# 18
# Create weekly data
# Aggregate by calculating the sum per store for every week
data_weekly = data.groupby(['STORE_ID', 'WEEK_NUMBER'], as_index=False).agg('sum')
print(data_weekly.shape[1])
# 17
As you can see, a column is missing after the aggregation, and it is neither of the groupby columns ('STORE_ID', 'WEEK_NUMBER').
Why is this happening and how can I fix it?
Upvotes: 3
Views: 3862
Reputation: 31
I've run into this problem numerous times before. pandas is dropping one of your columns because it has identified it as a "nuisance" column, meaning the aggregation you are attempting (here, a sum) cannot be applied to it; this typically happens with non-numeric columns. If you wish to preserve this column, I would recommend including it in the groupby keys.
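As a rough sketch of how you might find the affected column and keep it, here is a small example. The frame below is made up for illustration (the `REGION` and `SALES` columns are assumptions, not taken from your CSV), and the exact dropping behavior depends on your pandas version, so treat this as a starting point rather than a definitive fix.

```python
import pandas as pd

# Hypothetical stand-in for the daily data; REGION and SALES are invented columns.
data = pd.DataFrame({
    "STORE_ID": [1, 1, 1, 2],
    "WEEK_NUMBER": [1, 1, 2, 1],
    "REGION": ["north", "north", "north", "south"],
    "SALES": [10.0, 5.0, 7.0, 3.0],
})

keys = ["STORE_ID", "WEEK_NUMBER"]

# Non-numeric columns are the usual candidates for being dropped by a sum.
non_numeric = data.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# Option 1: add the non-numeric column(s) to the groupby keys.
weekly_a = data.groupby(keys + non_numeric, as_index=False).agg("sum")

# Option 2: keep the original keys and choose an aggregation per column,
# summing the numeric columns and taking the first value of the others.
agg_spec = {col: "sum" for col in data.columns if col not in keys + non_numeric}
agg_spec.update({col: "first" for col in non_numeric})
weekly_b = data.groupby(keys, as_index=False).agg(agg_spec)

print(weekly_a.shape[1], weekly_b.shape[1])
```

Option 1 only makes sense if the extra column is constant within each store and week; otherwise it splits the groups further. Option 2 keeps one row per store and week, but 'first' silently discards any other values in the group.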
Upvotes: 3