Reputation: 157
I am using the rolling
funtion of Pandas for create a Rolling Year Aggregation (SUM). But it is a liitle peculiar rolling, because y need that the rolling year (12 months) only sum
when a condition is true. The condition is based in one column called NIM/F&B
, so I need that the rolling funtion only works when the label of that column is "NIM"
.
This is how I start my code:
con_ = con_.groupby(["ANIO","MES NUMERO","CATEGORIA","DATE"]).sum().sort_values(by=["ANIO","MES NUMERO","CATEGORIA","DATE"], ascending = True)
#RY_12
f = lambda x:x.rolling(12).sum()
#RY_24
f = lambda x:x.rolling(24).sum()
con_ = con_.reset_index()
con_["NIM/F&B"] = np.where(con_['CATEGORIA'] == 'Soluble Coffee', "F&B",
np.where(con_['CATEGORIA'] == 'Milk Modifiers', "F&B",
np.where(con_['CATEGORIA'] == 'Pet Care', "NIM",
np.where(con_['CATEGORIA'] == 'Baking Chocolate', "F&B",
np.where(con_['CATEGORIA'] == 'Impulse Chocolate', "F&B",
np.where(con_['CATEGORIA'] == 'Culinary Solutions', "F&B",
np.where(con_['CATEGORIA'] == 'CPW', "NIM",
np.where(con_['CATEGORIA'] == 'Waters', "F&B",
np.where(con_['CATEGORIA'] == 'Creamers', "F&B",
np.where(con_['CATEGORIA'] == 'Dehydrated Bouillons', "F&B",
np.where(con_['CATEGORIA'] == 'Powder Milk', "F&B",
np.where(con_['CATEGORIA'] == 'M&D', "NIM",
np.where(con_['CATEGORIA'] == 'Infant Cereals', "NIM",
np.where(con_['CATEGORIA'] == 'Total Infant Formulas', "NIM",
np.where(con_['CATEGORIA'] == 'GUMS', "NIM",
np.where(con_['CATEGORIA'] == 'Portion Coffee', "F&B",
np.where(con_['CATEGORIA'] == 'R&G Coffee', "F&B",
np.where(con_['CATEGORIA'] == 'Snacks NH', "NIM",
"NIM"))))))))))))))))))
con_ = con_.set_index(["ANIO","MES NUMERO","CATEGORIA","DATE"])
con_.head(80)
Then my attempt to create what I need was as follows:
con_["F&B [$]"] = con_.loc[con_["NIM/F&B"] == "F&B",['VtasValor']].apply(f)
con_["F&B [Vol.]"] = con_.loc[con_["NIM/F&B"] == "F&B",['VtasUnidadesEQ']].apply(f)
And the output was like that:
As you can see is not the final goal for two reasons:
Could anybody help me to create correctly the code.
Thanks in advance!
Upvotes: 0
Views: 50
Reputation: 4543
It turns out that the rolling sends the numerical columns one by one to the function, so you can't access them in the apply function together to filter. So, just filter before or after the rolling, base on your need. Use:
import pandas as pd
df = pd.DataFrame({'Era':pd.date_range('2016','2022', 6), 'NIM/F':['NIM', 'NIM', 'NIM', 'F', 'F', 'F'], 'val':range(6)})
df2 = df[df['NIM/F'] == 'F']
df2.rolling(2).sum()
Output:
For the month question, set the date as index then use:
df.rolling('365D').sum()
Upvotes: 1