Rolling Year with Condition

I am using the rolling funtion of Pandas for create a Rolling Year Aggregation (SUM). But it is a liitle peculiar rolling, because y need that the rolling year (12 months) only sum when a condition is true. The condition is based in one column called NIM/F&B, so I need that the rolling funtion only works when the label of that column is "NIM".

This is how I start my code:

con_ = con_.groupby(["ANIO","MES NUMERO","CATEGORIA","DATE"]).sum().sort_values(by=["ANIO","MES NUMERO","CATEGORIA","DATE"], ascending = True)


#RY_12
f = lambda x:x.rolling(12).sum()

#RY_24
f = lambda x:x.rolling(24).sum()

con_  = con_.reset_index()

con_["NIM/F&B"] = np.where(con_['CATEGORIA'] == 'Soluble Coffee', "F&B",
                 np.where(con_['CATEGORIA'] == 'Milk Modifiers', "F&B",
                 np.where(con_['CATEGORIA'] == 'Pet Care', "NIM",
                 np.where(con_['CATEGORIA'] == 'Baking Chocolate', "F&B",
                 np.where(con_['CATEGORIA'] == 'Impulse Chocolate', "F&B",
                 np.where(con_['CATEGORIA'] == 'Culinary Solutions', "F&B",
                 np.where(con_['CATEGORIA'] == 'CPW', "NIM",
                 np.where(con_['CATEGORIA'] == 'Waters', "F&B",
                 np.where(con_['CATEGORIA'] == 'Creamers', "F&B",
                 np.where(con_['CATEGORIA'] == 'Dehydrated Bouillons', "F&B",
                 np.where(con_['CATEGORIA'] == 'Powder Milk', "F&B",
                 np.where(con_['CATEGORIA'] == 'M&D', "NIM",
                 np.where(con_['CATEGORIA'] == 'Infant Cereals', "NIM",
                 np.where(con_['CATEGORIA'] == 'Total Infant Formulas', "NIM", 
                 np.where(con_['CATEGORIA'] == 'GUMS', "NIM",    
                 np.where(con_['CATEGORIA'] == 'Portion Coffee', "F&B",                                                
                 np.where(con_['CATEGORIA'] == 'R&G Coffee', "F&B",
                 np.where(con_['CATEGORIA'] == 'Snacks NH', "NIM",
                 "NIM"))))))))))))))))))

con_ = con_.set_index(["ANIO","MES NUMERO","CATEGORIA","DATE"])

    
con_.head(80)

Then my attempt to create what I need was as follows:

con_["F&B [$]"]  = con_.loc[con_["NIM/F&B"] == "F&B",['VtasValor']].apply(f)
con_["F&B [Vol.]"]  = con_.loc[con_["NIM/F&B"] == "F&B",['VtasUnidadesEQ']].apply(f)

And the output was like that:

enter image description here

As you can see is not the final goal for two reasons:

  1. I need thas sum all the previous 12 months, not 12 rows
  2. It is sum all the values in the rolling without considering the condition

Could anybody help me to create correctly the code.

Thanks in advance!

Upvotes: 0

Views: 50

Answers (1)

keramat
keramat

Reputation: 4543

It turns out that the rolling sends the numerical columns one by one to the function, so you can't access them in the apply function together to filter. So, just filter before or after the rolling, base on your need. Use:

import pandas as pd
df = pd.DataFrame({'Era':pd.date_range('2016','2022', 6), 'NIM/F':['NIM', 'NIM', 'NIM', 'F', 'F', 'F'], 'val':range(6)})
df2 = df[df['NIM/F'] == 'F']
df2.rolling(2).sum()

Output:

enter image description here

For the month question, set the date as index then use:

df.rolling('365D').sum()

Upvotes: 1

Related Questions