GioC
GioC

Reputation: 45

How do I apply a function to transform a numeric variable based on the cumulative count of a string variable?

I am trying to transform a numerical variable based on the weekly count of a string variable using this basic function.

import numpy as np
import pandas as pd

def adstock(grps, rate):
    adstock = np.zeros(len(grps))
    adstock[0] = grps[0]
    for i in range(1, len(grps)):
        adstock[i] = grps[i] + rate * adstock[i - 1]
    return adstock

I.e., for each creative, I need the transformation to only affect the subsequent weeks and not the first week.

df = pd.DataFrame(
    {
        "Creative": [
            "Phone",
            "Phone",
            "Phone",
            "Yoga",
            "Yoga",
            "Yoga",
            "Yoga",
            "Grass",
            "Grass",
            "Grass",
            "Grass",
        ],
        "airing_week": [
            "2015-12-28",
            "2016-01-04",
            "2016-01-11",
            "2018-01-01",
            "2018-01-08",
            "2018-01-15",
            "2018-01-22",
            "2022-02-28",
            "2022-03-07",
            "2022-03-14",
            "2022-03-21",
        ],
        "week_num": [1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4],
        "grps": [
            38.5,
            67.13,
            50.15,
            43.11,
            28.61,
            9.04,
            4.02,
            28.83,
            28.81,
            36.91,
            37.79,
        ],
    }
)
df['adstock_grps']=adstock(df['grps'], .5)

df

Output:

Creative  airing_week week  grps    adstock_grps
0   Phone   2015-12-28  1   38.50   38.500000
1   Phone   2016-01-04  2   67.13   86.380000
2   Phone   2016-01-11  3   50.15   93.340000
3   Yoga    2018-01-01  1   43.11   89.780000
4   Yoga    2018-01-08  2   28.61   73.500000
5   Yoga    2018-01-15  3   9.04    45.790000
6   Yoga    2018-01-22  4   4.02    26.915000
7   Grass   2022-02-28  1   28.83   42.287500
8   Grass   2022-03-07  2   28.81   49.953750
9   Grass   2022-03-14  3   36.91   61.886875
10  Grass   2022-03-21  4   37.79   68.733437

But, as you see, the transformation is incorrect as it is linear and based on the index. I need the transformation to occur for each creative after the first week and not carry over to the other creatives. Many thanks!

Upvotes: 1

Views: 155

Answers (1)

Laurent
Laurent

Reputation: 13488

With the dataframe you provided, here is one way to do it by chunking it per Creative values, apply adstock, and concatenate chunks back:

df = pd.concat(
    [
        df.loc[df["Creative"] == creative, :]
        .reset_index(drop=True)
        .pipe(lambda df_: df_.assign(adstock_grps=adstock(df_["grps"], 0.5)))
        for creative in df["Creative"].unique()
    ]
).reset_index(drop=True)

print(df)
# Output
   Creative airing_week  week_num   grps  adstock_grps
0     Phone  2015-12-28         1  38.50      38.50000
1     Phone  2016-01-04         2  67.13      86.38000
2     Phone  2016-01-11         3  50.15      93.34000
3      Yoga  2018-01-01         1  43.11      43.11000
4      Yoga  2018-01-08         2  28.61      50.16500
5      Yoga  2018-01-15         3   9.04      34.12250
6      Yoga  2018-01-22         4   4.02      21.08125
7     Grass  2022-02-28         1  28.83      28.83000
8     Grass  2022-03-07         2  28.81      43.22500
9     Grass  2022-03-14         3  36.91      58.52250
10    Grass  2022-03-21         4  37.79      67.05125

Upvotes: 1

Related Questions