Fernanda F.

Reputation: 47

How to dynamically create columns based on multiple conditions

So I'm having the following problem:

I have a dataframe like the one below, where time_diff_float is the time difference in minutes between each row and the row above. So, for example, I had value = 4 twenty minutes after value = 1.

value | time_diff_float
1       NaN
4       20
3       13
2       55
5       08
7       15

First, I have to check whether the time difference between two rows is < 60 (one hour) and, if so, create a column using the formula rem = value (from the row above) * lambda ** (time difference between the 2 rows). My lambda is a constant with the value 0.97.
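To make the rule concrete, here is a minimal sketch of the one-row case exactly as described in the sentence above, using the sample data. The names within_hour and rem are only illustrative, and note that the code further down subtracts 1 from the exponent, which this sketch does not do.

```python
import pandas as pd

lambda_ = 0.97  # decay constant from the question

# Sample data from the question; the first row has no row above it
df = pd.DataFrame({
    "value": [1, 4, 3, 2, 5, 7],
    "time_diff_float": [float("nan"), 20, 13, 55, 8, 15],
})

# True where the gap to the row above is under one hour (NaN compares as False)
within_hour = df["time_diff_float"] < 60

# Previous row's value decayed by lambda_ for every minute elapsed;
# rows that fail the condition are left as NaN
rem = (df["value"].shift() * lambda_ ** df["time_diff_float"]).where(within_hour)
```

For the second row this gives 1 * 0.97 ** 20, and the first row stays NaN because it has nothing above it.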

Then, if the time difference between each row and the row 2 rows above is still less than 60, I have to do the same thing comparing each row with the row 2 rows above. After that, I have to do the same thing with the row 3 rows above, and so on.
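The gap between a row and the row k rows above is the sum of the last k time differences. As a rough sketch (the helper name gap_to_k_rows_above is made up for illustration), a rolling sum gives that gap directly:

```python
import pandas as pd

# time_diff_float values from the sample data
diffs = pd.Series([float("nan"), 20, 13, 55, 8, 15])

def gap_to_k_rows_above(diffs, k):
    """Minutes between each row and the row k rows above it.

    A rolling window of size k sums the last k differences; windows
    that are incomplete or contain the leading NaN come out as NaN.
    """
    return diffs.rolling(k).sum()

gap2 = gap_to_k_rows_above(diffs, 2)
still_within_hour = gap2 < 60  # NaN compares as False
```

With the sample data, only the third row (20 + 13 = 33) and the last row (8 + 15 = 23) are within an hour of the row 2 rows above.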

To do that I wrote the following code:

df.loc[df['time_diff_float'] < 60, 'rem_1'] = df['value'].shift() * (lambda_ ** (df['time_diff_float'] - 1))
df.loc[df['time_diff_float'] + df['time_diff_float'].shift() < 60, 'rem_2'] = df['value'].shift(2) * (lambda_ ** (df['time_diff_float'] + df['time_diff_float'].shift() - 1))
df.loc[df['time_diff_float'] + df['time_diff_float'].shift() + df['time_diff_float'].shift(2) < 60, 'rem_3'] = df['value'].shift(3) * (lambda_ ** (df['time_diff_float'] + df['time_diff_float'].shift() + df['time_diff_float'].shift(2) - 1))

My question is: since I have to repeat this at least 10 times (possibly more) with the real values I have, is there a way to create the "rem" columns dynamically?

Thanks in advance!

Upvotes: 0

Views: 501

Answers (1)

Bruno Mello

Reputation: 4618

You can save the cumulative time difference in a variable (called mask here) and update it on each iteration of the loop:

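n = 3  # number of rem_ columns to create
# Copy so the in-place += below does not modify df['time_diff_float']
mask = df['time_diff_float'].copy()
for i in range(1, n + 1):
    if i > 1:
        # Extend the cumulative time difference by one more row back
        mask += df['time_diff_float'].shift(i - 1)
    # Compare against 60 to get a boolean row selector
    df.loc[mask < 60, 'rem_' + str(i)] = df['value'].shift(i) * (lambda_ ** (mask - 1))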
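If you want something you can run directly against the sample data, here is one possible self-contained sketch of the same idea. The function name add_rem_columns and the limit parameter are my own, and the exponent keeps the - 1 from the code in the question; drop it if your formula is really lambda ** time difference.

```python
import pandas as pd

def add_rem_columns(df, n, lambda_=0.97, limit=60):
    """Return a copy of df with columns rem_1 .. rem_n added."""
    out = df.copy()
    # Running sum of the gaps between each row and the row i rows above
    cum_diff = pd.Series(0.0, index=out.index)
    for i in range(1, n + 1):
        # Add the gap one more row back (shift(0) is the column itself)
        cum_diff = cum_diff + out["time_diff_float"].shift(i - 1)
        rem = out["value"].shift(i) * lambda_ ** (cum_diff - 1)
        # Keep the value only where the cumulative gap is under the limit
        out[f"rem_{i}"] = rem.where(cum_diff < limit)
    return out

# Sample data from the question
df = pd.DataFrame({
    "value": [1, 4, 3, 2, 5, 7],
    "time_diff_float": [float("nan"), 20, 13, 55, 8, 15],
})
result = add_rem_columns(df, n=3)
```

With this data, rem_3 ends up entirely NaN, because no row is within 60 minutes of the row 3 rows above it. Returning a copy instead of writing into df keeps the input unchanged, so you can call the function again with a different n.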

Upvotes: 1
