Reputation: 153
I currently have a dataframe that contains a column called load, and I want to create a column called calculated load that applies a simple formula to the load column and a variable. However, I want the calculation to switch the variable it uses when it sees the value 1 in a column called position, and to keep using that variable until it sees -1 in position, when the values start to rise again. Here is my current code:
import pandas as pd
s_falling = -4
s_rising = 2
x = 2
df = pd.DataFrame({"load": [1,2,4,6,2,4,7,4,8,3,4,7,3,3,6,4,7,4,3,2],
"position": [0,0.2,0.5,0.8,0.7,1,0.7,0.6,0.7,0.8,0.4,0.2,0,-0.5,-0.8,-1,-0.8,-0.9,-0.7,-0.6]})
df['calculated load'] = df['load'] + x * s_rising
print(df['calculated load'])
0 5
1 6
2 8
3 10
4 6
5 8
6 11
7 8
8 12
9 7
10 8
11 11
12 7
13 7
14 10
15 8
16 11
17 8
18 7
19 6
This works up to the row where position reaches 1. From the row after that, when the values start falling, I want to use a formula that swaps s_rising for s_falling and keeps using this new variable as it iterates down the column. It should then revert to the original formula, using s_rising, from the row after -1 is seen:
df['calculated load'] = df['load'] + x *s_falling
The formula doesn't change, merely the variable being used within it.
I can't just check if the value after is less than or more than the previous value, as the values in position don't rise and fall perfectly. Ideally, this would be my desired output:
print(df['calculated load'])
0 5
1 6
2 8
3 10
4 6
5 8
6 3
7 0
8 4
9 -1
10 0
11 3
12 -1
13 -1
14 2
15 0
16 11
17 8
18 7
19 8
EDIT: Some very kind people have offered solutions, and I have realised that my question (designed to produce a small, reproducible example) was slightly off the mark. I have edited the question to reflect this.
Upvotes: 0
Views: 71
Reputation: 916
Check this and let me know if it works.
import pandas as pd

x = 2
df = pd.DataFrame({"load": [1,2,4,6,2,4,7,4,8,3,4,7,3,3,6,4,7,4,3,2],
                   "position": [0,0.2,0.5,0.8,0.7,1,0.7,0.6,0.7,0.8,0.4,0.2,0,-0.5,-0.8,-1,-0.8,-0.9,-0.7,-0.6]})
# Subtract x from the first row where position == 1 up to (but not including)
# the first row where position == -1; add x everywhere else.
for i, row in df.iterrows():
    if df[df['position']==1.0].index[0]<=i<df[df['position']==-1.0].index[0]:
        df.loc[i, 'calculated load'] = df.loc[i, 'load'] - x
    else:
        df.loc[i, 'calculated load'] = df.loc[i, 'load'] + x
print(df)
load position calculated load
0 1 0.0 3.0
1 2 0.2 4.0
2 4 0.5 6.0
3 6 0.8 8.0
4 2 0.7 4.0
5 4 1.0 2.0
6 7 0.7 5.0
7 4 0.6 2.0
8 8 0.7 6.0
9 3 0.8 1.0
10 4 0.4 2.0
11 7 0.2 5.0
12 3 0.0 1.0
13 3 -0.5 1.0
14 6 -0.8 4.0
15 4 -1.0 6.0
16 7 -0.8 9.0
17 4 -0.9 6.0
18 3 -0.7 5.0
19 2 -0.6 4.0
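The loop above looks up the two boundary rows again on every iteration. A minimal sketch of the same rule without the loop, assuming (as the loop does) that position hits 1 once and then -1 once, could compute the boundaries a single time and build a boolean mask. The names start, end and falling are only illustrative:

```python
import numpy as np
import pandas as pd

x = 2
df = pd.DataFrame({
    "load": [1, 2, 4, 6, 2, 4, 7, 4, 8, 3, 4, 7, 3, 3, 6, 4, 7, 4, 3, 2],
    "position": [0, 0.2, 0.5, 0.8, 0.7, 1, 0.7, 0.6, 0.7, 0.8,
                 0.4, 0.2, 0, -0.5, -0.8, -1, -0.8, -0.9, -0.7, -0.6],
})

# Index label of the first row with position == 1 and of the first with position == -1.
start = df.index[df["position"].eq(1)][0]
end = df.index[df["position"].eq(-1)][0]

# True for rows from `start` up to, but not including, `end`.
falling = (df.index >= start) & (df.index < end)

# Subtract x on the falling rows, add x everywhere else.
df["calculated load"] = np.where(falling, df["load"] - x, df["load"] + x)
print(df)
```

Unlike the loop, this keeps the new column as integers, because the column is created in one assignment rather than filled row by row.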
Upvotes: 1
Reputation: 120399
Without a loop, you can use:
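import numpy as np

# x1: -x starting on the row after position == 1; x2: +x on the row where position == -1
x1 = df['position'].eq(1).mul(-x).shift(fill_value=0)
x2 = df['position'].eq(-1).mul(x)
xm = (x1 | x2).replace(0, np.nan).ffill().fillna(x).astype(int)
df['calculated load'] = df['load'] + xm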
Output:
>>> df
load position calculated load
0 1 0.0 3
1 2 0.2 4
2 4 0.5 6
3 6 0.8 8
4 2 0.7 4
5 4 1.0 6
6 7 0.7 5
7 4 0.6 2
8 8 0.7 6
9 3 0.8 1
10 4 0.4 2
11 7 0.2 5
12 3 0.0 1
13 3 -0.5 1
14 6 -0.8 4
15 4 -1.0 6
16 7 -0.8 9
17 4 -0.9 6
18 3 -0.7 5
19 2 -0.6 4
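For the edited question, where the formula is load + x * s_rising or load + x * s_falling and the change takes effect on the row after each trigger, the same shift-and-forward-fill idea might be sketched as follows. This is only a sketch under those assumptions; the names trigger and slope are illustrative and not from the question:

```python
import numpy as np
import pandas as pd

s_falling = -4
s_rising = 2
x = 2
df = pd.DataFrame({
    "load": [1, 2, 4, 6, 2, 4, 7, 4, 8, 3, 4, 7, 3, 3, 6, 4, 7, 4, 3, 2],
    "position": [0, 0.2, 0.5, 0.8, 0.7, 1, 0.7, 0.6, 0.7, 0.8,
                 0.4, 0.2, 0, -0.5, -0.8, -1, -0.8, -0.9, -0.7, -0.6],
})

# NaN everywhere except the trigger rows:
# position == 1 selects s_falling, position == -1 selects s_rising.
trigger = pd.Series(np.nan, index=df.index)
trigger = (trigger.mask(df["position"].eq(1), s_falling)
                  .mask(df["position"].eq(-1), s_rising))

# Shift by one row so the change applies from the row AFTER each trigger,
# carry the last selection forward, and start with s_rising.
slope = trigger.shift().ffill().fillna(s_rising)

df["calculated load"] = (df["load"] + x * slope).astype(int)
print(df)
```

This applies the posted formula literally, so falling rows are computed as load + x * s_falling. If the intended falling values differ, only the last expression needs to change.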
Upvotes: 1
Reputation: 648
I believe this code works, but it's not efficient because of iterrows(). If someone finds a way to vectorize it, feel free to comment on my answer.
import pandas as pd

x = 2
df = pd.DataFrame({"load": [1,2,4,6,2,4,7,4,8,3,4,7,3,3,6,4,7,4,3,2],
                   "position": [0,0.2,0.5,0.8,0.7,1,0.7,0.6,0.7,0.8,0.4,0.2,0,-0.5,-0.8,-1,-0.8,-0.9,-0.7,-0.6]})

# Track whether we are in the rising or the falling phase, row by row
increasing = True
list_increasing = []
for index, row in df.iterrows():
    if increasing and row.position == 1:
        increasing = False
    elif not increasing and row.position == -1:
        increasing = True
    list_increasing.append(increasing)
df['increasing'] = list_increasing

def calculated_load(row):
    if row.increasing:
        return row.load + x
    else:
        return row.load - x

df['cal load'] = df.apply(calculated_load, axis=1)
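One possible way to vectorize the flag: the state after any row depends only on the most recent trigger (1 means falling, -1 means rising), so it can be obtained by marking the trigger rows and forward-filling. The sketch below assumes the same semantics as the loop, where the switch applies on the trigger row itself; the name events is illustrative:

```python
import numpy as np
import pandas as pd

x = 2
df = pd.DataFrame({
    "load": [1, 2, 4, 6, 2, 4, 7, 4, 8, 3, 4, 7, 3, 3, 6, 4, 7, 4, 3, 2],
    "position": [0, 0.2, 0.5, 0.8, 0.7, 1, 0.7, 0.6, 0.7, 0.8,
                 0.4, 0.2, 0, -0.5, -0.8, -1, -0.8, -0.9, -0.7, -0.6],
})

# 0.0 where position == 1 (switch to falling), 1.0 where position == -1
# (switch to rising), NaN on every other row.
events = pd.Series(
    np.select([df["position"].eq(1), df["position"].eq(-1)], [0.0, 1.0], default=np.nan),
    index=df.index,
)

# Carry the latest event forward; rows before the first event count as rising.
df["increasing"] = events.ffill().fillna(1.0).astype(bool)

# Add x while rising, subtract x while falling.
df["cal load"] = np.where(df["increasing"], df["load"] + x, df["load"] - x)
print(df)
```

Repeated triggers of the same kind leave the state unchanged here, just as they do in the loop.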
Upvotes: 1