Reputation: 168
I have the following code in Python:
import numpy as np
import pandas as pd
colum1 = [1,2,3,4,5,6,7,8,9,10,11,12]
colum2 = [10,20,30,40,50,60,70,80,90,100,110,120]
df = pd.DataFrame({
    'colum1': colum1,
    'colum2': colum2,
})
df.loc[df.colum1 == 1,'result'] = df['colum2']
# Each pass fills in one more row, since every value depends on the row above it
for i in range(len(colum2)):
    df.result = np.where(df.colum1 > 1, 5 - (df['colum2'] - df.result.shift(1)), df.result)
The resulting DataFrame is:
    colum1  colum2  result
0        1      10    10.0
1        2      20    -5.0
2        3      30   -30.0
3        4      40   -65.0
4        5      50  -110.0
5        6      60  -165.0
6        7      70  -230.0
7        8      80  -305.0
8        9      90  -390.0
9       10     100  -485.0
10      11     110  -590.0
11      12     120  -705.0
I would like to know whether there is a way to obtain the same result without using a for loop.
Upvotes: 2
Views: 96
Reputation: 51175
Your operation depends on two things: the previous row in the DataFrame, and the difference between consecutive values in the DataFrame. That hints that the solution will require shift and diff. However, you also want to add a small constant (5) to the expanding sum, and to subtract that sum from each row rather than add it.
To set up the pieces of the problem, first create your shifted series, where you add 5:
a = df.colum2.shift().add(5).cumsum().fillna(0)
Now you need the difference between consecutive elements in the Series, filling the missing first result with the corresponding value in colum2:
b = df.colum2.diff().fillna(df.colum2)
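To make the two building blocks easier to follow, here is a small sketch that prints their intermediate values on the sample data. It uses a standalone Series named colum2 instead of the DataFrame column, and the names shifted and plus_five are only there for illustration.

```python
import pandas as pd

colum2 = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120], name='colum2')

# Building a: shift down one row, add the constant, then take the running sum
shifted = colum2.shift()            # NaN, 10, 20, 30, ...
plus_five = shifted.add(5)          # NaN, 15, 25, 35, ...
a = plus_five.cumsum().fillna(0)    # 0, 15, 40, 75, ...

# Building b: consecutive differences, with the first (NaN) entry taken from colum2
b = colum2.diff().fillna(colum2)    # 10, 10, 10, ...

print(a.head(4).tolist())  # [0.0, 15.0, 40.0, 75.0]
print(b.head(4).tolist())  # [10.0, 10.0, 10.0, 10.0]
```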
To get your final result, simply subtract a from b:
b - a
0 10.0
1 -5.0
2 -30.0
3 -65.0
4 -110.0
5 -165.0
6 -230.0
7 -305.0
8 -390.0
9 -485.0
10 -590.0
11 -705.0
Name: colum2, dtype: float64
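As a sanity check, the sketch below compares a vectorized version against the row-by-row recurrence from the question. It rests on one assumption that is not stated in the thread: the first row is the only seed (colum1 == 1), so the loop unrolls to the first value of colum2 followed by a running sum of 5 - colum2. The helper names loop_result and cumsum_result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'colum1': range(1, 13),
    'colum2': range(10, 130, 10),
})

def loop_result(frame, const=5):
    """Reference: result[i] = const - (colum2[i] - result[i-1]), seeded with colum2[0]."""
    out = [float(frame['colum2'].iloc[0])]
    for value in frame['colum2'].iloc[1:]:
        out.append(const - (value - out[-1]))
    return pd.Series(out, index=frame.index)

def cumsum_result(frame, const=5):
    """Unrolled recurrence (assumes row 0 is the only seed row)."""
    step = (const - frame['colum2']).astype(float)
    step.iloc[0] = frame['colum2'].iloc[0]   # the seed replaces the first step
    return step.cumsum()

expected = loop_result(df)
vectorized = cumsum_result(df)
print(vectorized.equals(expected))
```

On the sample data this agrees with b - a. The a/b decomposition above was worked out against evenly spaced colum2 values, so for other data it is worth running the same comparison before relying on it.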
Upvotes: 3