Reputation: 5152
I have the following DataFrame:
import numpy as np
import pandas as pd

example = pd.DataFrame({"dirr": [1, 0, -1, -1, 1, -1, 0],
                        "value": [125, 130, 80, 8, 150, 251, 18],
                        "result": [np.nan for _ in range(7)]})
I would like to perform the following operation with cummin() and cummax() on it:
example["result"].apply(lambda x : x= example["value"].cummax() if example["dirr"]==1
else x= example["value"].cummin() if example["dirr"]==-1
else x= NaN if if example["dirr"]==0
)
This returns: error: invalid syntax.
Could anyone help me straighten this out?
This is the intended output:
example = pd.DataFrame({"dirr": [1, 0, -1, -1, 1, -1, 0],
                        "value": [125, 130, 80, 8, 150, 251, 18],
                        "result": [125, np.nan, 80, 8, 150, 8, np.nan]})
EDIT:
So as per the answer of @su79eu7k the following function would do:
def calc(x):
    if x['dirr'] == 1:
        return np.diag(example["value"].cummax())
    elif x['dirr'] == -1:
        return np.diag(example["value"].cummin())
    else:
        return np.nan
I should be able to shove that into a lambda, but I'm still blocked by the syntax error, which I still can't see:
example["result"]=example.apply(lambda x : np.diag(x["value"].cummax()) if x["dirr"]==1
else np.diag(x["value"].cummin()) if x["dirr"]==-1
else NaN if x["dirr"]==0
)
A final little nudge from you guys would be hugely appreciated.
Upvotes: 6
Views: 5962
Reputation: 294218
All numpy
v = example.value.values
d = example.dirr.values
mx = np.maximum.accumulate(v)
mn = np.minimum.accumulate(v)
example['result'] = np.where(d == 1, mx, np.where(d == -1, mn, np.nan))
example
   dirr  result  value
0     1   125.0    125
1     0     NaN    130
2    -1    80.0     80
3    -1     8.0      8
4     1   150.0    150
5    -1     8.0    251
6     0     NaN     18
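The nested np.where can also be written with np.select, which some find easier to read once there are more than two conditions. This is only a sketch of the same idea, not a claim that it is faster; the names running_max and running_min are mine, and the "result" column is created here rather than pre-filled.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    "dirr": [1, 0, -1, -1, 1, -1, 0],
    "value": [125, 130, 80, 8, 150, 251, 18],
})

v = example["value"].to_numpy()
d = example["dirr"].to_numpy()

# Running max / min over the whole column, regardless of dirr.
running_max = np.maximum.accumulate(v)
running_min = np.minimum.accumulate(v)

# Pick the running max where dirr == 1, the running min where dirr == -1,
# and NaN everywhere else (the default).
example["result"] = np.select(
    [d == 1, d == -1],
    [running_max, running_min],
    default=np.nan,
)
```

Conditions are checked in order and the first match wins, so the order of the two lists matters only if the conditions can overlap, which they cannot here.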
Upvotes: 1
Reputation: 2544
I think it makes the most sense to use separate lines instead of an apply. If you do use the apply function, you should create a separate function and pass it through rather than making a three-line lambda.
example.loc[example['dirr'] == 1, 'result'] = \
    example.loc[example['dirr'] == 1, 'value'].cummax()
example.loc[example['dirr'] == -1, 'result'] = \
    example.loc[example['dirr'] == -1, 'value'].cummin()
>>> example
   dirr  result  value
0     1   125.0    125
1     0     NaN    130
2    -1    80.0     80
3    -1     8.0      8
4     1   150.0    150
5    -1     8.0    251
6     0     NaN     18
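One thing worth checking before picking this approach: filtering first and then calling cummax() computes the running max only over the rows where dirr == 1, whereas computing cummax() on the whole column and then selecting rows takes every earlier value into account. On the question's data the two happen to agree, but they can differ. A small sketch with made-up data (the frame demo is hypothetical, not from the question):

```python
import pandas as pd

demo = pd.DataFrame({"dirr": [1, -1, 1], "value": [5, 9, 3]})

# Running max over the whole column, then keep the rows where dirr == 1.
global_max = demo["value"].cummax()[demo["dirr"] == 1]

# Running max computed only over the rows where dirr == 1.
filtered_max = demo.loc[demo["dirr"] == 1, "value"].cummax()

print(global_max.tolist())    # [5, 9]
print(filtered_max.tolist())  # [5, 5]
```

Which of the two is wanted depends on whether rows with a different dirr should influence the running extreme; the question's expected output does not distinguish between them.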
Alternate apply approach below.
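# Start from infinities so the first value seen always replaces the sentinel
# (0 and 9999 would give wrong results for negative or very large values).
current_max = float("-inf")
current_min = float("inf")

def func(df):
    # With axis=1, df is a single row of the DataFrame.
    global current_max
    global current_min
    if df['dirr'] == 1:
        current_max = max(current_max, df['value'])
        return current_max
    elif df['dirr'] == -1:
        current_min = min(current_min, df['value'])
        return current_min
    else:
        return np.nan

example['result'] = example.apply(func, axis=1)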
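A caveat with module-level globals is that they keep their values between runs, so applying the function a second time continues from the old extremes. One possible way around that, sketched here under my own naming (make_tracker and track are hypothetical helpers, not part of pandas), is to keep the state in a closure and build a fresh tracker for every apply call:

```python
import numpy as np
import pandas as pd

def make_tracker():
    """Return a row function with its own private running max / min."""
    current_max = float("-inf")
    current_min = float("inf")

    def track(row):
        nonlocal current_max, current_min
        if row["dirr"] == 1:
            current_max = max(current_max, row["value"])
            return current_max
        if row["dirr"] == -1:
            current_min = min(current_min, row["value"])
            return current_min
        return np.nan

    return track

example = pd.DataFrame({
    "dirr": [1, 0, -1, -1, 1, -1, 0],
    "value": [125, 130, 80, 8, 150, 251, 18],
})

# Each call to make_tracker() starts from clean state.
example["result"] = example.apply(make_tracker(), axis=1)
```

Like the global version, this tracks the max only across dirr == 1 rows and the min only across dirr == -1 rows.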
Upvotes: 3
Reputation: 7306
I think @3novak's solution is simple and fast. But if you really want to use the apply function,
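def calc(x):
    if x['dirr'] == 1:
        return example["value"].cummax()
    elif x['dirr'] == -1:
        return example["value"].cummin()
    else:
        return np.nan

# apply builds one row of cumulative values per input row;
# the diagonal picks out the entry that belongs to each row.
example['result'] = np.diag(example.apply(calc, axis=1))
print(example)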
   dirr  result  value
0     1   125.0    125
1     0     NaN    130
2    -1    80.0     80
3    -1     8.0      8
4     1   150.0    150
5    -1     8.0    251
6     0     NaN     18
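As for the lambda asked about in the question's edit, two things make that expression invalid regardless of the logic. A lambda body has to be a single expression, so an assignment such as x = ... is not allowed inside it. And a conditional expression of the form a if cond else b must end with an else branch, so a trailing "else NaN if ..." with nothing after it cannot parse. Row-wise apply also needs axis=1. A minimal sketch, assuming the cumulative series are computed once up front (the names cmax and cmin are mine):

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    "dirr": [1, 0, -1, -1, 1, -1, 0],
    "value": [125, 130, 80, 8, 150, 251, 18],
})

# Compute the running extremes once instead of once per row.
cmax = example["value"].cummax()
cmin = example["value"].cummin()

# row.name is the row's index label, used to look up the matching entry.
example["result"] = example.apply(
    lambda row: cmax.loc[row.name] if row["dirr"] == 1
    else cmin.loc[row.name] if row["dirr"] == -1
    else np.nan,
    axis=1,
)
```

Looking up a single value by label avoids the np.diag step entirely, since each row returns a scalar rather than a whole series.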
Upvotes: 2