Reputation: 332
I am trying to build a function that uses .shift() but it is giving me an error. Consider this:
In [40]:
import pandas as pd
from pandas import DataFrame
data = {'level1': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31],
        'level2': [10, 10, 20, 20, 20, 10, 10, 20, 20, 10, 10]}
index = pd.date_range('12/1/2014', periods=11)
frame = DataFrame(data, index=index)
frame
Out[40]:
            level1  level2
2014-12-01      20      10
2014-12-02      19      10
2014-12-03      20      20
2014-12-04      21      20
2014-12-05      25      20
2014-12-06      29      10
2014-12-07      30      10
2014-12-08      31      20
2014-12-09      30      20
2014-12-10      29      10
2014-12-11      31      10
A normal function works fine. To demonstrate I calculate the same result twice, using the direct and function approach:
In [63]:
frame['horizontaladd1'] = frame['level1'] + frame['level2']  # works
def horizontaladd(x):
    test = x['level1'] + x['level2']
    return test
frame['horizontaladd2'] = frame.apply(horizontaladd, axis=1)
frame
Out[63]:
            level1  level2  horizontaladd1  horizontaladd2
2014-12-01      20      10              30              30
2014-12-02      19      10              29              29
2014-12-03      20      20              40              40
2014-12-04      21      20              41              41
2014-12-05      25      20              45              45
2014-12-06      29      10              39              39
2014-12-07      30      10              40              40
2014-12-08      31      20              51              51
2014-12-09      30      20              50              50
2014-12-10      29      10              39              39
2014-12-11      31      10              41              41
But while calling shift directly works, it fails inside a function:
frame['verticaladd1'] = frame['level1'] + frame['level1'].shift(1)  # works
def verticaladd(x):
    test = x['level1'] + x['level1'].shift(1)
    return test
frame.apply(verticaladd)  # error
results in
KeyError: ('level1', u'occurred at index level1')
I also tried applying to a single column which makes more sense in my mind, but no luck:
def verticaladd2(x):
    test = x - x.shift(1)
    return test
frame['level1'].map(verticaladd2)  # error, also with apply
error:
AttributeError: 'numpy.int64' object has no attribute 'shift'
Why not call shift directly? I need to embed it in a function so I can calculate multiple columns at the same time, along axis 1. See the related question "Ambiguous truth value with boolean logic".
Upvotes: 3
Views: 14526
Reputation: 83
Check whether the values you are trying to shift are a NumPy array rather than a Series. If they are, convert the array to a Series first; then you will be able to shift the values. I was having the same issue, and now I am able to get the shifted values.
Here is the relevant part of my code for reference:
X = grouped['Confirmed_day'].values
X_series=pd.Series(X)
X_lag1 = X_series.shift(1)
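To illustrate the same idea on its own, here is a minimal sketch. The array `values` is made-up data standing in for `grouped['Confirmed_day'].values`, which is not shown in full above. Note that wrapping an array with `pd.Series` gives it a fresh integer index; if the original index matters, it would have to be passed in via `index=`.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for grouped['Confirmed_day'].values
values = np.array([20, 19, 20, 21, 25])

# A plain ndarray has no .shift() method
array_has_shift = hasattr(values, "shift")

# Wrapping the array in a Series makes .shift() available
series = pd.Series(values)
lagged = series.shift(1)  # first entry becomes NaN, so the dtype becomes float64
```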
Upvotes: 2
Reputation: 12801
Try passing the frame to the function, rather than using apply (I am not sure why apply doesn't work, even column-wise):
def f(x):
    return x.level1 + x.level1.shift(1)
f(frame)
returns:
2014-12-01 NaN
2014-12-02 39
2014-12-03 39
2014-12-04 41
2014-12-05 46
2014-12-06 54
2014-12-07 59
2014-12-08 61
2014-12-09 61
2014-12-10 59
2014-12-11 60
Freq: D, Name: level1, dtype: float64
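As for why apply fails in the question: with the default axis=0, apply hands the function one column at a time as a Series indexed by the dates, so x['level1'] looks for a row label called 'level1' and raises the KeyError. A sketch of a column-wise version that does work, reusing the question's data (the name `vertical_add` is just for illustration):

```python
import pandas as pd

data = {'level1': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31],
        'level2': [10, 10, 20, 20, 20, 10, 10, 20, 20, 10, 10]}
frame = pd.DataFrame(data, index=pd.date_range('12/1/2014', periods=11))

def vertical_add(column):
    # column is a whole Series here, so .shift() is available;
    # no column lookup is needed because apply already selected the column
    return column + column.shift(1)

# Default axis=0: the function is called once per column
per_column = frame.apply(vertical_add)

# The same result without apply, shifting every column at once
direct = frame + frame.shift(1)
```

Both forms handle all columns in one go, which may be enough if the goal is to compute several columns at the same time.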
Upvotes: 3
Reputation: 1835
I'm not entirely following along, but if frame['level1'].shift(1) works, then I can only imagine that frame['level1'] is not a numpy.int64 object while whatever you are passing into the verticaladd function is. You probably need to look at your types.
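One way to look at the types is to check what each call actually hands to the function. This is a small sketch on a shortened version of the question's frame; the variable names are only for illustration:

```python
import pandas as pd

frame = pd.DataFrame({'level1': [20, 19, 20], 'level2': [10, 10, 20]},
                     index=pd.date_range('12/1/2014', periods=3))

# Series.map passes one scalar value at a time; scalars have no .shift()
scalar_has_shift = frame['level1'].map(lambda v: hasattr(v, 'shift'))

# DataFrame.apply with the default axis=0 passes each column as a Series
column_has_shift = frame.apply(lambda col: hasattr(col, 'shift'))

# With axis=1 each row is a Series, but row['level1'] is again a single scalar
row_value_has_shift = frame.apply(
    lambda row: hasattr(row['level1'], 'shift'), axis=1)
```

So shift is only available when the function receives a whole column (or the whole frame), not when it receives individual values.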
Upvotes: -1