Reputation: 13
I have the following df of values for various slices across time:
date A B C
0 2016-01-01 5 7 2
1 2016-01-02 6 12 15
...
2 2016-01-08 9 5 16
...
3 2016-12-24 5 11 13
4 2016-12-31 3 52 22
I would like to create a new dataframe that calculates the w-w change in each slice, by date. For example, I want the new table to be blank for all slices from jan 1 - jan 7. I want the value of jan 8 to be the jan 8 value for the given slice minus the value of the jan 1 value of that slice. I then want the value of jan 9 to be the jan 9 value for the given slice minus the value of the jan 2 slice. So and so forth, all the way down.
The example table would look like this:
date A B C
0 2016-01-01 0 0 0
1 2016-01-02 0 0 0
...
2 2016-01-08 4 -2 14
...
3 2016-12-24 4 12 2
4 2016-12-31 -2 41 9
You may assume the offset is ALWAYS 7. In other words, there are no missing dates.
Upvotes: 1
Views: 5525
Reputation: 40878
@Unatiel's answer is correct in this case, where there are no missing dates, and should be accepted.
But I wanted to post a modification here for cases with missing dates, for anyone interested. From the docs:
The
shift
method accepts afreq
argument which can accept aDateOffset
class or othertimedelta
-like object or also a offset alias
from pandas.tseries.offsets import Week
res = ((df - df.shift(1, freq=Week()).reindex(df.index))
.fillna(value=0)
.astype(int))
print(res)
A B
date
2016-01-01 0 0
2016-01-02 0 0
2016-01-03 0 0
2016-01-04 0 0
2016-01-05 0 0
2016-01-06 0 0
2016-01-07 0 0
2016-01-08 31 46
2016-01-09 4 20
2016-01-10 -51 -65
2016-01-11 56 5
2016-01-12 -51 24
.. ..
2016-01-20 34 -30
2016-01-21 -28 19
2016-01-22 24 8
2016-01-23 -28 -46
2016-01-24 -11 -60
2016-01-25 -34 -7
2016-01-26 -12 -28
2016-01-27 -41 42
2016-01-28 -2 48
2016-01-29 35 -51
2016-01-30 -8 62
2016-01-31 -6 -9
Upvotes: 5
Reputation: 1080
If we know offset is always 7 then use shift()
, here is a quick example showing how it works :
df = pandas.DataFrame({'x': range(30)})
df.shift(7)
x
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 0.0
8 1.0
9 2.0
10 3.0
11 4.0
12 5.0
...
So with this you can do :
df - df.shift(7)
x
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 7.0
8 7.0
...
In your case, don't forget to set_index('date')
before.
Upvotes: 2