Raaid Ahmad

Reputation: 13

How do I calculate week over week changes in Pandas?

I have the following df of values for various slices across time:

    date        A   B   C
0   2016-01-01  5   7   2
1   2016-01-02  6   12  15
...
2   2016-01-08  9   5   16
...
3   2016-12-24  5   11  13
4   2016-12-31  3   52  22

I would like to create a new dataframe that holds the week-over-week (w-w) change in each slice, by date. For example, the new table should be blank for all slices from Jan 1 to Jan 7. The value for Jan 8 should be the Jan 8 value for the given slice minus the Jan 1 value of that slice. The value for Jan 9 should then be the Jan 9 value minus the Jan 2 value of that slice, and so on, all the way down.

The example table would look like this:

    date        A   B   C
0   2016-01-01  0   0   0
1   2016-01-02  0   0   0
...
2   2016-01-08  4   -2  14
...
3   2016-12-24  4   12  2
4   2016-12-31  -2  41  9

You may assume the offset is ALWAYS 7. In other words, there are no missing dates.

Upvotes: 1

Views: 5525

Answers (2)

Brad Solomon

Reputation: 40878

@Unatiel's answer is correct in this case, where there are no missing dates, and should be accepted.

But I wanted to post a modification here for cases with missing dates, for anyone interested. From the docs:

The shift method accepts a freq argument which can accept a DateOffset class or other timedelta-like object or also an offset alias

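from pandas.tseries.offsets import Week

# df must be indexed by datetime values (e.g. after df.set_index('date')).
# Shift every row forward by one calendar week, then realign to the original dates.
res = ((df - df.shift(1, freq=Week()).reindex(df.index))
       .fillna(value=0)
       .astype(int))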

print(res)
             A   B
date              
2016-01-01   0   0
2016-01-02   0   0
2016-01-03   0   0
2016-01-04   0   0
2016-01-05   0   0
2016-01-06   0   0
2016-01-07   0   0
2016-01-08  31  46
2016-01-09   4  20
2016-01-10 -51 -65
2016-01-11  56   5
2016-01-12 -51  24
        ..  ..
2016-01-20  34 -30
2016-01-21 -28  19
2016-01-22  24   8
2016-01-23 -28 -46
2016-01-24 -11 -60
2016-01-25 -34  -7
2016-01-26 -12 -28
2016-01-27 -41  42
2016-01-28  -2  48
2016-01-29  35 -51
2016-01-30  -8  62
2016-01-31  -6  -9
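
To see why the freq argument matters when dates are missing, here is a minimal sketch on made-up data (the column name A and the dropped date are my own assumptions, not the OP's data). Each value equals the day of the month, so a true week-over-week change should always be 7:

```python
import pandas as pd
from pandas.tseries.offsets import Week

# Hypothetical data: Jan 1-10, 2016 with Jan 4 missing; value = day of month.
idx = pd.date_range("2016-01-01", periods=10, freq="D").delete(3)
df = pd.DataFrame({"A": idx.day.to_numpy()}, index=idx)

# Calendar-based shift: compares each date with the date exactly 7 days earlier.
by_freq = df - df.shift(1, freq=Week()).reindex(df.index)

# Positional shift: compares each row with the row 7 positions earlier,
# which is no longer 7 days earlier once a date is missing.
by_pos = df - df.shift(7)

print(by_freq.join(by_pos, lsuffix="_freq", rsuffix="_pos"))
```

With the gap, the positional version has no value for Jan 8 and compares Jan 9 against Jan 1, while the calendar-based version still compares Jan 8 against Jan 1.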

Upvotes: 5

Unatiel

Reputation: 1080

If we know the offset is always 7, then use shift(). Here is a quick example showing how it works:

import pandas

df = pandas.DataFrame({'x': range(30)})
df.shift(7)
       x
0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    0.0
8    1.0
9    2.0
10   3.0
11   4.0
12   5.0
...

So with this you can do:

df - df.shift(7)
      x
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
6   NaN
7   7.0
8   7.0  
...

In your case, call set_index('date') first so that the date column is not part of the subtraction.
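
Putting it together for a frame shaped like the one in the question, a rough sketch could look like this. The values below are made up for illustration (14 consecutive days, no gaps), and diff(7) is used as a shorthand that should be equivalent to subtracting shift(7):

```python
import pandas as pd

# Hypothetical data: 14 consecutive days, no missing dates.
df = pd.DataFrame({
    "date": pd.date_range("2016-01-01", periods=14, freq="D"),
    "A": range(14),
    "B": [2 * i for i in range(14)],
})

# Move the dates into the index so only the value columns are differenced.
indexed = df.set_index("date")

# diff(7) computes each row minus the row 7 positions earlier.
wow = indexed.diff(7)

# Optional: the first week has nothing to compare against, so fill it with 0.
wow = wow.fillna(0).astype(int)
print(wow)
```

Skip the fillna(0) step if you would rather keep the first week blank (NaN) instead of zero.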

Upvotes: 2
