Reputation: 69755
I have not too much experience with pandas, and I have the following DataFrame:
month A B
2/28/2017 0.7377573034 0
3/31/2017 0.7594787565 3.7973937824
4/30/2017 0.7508308808 3.7541544041
5/31/2017 0.7038814004 7.0388140044
6/30/2017 0.6920212254 11.0723396061
7/31/2017 0.6801610503 11.5627378556
8/31/2017 0.6683008753 10.6928140044
9/30/2017 0.7075915026 11.3214640415
10/31/2017 0.6989436269 7.6883798964
11/30/2017 0.6259514607 4.3816602247
12/31/2017 0.6119757303 3.671854382
1/31/2018 0.633 3.798
2/28/2018 0.598 4.784
3/31/2018 0.673 5.384
4/30/2018 0.673 1.346
5/31/2018 0.609 0
6/30/2018 0.609 0
7/31/2018 0.609 0
8/31/2018 0.609 0
9/30/2018 0.673 0
10/31/2018 0.673 0
11/30/2018 0.598 0
12/31/2018 0.598 0
I need to compute column C
which basically is column A
times column B
, but the value of column B
is the value of the previous year of the corresponding month. In addition, for values not having the corresponding month in the previous year, this value should be zero. To be more specific, this is what I expect C
to be:
C
0 # these values are zero because the corresponding month in the previous year is not in column A
0
0
0
0
0
0
0
0
0
0
0
0 # 0.598 * 0
2.5556460155552 # 0.673 * 3.7973937824
2.5265459139593 # 0.673 * 3.7541544041
4.2866377286796 # 0.609 * 7.0388140044
6.7430548201149 # 0.609 * 11.0723396061
7.0417073540604 # 0.609 * 11.5627378556
6.5119237286796 # 0.609 * 10.6928140044
7.6193452999295 # 0.673 * 11.3214640415
5.1742796702772 # 0.673 * 7.6883798964
2.6202328143706 # 0.598 * 4.3816602247
2.195768920436 # 0.598 * 3.671854382
How can I achieve this? I am sure there might be a way to do it not using a for loop. Thanks in advance.
Upvotes: 1
Views: 45
Reputation: 210832
In [73]: (df.drop('B',1)
...: .merge(df.drop('A',1)
...: .assign(month=df.month + pd.offsets.MonthEnd(12)),
...: on='month', how='left')
...: .eval("C = A * B", inplace=False)
...: .fillna(0)
...: )
...:
Out[73]:
month A B C
0 2017-02-28 0.737757 0.000000 0.000000
1 2017-03-31 0.759479 0.000000 0.000000
2 2017-04-30 0.750831 0.000000 0.000000
3 2017-05-31 0.703881 0.000000 0.000000
4 2017-06-30 0.692021 0.000000 0.000000
5 2017-07-31 0.680161 0.000000 0.000000
6 2017-08-31 0.668301 0.000000 0.000000
7 2017-09-30 0.707592 0.000000 0.000000
8 2017-10-31 0.698944 0.000000 0.000000
9 2017-11-30 0.625951 0.000000 0.000000
10 2017-12-31 0.611976 0.000000 0.000000
11 2018-01-31 0.633000 0.000000 0.000000
12 2018-02-28 0.598000 0.000000 0.000000
13 2018-03-31 0.673000 3.797394 2.555646
14 2018-04-30 0.673000 3.754154 2.526546
15 2018-05-31 0.609000 7.038814 4.286638
16 2018-06-30 0.609000 11.072340 6.743055
17 2018-07-31 0.609000 11.562738 7.041707
18 2018-08-31 0.609000 10.692814 6.511924
19 2018-09-30 0.673000 11.321464 7.619345
20 2018-10-31 0.673000 7.688380 5.174280
21 2018-11-30 0.598000 4.381660 2.620233
22 2018-12-31 0.598000 3.671854 2.195769
Explanation:
we can generate a helper DF like this (we have added 12 months to month
column and dropped A
column):
In [77]: df.drop('A',1).assign(month=df.month + pd.offsets.MonthEnd(12))
Out[77]:
month B
0 2018-02-28 0.000000
1 2018-03-31 3.797394
2 2018-04-30 3.754154
3 2018-05-31 7.038814
4 2018-06-30 11.072340
5 2018-07-31 11.562738
6 2018-08-31 10.692814
7 2018-09-30 11.321464
8 2018-10-31 7.688380
9 2018-11-30 4.381660
10 2018-12-31 3.671854
11 2019-01-31 3.798000
12 2019-02-28 4.784000
13 2019-03-31 5.384000
14 2019-04-30 1.346000
15 2019-05-31 0.000000
16 2019-06-30 0.000000
17 2019-07-31 0.000000
18 2019-08-31 0.000000
19 2019-09-30 0.000000
20 2019-10-31 0.000000
21 2019-11-30 0.000000
22 2019-12-31 0.000000
now we can merge it with the original DF (we don't need B
column in the original DF):
In [79]: (df.drop('B',1)
...: .merge(df.drop('A',1)
...: .assign(month=df.month + pd.offsets.MonthEnd(12)),
...: on='month', how='left'))
Out[79]:
month A B
0 2017-02-28 0.737757 NaN
1 2017-03-31 0.759479 NaN
2 2017-04-30 0.750831 NaN
3 2017-05-31 0.703881 NaN
4 2017-06-30 0.692021 NaN
5 2017-07-31 0.680161 NaN
6 2017-08-31 0.668301 NaN
7 2017-09-30 0.707592 NaN
8 2017-10-31 0.698944 NaN
9 2017-11-30 0.625951 NaN
10 2017-12-31 0.611976 NaN
11 2018-01-31 0.633000 NaN
12 2018-02-28 0.598000 0.000000
13 2018-03-31 0.673000 3.797394
14 2018-04-30 0.673000 3.754154
15 2018-05-31 0.609000 7.038814
16 2018-06-30 0.609000 11.072340
17 2018-07-31 0.609000 11.562738
18 2018-08-31 0.609000 10.692814
19 2018-09-30 0.673000 11.321464
20 2018-10-31 0.673000 7.688380
21 2018-11-30 0.598000 4.381660
22 2018-12-31 0.598000 3.671854
then using .eval("C = A * B", inplace=False)
we cann generate a new column "on the fly"
Upvotes: 1