Reputation: 1613
I have two dataframes:
df1 = pd.DataFrame.from_dict({'Date': {1: '2021-01-02 00:00:00',
2: '2021-01-03 00:00:00',
3: '2021-01-04 00:00:00',
4: '2021-01-05 00:00:00',
5: '2021-01-06 00:00:00',
6: '2021-01-07 00:00:00',
7: '2021-01-08 00:00:00',
8: '2021-01-09 00:00:00',
9: '2021-01-10 00:00:00',
10: '2021-01-11 00:00:00',
11: '2021-01-12 00:00:00',
12: '2021-01-13 00:00:00',
13: '2021-01-14 00:00:00',
14: '2021-01-15 00:00:00',
15: '2021-01-16 00:00:00',
16: '2021-01-17 00:00:00',
17: '2021-01-18 00:00:00',
18: '2021-01-19 00:00:00',
19: '2021-01-20 00:00:00',
20: '2021-01-21 00:00:00',
21: '2021-01-22 00:00:00',
22: '2021-01-23 00:00:00',
23: '2021-01-24 00:00:00',
24: '2021-01-25 00:00:00',
25: '2021-01-26 00:00:00',
26: '2021-01-27 00:00:00',
27: '2021-01-28 00:00:00',
28: '2021-01-29 00:00:00',
29: '2021-01-30 00:00:00',
30: '2021-01-31 00:00:00',
31: '2021-02-01 00:00:00',
32: '2021-02-02 00:00:00',
33: '2021-02-03 00:00:00',
34: '2021-02-04 00:00:00',
35: '2021-02-05 00:00:00',
36: '2021-02-06 00:00:00',
37: '2021-02-07 00:00:00',
38: '2021-02-08 00:00:00',
39: '2021-02-09 00:00:00',
40: '2021-02-10 00:00:00',
41: '2021-02-11 00:00:00',
42: '2021-02-12 00:00:00',
43: '2021-02-13 00:00:00',
44: '2021-02-14 00:00:00',
45: '2021-02-15 00:00:00',
46: '2021-02-16 00:00:00',
47: '2021-02-17 00:00:00',
48: '2021-02-18 00:00:00',
49: '2021-02-19 00:00:00',
50: '2021-02-20 00:00:00',
51: '2021-02-21 00:00:00',
52: '2021-02-22 00:00:00',
53: '2021-02-23 00:00:00',
54: '2021-02-24 00:00:00',
55: '2021-02-25 00:00:00',
56: '2021-02-26 00:00:00',
57: '2021-02-27 00:00:00',
58: '2021-02-28 00:00:00',
59: '2021-03-01 00:00:00',
60: '2021-03-02 00:00:00',
61: '2021-03-03 00:00:00',
62: '2021-03-04 00:00:00',
63: '2021-03-05 00:00:00',
64: '2021-03-06 00:00:00',
65: '2021-03-07 00:00:00',
66: '2021-03-08 00:00:00',
67: '2021-03-09 00:00:00',
68: '2021-03-10 00:00:00',
69: '2021-03-11 00:00:00',
70: '2021-03-12 00:00:00',
71: '2021-03-13 00:00:00',
72: '2021-03-14 00:00:00',
73: '2021-03-15 00:00:00',
74: '2021-03-16 00:00:00',
75: '2021-03-17 00:00:00',
76: '2021-03-18 00:00:00',
77: '2021-03-19 00:00:00',
78: '2021-03-20 00:00:00',
79: '2021-03-21 00:00:00',
80: '2021-03-22 00:00:00',
81: '2021-03-23 00:00:00',
82: '2021-03-24 00:00:00',
83: '2021-03-25 00:00:00',
84: '2021-03-26 00:00:00',
85: '2021-03-27 00:00:00',
86: '2021-03-28 00:00:00',
87: '2021-03-29 00:00:00',
88: '2021-03-30 00:00:00',
89: '2021-03-31 00:00:00',
90: '2021-04-01 00:00:00',
91: '2021-04-02 00:00:00',
92: '2021-04-03 00:00:00',
93: '2021-04-04 00:00:00',
94: '2021-04-05 00:00:00',
95: '2021-04-06 00:00:00',
96: '2021-04-07 00:00:00',
97: '2021-04-08 00:00:00',
98: '2021-04-09 00:00:00',
99: '2021-04-10 00:00:00',
100: '2021-04-11 00:00:00',
101: '2021-04-12 00:00:00',
102: '2021-04-13 00:00:00',
103: '2021-04-14 00:00:00'},
'Col': {1: 818,
2: 712,
3: 684,
4: 642,
5: 630,
6: 273,
7: 327,
8: 365,
9: 486,
10: 890,
11: 1003,
12: 13060,
13: 7456,
14: 2897,
15: 1550,
16: 1018,
17: 1006,
18: 1302,
19: 1239,
20: 5023,
21: 3132,
22: 1448,
23: 1120,
24: 2862,
25: 2042,
26: 1390,
27: 1281,
28: 746,
29: 482,
30: 514,
31: 642,
32: 596,
33: 1002,
34: 869,
35: 1053,
36: 1377,
37: 12167,
38: 10190,
39: 4252,
40: 2648,
41: 2536,
42: 1617,
43: 1776,
44: 1235,
45: 3080,
46: 1299,
47: 2344,
48: 1739,
49: 1028,
50: 525,
51: 1181,
52: 1609,
53: 861,
54: 844,
55: 622,
56: 469,
57: 801,
58: 854,
59: 1875,
60: 1520,
61: 823,
62: 985,
63: 1072,
64: 538,
65: 599,
66: 2332,
67: 1098,
68: 743,
69: 6889,
70: 2107,
71: 916,
72: 617,
73: 634,
74: 589,
75: 566,
76: 2231,
77: 1087,
78: 772,
79: 857,
80: 1080,
81: 660,
82: 649,
83: 956,
84: 704,
85: 680,
86: 532,
87: 539,
88: 608,
89: 2750,
90: 1618,
91: 589,
92: 339,
93: 451,
94: 470,
95: 419,
96: 555,
97: 1152,
98: 890,
99: 512,
100: 411,
101: 407,
102: 773,
103: 2116}})
df2 = pd.DataFrame.from_dict({'Date': {0: '2021-02-07 00:00:00',
1: '2021-02-08 00:00:00',
2: '2021-02-22 00:00:00',
3: '2021-03-01 00:00:00',
4: '2021-03-18 00:00:00',
5: '2021-04-08 00:00:00',
6: '2021-04-12 00:00:00'},
'Titles': {0: '\xa0Interview with Le Journal du Dimanche ',
1: '\xa0European\xa0Parliament plenary\xa0debate on the ECB Annual Report',
2: '\xa0Investing in our climate, social and economic resilience',
3: '\xa0The coronavirus crisis and SMEs',
4: '\xa0Hearing of the Committee on Economic and Monetary Affairs of the European Parliament ',
5: '\xa0IMFC Statement ',
6: '\xa0Interview with CNBC'},
'Author': {0: 'Christine Lagarde',
1: 'Christine Lagarde',
2: 'Christine Lagarde',
3: 'Christine Lagarde',
4: 'Christine Lagarde',
5: 'Christine Lagarde',
6: 'Christine Lagarde'},
'Time': {0: '19:00',
1: '17:15',
2: '15:30',
3: '17:10',
4: '09:00',
5: nan,
6: nan}})
What I want to do is: if Date
in df2
is equal to Date
in df1
, then leave the value of Col
in df1
as it stands; if Date
in df2
is different from Date
in df1
, then set the values of Col
in df1
equal to 0. For example:
# this is the desired output: df1
Date Col
2021-01-01 0
2021-01-02 0
2021-01-03 0
. .
. .
. .
. .
2021-02-07 12167
.... ....
Can anyone help me with that?
Thanks!
Upvotes: 1
Views: 40
Reputation: 14949
Do you want this -
df1.loc[~df1['Date'].isin(df2.Date.to_numpy()), 'Col'] = 0
Will work without to_numpy() -
df1.loc[~df1['Date'].isin(df2.Date), 'Col'] = 0
Upvotes: 3