Reputation: 1754
Input
df1
id date v1
a 2020-1-1 1
a 2020-1-2 2
b 2020-1-4 10
b 2020-1-22 30
c 2020-2-4 10
c 2020-2-22 30
df2
id date v1
a 2020-1-3 1
b 2020-1-7 12
b 2020-1-22 13
c 2020-2-10 15
c 2020-2-22 60
Goal
id date v1 v2
a 2020-1-1 1 0
a 2020-1-2 2 0
a 2020-1-3 0 1
b 2020-1-4 10 0
b 2020-1-7 0 12
b 2020-1-22 30 13
c 2020-2-4 10 0
c 2020-2-10 0 15
c 2020-2-22 30 60
The details:
id
, the date is unique.df
based on id
, each id
contains all date values from two dataframedf1
and df2
, it returns original values, while the date only in one of df1
and df2
, it returns original value and 0 if there is no value on the date.Try
I have searched merge, concat document but I could not find the answers.
Upvotes: 1
Views: 33
Reputation: 862671
First convert columns to datetimes for correct ordering by to_datetime
, then DataFrame.merge
with outer join and rename column v1
for df2
for avoid v1_x
and v1_y
columns in output, replace missing values by DataFrame.fillna
, sorting output by DataFrame.sort_values
:
df1['date'] = pd.to_datetime(df1['date'])
df2['date'] = pd.to_datetime(df2['date'])
df = (df1.merge(df2.rename(columns={'v1':'v2'}), on=['id','date'], how='outer')
.fillna(0)
.sort_values(['id','date']))
print (df)
id date v1 v2
0 a 2020-01-01 1.0 0.0
1 a 2020-01-02 2.0 0.0
6 a 2020-01-03 0.0 1.0
2 b 2020-01-04 10.0 0.0
7 b 2020-01-07 0.0 12.0
3 b 2020-01-22 30.0 13.0
4 c 2020-02-04 10.0 0.0
8 c 2020-02-10 0.0 15.0
5 c 2020-02-22 30.0 60.0
Upvotes: 1