I have two pandas DataFrames like this:

date value
20100101 100
20100102 150

date value
20100102 150.01
20100103 180

The expected output should be:
date value
20100101 100
20100102 150
20100103 180
The second DataFrame always contains the newest values, which I'd like to add to the first DataFrame. However, the value for the same date may differ slightly between the two DataFrames. I would like to ignore dates that already exist in the first DataFrame and only add the rows with new dates.
I've tried an outer join in pandas, but it gives me two columns, value_x and value_y, because the values are not exactly the same on the shared dates. Is there a solution to this?
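For reference, the outer join attempt described above can be reproduced roughly as follows. This is only a sketch: the question does not show the actual call, so joining on the date column via `on="date"` is an assumption, and the frames are rebuilt from the sample data.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"date": [20100101, 20100102], "value": [100, 150]})
df2 = pd.DataFrame({"date": [20100102, 20100103], "value": [150.01, 180]})

# Assumed form of the attempted join: outer merge on the date column.
# Both frames have a non-key column named "value", so pandas adds the
# default suffixes "_x" (left) and "_y" (right) to tell them apart.
merged = pd.merge(df1, df2, on="date", how="outer")
print(merged.columns.tolist())
```

Each date ends up on one row, but the values from the two frames stay in separate columns instead of being combined into a single value column.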
Upvotes: 1
I believe you need concat with drop_duplicates:
# keep='last': for a date present in both, keep the row from df2
df = pd.concat([df1, df2]).drop_duplicates('date', keep='last')
print(df)
date value
0 20100101 100.00
0 20100102 150.01
1 20100103 180.00
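# keep='first': for a date present in both, keep the row from df1 (this matches the expected output)
df = pd.concat([df1, df2]).drop_duplicates('date', keep='first')
print(df)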
date value
0 20100101 100.0
1 20100102 150.0
1 20100103 180.0
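A self-contained sketch of the keep='first' variant, with the frames rebuilt from the question's sample data. The `reset_index(drop=True)` call is an addition not in the snippets above; it replaces the repeated index labels left over from the two inputs. The `isin` filter at the end is an alternative approach, not part of the original answer, that appends only the rows of df2 whose dates are missing from df1.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"date": [20100101, 20100102], "value": [100, 150]})
df2 = pd.DataFrame({"date": [20100102, 20100103], "value": [150.01, 180]})

# Stack df1 on top of df2, then keep the first row seen for each date.
# Because df1 comes first, its value wins for dates present in both.
result = (
    pd.concat([df1, df2])
    .drop_duplicates("date", keep="first")
    .reset_index(drop=True)  # replace the repeated 0, 1, 1 labels with 0, 1, 2
)
print(result)

# Alternative (assumption, not from the answer): select only the rows of df2
# whose date does not already appear in df1, then append them.
new_rows = df2[~df2["date"].isin(df1["date"])]
alt = pd.concat([df1, new_rows]).reset_index(drop=True)
print(alt)
```

Both variants depend on df1 being listed first in the concat. The value column comes out as float because df2 contains a non-integer value.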
Upvotes: 3