Reputation: 1494
I have two dataframes:
a.head()
AAPL SPY date
0 1000000.000000 1000000.000000 2010-01-04
1 921613.643818 969831.805642 2010-02-04
2 980649.393244 1000711.933790 2010-03-04
3 980649.393244 1000711.933790 2010-04-04
4 1232535.257461 1059090.504583 2010-05-04
and
b.head()
date test
0 2010-01-26 22:17:44 990482.664854
1 2010-03-09 22:37:17 998565.699784
2 2010-03-12 02:11:23 989957.374785
3 2010-04-05 18:01:37 994315.860439
4 2010-04-06 11:06:50 987887.723816
After I set the index for a and b (set_index('date')), I can use the pandas plot() function to create a nice plot with the date as the x-axis and the various columns as y-values. What I want to do is plot two dataframes with different indices on the same figure. As you can see from a and b, the indices are different, and I want to plot them on the same figure.
I tried merge and concat to join the dataframes together, but the resulting plot is not what I'd like because those functions insert numpy.NaN wherever the dates do not match, which creates discontinuities in my plots. I could use fillna(), but that is not what I want either: I'd rather the plot simply connect the points than drop down to 0.
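To illustrate the problem, here is a minimal sketch using a few of the values shown above. The variable names combined and gaps are only for illustration, and the dates in a are assumed to parse as year-month-day.

```python
import pandas as pd

# Two small frames indexed by timestamps that never coincide
a = pd.DataFrame(
    {"AAPL": [1000000.0, 921613.643818]},
    index=pd.to_datetime(["2010-01-04", "2010-02-04"]),
)
b = pd.DataFrame(
    {"test": [990482.664854]},
    index=pd.to_datetime(["2010-01-26 22:17:44"]),
)

# Column-wise concat aligns on the union of both indices;
# every timestamp present in only one frame yields NaN in the other's columns
combined = pd.concat([a, b], axis=1)

# Count the missing values per column
gaps = combined.isna().sum()
```

Each column ends up with NaN at the timestamps that belong only to the other frame, and those are the points where the plotted lines break.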
Upvotes: 0
Views: 693
Reputation: 109520
Assuming you want the same time scale on the x-axis, you will need timestamps as the index for both a and b before concatenating the columns.
You can then use interpolation to fill in the missing data, optionally followed by ffill() if you want to fill forward past the last observed data point.
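import pandas as pd
# Assumes a is already indexed by timestamp and b still has 'date' as a column
df = pd.concat([a, b.set_index('date')], axis=1)
df.interpolate(method='time').plot()  # or: df.interpolate(method='time').ffill().plot()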
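For a version that can be run on its own, here is a minimal sketch built from a few rows of the question's data. It assumes the dates in a are meant as year-month-day and that both frames get a DatetimeIndex before concatenation; the name filled is only for illustration.

```python
import pandas as pd

# Sample rows copied from the question; parsing the dates in `a` as ISO
# year-month-day is an assumption about the intended format.
a = pd.DataFrame({
    "AAPL": [1000000.000000, 921613.643818, 980649.393244],
    "SPY": [1000000.000000, 969831.805642, 1000711.933790],
    "date": pd.to_datetime(["2010-01-04", "2010-02-04", "2010-03-04"]),
}).set_index("date")

b = pd.DataFrame({
    "date": pd.to_datetime(["2010-01-26 22:17:44", "2010-03-09 22:37:17"]),
    "test": [990482.664854, 998565.699784],
}).set_index("date")

# Align the columns on the union of both indices; unmatched dates get NaN.
df = pd.concat([a, b], axis=1).sort_index()

# Fill gaps by interpolating linearly in time, then forward-fill anything left.
filled = df.interpolate(method="time").ffill()
```

Calling filled.plot() should then draw all three columns against one shared date axis. Values before a column's first observation are not filled by these two steps, so the test line would simply start at its first timestamp.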
Upvotes: 1