David

Reputation: 1494

Plotting dataframes on same plot

I have two dataframes:

a.head()

             AAPL             SPY       date
0  1000000.000000  1000000.000000 2010-01-04
1   921613.643818   969831.805642 2010-02-04
2   980649.393244  1000711.933790 2010-03-04
3   980649.393244  1000711.933790 2010-04-04
4  1232535.257461  1059090.504583 2010-05-04

and

b.head()

                 date           test
0 2010-01-26 22:17:44  990482.664854
1 2010-03-09 22:37:17  998565.699784
2 2010-03-12 02:11:23  989957.374785
3 2010-04-05 18:01:37  994315.860439
4 2010-04-06 11:06:50  987887.723816

After setting date as the index of both a and b (set_index('date')), I can use the pandas plot() function to get a nice plot of each one, with the date on the x-axis and the columns as y-values. What I want is to plot both dataframes on the same figure, even though, as a and b show, their indices are different.

I tried merge and concat to join the dataframes, but the resulting plot is not what I want: those functions insert numpy.NaN wherever a date appears in only one dataframe, which leaves discontinuities in the plotted lines. I could use fillna(0), but then the lines drop down to 0 at those dates. I would rather the plot simply connect the existing points.

Upvotes: 0

Views: 693

Answers (1)

Alexander

Reputation: 109520

Assuming you want the same time scale on the x-axis, you will need timestamps as the index for both a and b before concatenating the columns.

You can then use interpolation to fill in the missing data, optionally with ffill() as an additional operation if you want to fill forward past the last observed data point.
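As a small illustration of what time-based interpolation does, here is a sketch on a toy series. The dates and values are made up for the example and are not taken from the question. With method='time', a missing value is weighted by how far its timestamp sits between its neighbours, whereas the default linear method treats the rows as equally spaced.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): two observations four days apart,
# with a missing value one day after the first observation.
idx = pd.to_datetime(["2010-01-01", "2010-01-02", "2010-01-05"])
s = pd.Series([0.0, np.nan, 4.0], index=idx)

# Weighted by elapsed time: the gap sits 1/4 of the way along.
by_time = s.interpolate(method="time")

# Default linear method ignores the index and assumes equal spacing.
by_position = s.interpolate(method="linear")

print(by_time)
print(by_position)
```

For irregular timestamps like those in b, the time-weighted version is usually the one that matches what a line plot would draw between the observed points.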

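import pandas as pd
# Assumes a and b both still have 'date' as a column, as shown in the question.
df = pd.concat([a.set_index('date'), b.set_index('date')], axis=1)  # union of both sets of timestamps
df.interpolate(method='time').plot()  # or .interpolate(method='time').ffill().plot() to fill forward as well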
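If the goal is only to draw both frames on one figure, another option (a different technique from the concatenation above) is to skip the join entirely and plot each frame onto the same matplotlib Axes. The sketch below rebuilds the first rows shown in the question. Passing x_compat=True is a precaution, on the assumption that the two indices have different spacing, so that pandas uses plain matplotlib date handling for both calls.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First rows of the two frames from the question, indexed by date.
a = pd.DataFrame({
    "AAPL": [1000000.000000, 921613.643818, 980649.393244,
             980649.393244, 1232535.257461],
    "SPY": [1000000.000000, 969831.805642, 1000711.933790,
            1000711.933790, 1059090.504583],
    "date": pd.to_datetime(["2010-01-04", "2010-02-04", "2010-03-04",
                            "2010-04-04", "2010-05-04"]),
}).set_index("date")

b = pd.DataFrame({
    "date": pd.to_datetime(["2010-01-26 22:17:44", "2010-03-09 22:37:17",
                            "2010-03-12 02:11:23", "2010-04-05 18:01:37",
                            "2010-04-06 11:06:50"]),
    "test": [990482.664854, 998565.699784, 989957.374785,
             994315.860439, 987887.723816],
}).set_index("date")

# Draw both frames on one shared Axes; each line only connects its
# own observed points, so no NaN gaps are introduced.
fig, ax = plt.subplots()
a.plot(ax=ax, x_compat=True)
b.plot(ax=ax, x_compat=True)
# plt.show()  # uncomment when running interactively
```

Because no rows are merged, nothing needs to be interpolated. The trade-off is that the data stay in two separate frames.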

Upvotes: 1
