Alberto

Reputation: 145

Plotting two pandas time-series on the same axes with matplotlib - unexpected behavior

I'm working with two time series (df1 and df2), and when I try to plot them on the same x-axis with different y-axes, I get unexpected behavior.

The code and data are below.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates1 = ['2021-08-26', '2021-08-27', '2021-08-30', '2021-08-31',
               '2021-09-01', '2021-09-02', '2021-09-03', '2021-09-07',
               '2021-09-08', '2021-09-09', '2021-09-10', '2021-09-13',
               '2021-09-14', '2021-09-15', '2021-09-16', '2021-09-17',
               '2021-09-20', '2021-09-21', '2021-09-22', '2021-09-23',
               '2021-09-24', '2021-09-27', '2021-09-28', '2021-09-29',
               '2021-09-30', '2021-10-01', '2021-10-04', '2021-10-05',
               '2021-10-06', '2021-10-07', '2021-10-08']


dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19',
               '2021-09-26']

y1 = np.random.randn(len(dates1)).cumsum()
y2 = np.random.randn(len(dates2)).cumsum()

df1 = pd.DataFrame({'date':pd.to_datetime(dates1), 'y1':y1})
df1.set_index('date', inplace=True)

df2 = pd.DataFrame({'date':pd.to_datetime(dates2), 'y2':y2})
df2.set_index('date', inplace=True)

When I plot the two datasets together, I either see no plot (first plot) or the y data appears to be resampled in a way I don't understand (second plot). If I plot the data separately, there is no issue (third and fourth plots).

fig, axs = plt.subplots(1,4, figsize=[12,4])

df1.plot(ax=axs[0])
df2.plot(ax=axs[0], secondary_y=True)

df2.plot(ax=axs[1])
df1.plot(ax=axs[1], secondary_y=True)

df1.y1.plot(ax=axs[2])
df2.y2.plot(ax=axs[3])

plt.tight_layout()


Upvotes: 2

Views: 2608

Answers (1)

Trenton McKinney

Reputation: 62373

  • pandas bug: #43972
  • The issue is how pandas deals with the xticks for different spans of datetimes.
    • Currently dates2 spans less than one month. As the plots made with pandas.DataFrame.plot show, the tick format is different when the span is less than a month. If dates2 spans at least a month, the issue doesn't occur (e.g. dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19', '2021-09-26', '2021-09-29']).
  • Using secondary_y=True affects how pandas manages the ticks, because axs[0] plots correctly if secondary_y=True is removed.
    • I don't know why df1 plots correctly when df2 is plotted first (as in axs[1]), but df2 doesn't plot correctly when df1 is plotted first.
fig, axs = plt.subplots(1, 4, figsize=[15, 6], sharey=False, sharex=False)
axs = axs.flatten()

df1.plot(ax=axs[0])
print(f'axs[0]: {axs[0].get_xticks()}')
ax4 = axs[0].twiny()  # twin axes with its own, independent x-axis
df2.plot(ax=ax4, color='tab:orange')
print(f'ax4: {ax4.get_xticks()}')

df2.plot(ax=axs[1], color='tab:orange')
print(f'axs[1]: {axs[1].get_xticks()}')
df1.plot(ax=axs[1], secondary_y=True)
print(f'axs[1]: {axs[1].get_xticks()}')

df1.y1.plot(ax=axs[2])
print(f'axs[2]: {axs[2].get_xticks()}')

df2.y2.plot(ax=axs[3])
print(f'axs[3]: {axs[3].get_xticks()}')

plt.tight_layout()

[output]:
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
ax4: [2696 2697 2700]
axs[1]: [2696 2697 2700]  # after plotting df2
axs[1]: [2696 2697 2701 2702]  # after plotting df1
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [2696 2697 2700]
  • Note the difference in the printed xticks, which are the locations on the axis for each tick.
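
One possible reading of the two number ranges, offered as an assumption rather than something confirmed in the bug report: the values near 18871 look like matplotlib date numbers, which count days, while the values near 2696 look like weekly period ordinals, which pandas may use when it can infer a regular frequency for the index. The sketch below compares the two encodings for the same dates. The names idx1, idx2, freq1, freq2, day_numbers, and week_ordinals are made up for illustration, and idx1 is a shortened copy of dates1.

```python
import matplotlib.dates as mdates
import pandas as pd

# First eight entries of dates1 from the question (shortened for brevity)
idx1 = pd.to_datetime(['2021-08-26', '2021-08-27', '2021-08-30', '2021-08-31',
                       '2021-09-01', '2021-09-02', '2021-09-03', '2021-09-07'])
# dates2 from the question: five consecutive Sundays
idx2 = pd.to_datetime(['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19',
                       '2021-09-26'])

# Ask pandas whether each index has a regular frequency
freq1 = pd.infer_freq(idx1)  # irregular spacing, so no frequency is found
freq2 = pd.infer_freq(idx2)  # evenly spaced weekly dates
print(freq1, freq2)

# matplotlib date numbers count days: consecutive Sundays are 7 apart
day_numbers = mdates.date2num(idx2.to_pydatetime())
print(day_numbers)

# weekly period ordinals count weeks: consecutive Sundays are 1 apart
week_ordinals = [p.ordinal for p in idx2.to_period(freq2)]
print(week_ordinals)
```

If this reading is right, the two encodings differ by roughly a factor of seven, which would explain why data in one encoding falls far outside the visible range of an axis scaled for the other.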

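  • Plotting the same data directly with matplotlib gives tick locations in the same units on every axes, as the output below shows.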


fig, axs = plt.subplots(2, 2, figsize=[20, 12], sharey=False, sharex=False)
axs = axs.flatten()

axs[0].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[0]: {axs[0].get_xticks()}')
ax4 = axs[0].twinx()
ax4.plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'ax4: {ax4.get_xticks()}')

axs[1].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[1]: {axs[1].get_xticks()}')
ax5 = axs[1].twinx()
ax5.plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'ax5: {ax5.get_xticks()}')

axs[2].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[2]: {axs[2].get_xticks()}')
axs[3].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[3]: {axs[3].get_xticks()}')

[output]:
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
ax4: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[1]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
ax5: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
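
The matplotlib calls above sidestep pandas' own tick handling. A possible alternative that stays within DataFrame.plot is the x_compat=True option, which pandas documents as a way to suppress its automatic tick resolution adjustment. The sketch below is a minimal example under that assumption; it has not been checked against the linked bug across pandas versions. The seeded generator and the bdate_range stand-in for dates1 are illustrative choices, not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for dates1: business days in the same range, minus 2021-09-06
dates1 = pd.bdate_range('2021-08-26', '2021-10-08').drop(pd.Timestamp('2021-09-06'))
dates2 = pd.to_datetime(['2021-08-29', '2021-09-05', '2021-09-12',
                         '2021-09-19', '2021-09-26'])

rng = np.random.default_rng(0)  # seeded only so the figure is reproducible
df1 = pd.DataFrame({'y1': rng.standard_normal(len(dates1)).cumsum()}, index=dates1)
df2 = pd.DataFrame({'y2': rng.standard_normal(len(dates2)).cumsum()}, index=dates2)

fig, ax = plt.subplots(figsize=[6, 4])

# x_compat=True asks pandas not to apply its time-series tick adjustment
df1.plot(ax=ax, x_compat=True)
df2.plot(ax=ax, secondary_y=True, x_compat=True, color='tab:orange')

# Compare the tick locations with the data expressed as matplotlib date numbers
print(ax.get_xticks())
print(mdates.date2num(dates1.to_pydatetime())[[0, -1]])

fig.tight_layout()
```

If the printed ticks fall in the same numeric range as the date numbers, both series are being placed on a common day-based x-axis.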


Upvotes: 3
