I'm working with two time series (df1 and df2), and when I try to plot them on the same x-axis with different y-axes, I get unexpected behavior.
The code and data are below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# irregular (business-day) dates for the first series
dates1 = ['2021-08-26', '2021-08-27', '2021-08-30', '2021-08-31',
          '2021-09-01', '2021-09-02', '2021-09-03', '2021-09-07',
          '2021-09-08', '2021-09-09', '2021-09-10', '2021-09-13',
          '2021-09-14', '2021-09-15', '2021-09-16', '2021-09-17',
          '2021-09-20', '2021-09-21', '2021-09-22', '2021-09-23',
          '2021-09-24', '2021-09-27', '2021-09-28', '2021-09-29',
          '2021-09-30', '2021-10-01', '2021-10-04', '2021-10-05',
          '2021-10-06', '2021-10-07', '2021-10-08']
# weekly dates for the second series
dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19',
          '2021-09-26']
# random walks as sample y data
y1 = np.random.randn(len(dates1)).cumsum()
y2 = np.random.randn(len(dates2)).cumsum()
df1 = pd.DataFrame({'date': pd.to_datetime(dates1), 'y1': y1})
df1.set_index('date', inplace=True)
df2 = pd.DataFrame({'date': pd.to_datetime(dates2), 'y2': y2})
df2.set_index('date', inplace=True)
When plotting the two datasets together either I see no plot (first plot) or I see the y data resampled in some way I don't understand (second plot). If I plot the data separately there is no issue (third & fourth plots).
fig, axs = plt.subplots(1,4, figsize=[12,4])
df1.plot(ax=axs[0])
df2.plot(ax=axs[0], secondary_y=True)
df2.plot(ax=axs[1])
df1.plot(ax=axs[1], secondary_y=True)
df1.y1.plot(ax=axs[2])
df2.y2.plot(ax=axs[3])
plt.tight_layout()
Upvotes: 2
Views: 2608
The span of dates2 is less than one month. As you can see on the plots made with pandas.DataFrame.plot, when the span is less than a month, the tick format is different. If dates2 spans at least a month, the issue doesn't occur (e.g. dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19', '2021-09-26', '2021-09-29']).
secondary_y=True affects how pandas manages the ticks, because axs[0] plots correctly if secondary_y=True is removed.
df1 will work if df2 is plotted first, as in axs[1], but df2 won't work when df1 is plotted first.
fig, axs = plt.subplots(1, 4, figsize=[15, 6], sharey=False, sharex=False)
axs = axs.flatten()
df1.plot(ax=axs[0])
print(f'axs[0]: {axs[0].get_xticks()}')
ax4 = axs[0].twiny()
df2.plot(ax=ax4, color='tab:orange')
print(f'ax4: {ax4.get_xticks()}')
df2.plot(ax=axs[1], color='tab:orange')
print(f'axs[1]: {axs[1].get_xticks()}')
df1.plot(ax=axs[1], secondary_y=True)
print(f'axs[1]: {axs[1].get_xticks()}')
df1.y1.plot(ax=axs[2])
print(f'axs[2]: {axs[2].get_xticks()}')
df2.y2.plot(ax=axs[3])
print(f'axs[3]: {axs[3].get_xticks()}')
plt.tight_layout()
[output]:
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
ax4: [2696 2697 2700]
axs[1]: [2696 2697 2700] # after plotting df2
axs[1]: [2696 2697 2701 2702] # after plotting df1
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [2696 2697 2700]
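The two tick scales in the output above can be decoded by hand. The following is a rough sketch, not part of the original answer: it assumes the values near 18871 are days since 1970-01-01, and that the values near 2696 come from pandas treating the weekly df2 index as weekly periods. The names day_ticks, idx2 and weekly_ordinals are made up for illustration.

```python
import pandas as pd

# Tick values copied from the axs[0] output above, read as days since 1970-01-01.
day_ticks = [18871, 18878, 18885, 18892, 18901, 18908]
decoded = pd.to_datetime(day_ticks, unit="D")
print(decoded)

# The df2 dates are evenly spaced, so pandas can infer a weekly frequency.
dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19', '2021-09-26']
idx2 = pd.DatetimeIndex(dates2)
print(idx2.inferred_freq)

# Assumption: the small tick values are weekly period ordinals (weeks counted
# from 1970), which is a different unit from the day-based values above.
weekly_ordinals = idx2.to_period(idx2.inferred_freq).asi8
print(weekly_ordinals)
```

If this reading is right, the two series end up on x-axes with different units (days versus weeks), which would explain why one of them lands outside the visible range when both are drawn on shared axes.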
Note the difference in the printed xticks, which are the locations on the axis for each tick.
matplotlib.pyplot.plot treats the datetime index of both dataframes the same.
fig, axs = plt.subplots(2, 2, figsize=[20, 12], sharey=False, sharex=False)
axs = axs.flatten()
axs[0].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[0]: {axs[0].get_xticks()}')
ax4 = axs[0].twinx()
ax4.plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'ax4: {ax4.get_xticks()}')
axs[1].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[1]: {axs[1].get_xticks()}')
ax5 = axs[1].twinx()
ax5.plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'ax5: {ax5.get_xticks()}')
axs[2].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[2]: {axs[2].get_xticks()}')
axs[3].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[3]: {axs[3].get_xticks()}')
[output]:
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
ax4: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[1]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
ax5: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
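A possible workaround that stays within pandas plotting, offered as a sketch rather than as something taken from the answer above, is the x_compat=True option of DataFrame.plot, which is documented as suppressing pandas' automatic tick resolution adjustment. The data below is stand-in data shaped like the question's (daily-ish and weekly), and the names idx1, idx2, ax and ax_right are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data: business-day series and a weekly series spanning under a month.
idx1 = pd.bdate_range("2021-08-26", "2021-10-08")
idx2 = pd.to_datetime(['2021-08-29', '2021-09-05', '2021-09-12',
                       '2021-09-19', '2021-09-26'])
df1 = pd.DataFrame({"y1": np.random.randn(len(idx1)).cumsum()}, index=idx1)
df2 = pd.DataFrame({"y2": np.random.randn(len(idx2)).cumsum()}, index=idx2)

fig, ax = plt.subplots(figsize=(8, 4))
# x_compat=True asks pandas to skip its frequency-based x-axis handling,
# so both series should be placed using ordinary matplotlib date values.
df1.plot(ax=ax, x_compat=True)
ax_right = df2.plot(ax=ax, secondary_y=True, x_compat=True, color="tab:orange")
fig.tight_layout()

# Both axes should report the same day-based x-limits.
print(ax.get_xlim(), ax_right.get_xlim())
```

Whether this fully avoids the problem may depend on the pandas and matplotlib versions in use; plotting with matplotlib directly, as in the 2x2 example above, remains the more explicit route.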
Upvotes: 3