Reputation: 3241
I am working with a dataframe containing data of 1 week.
y
ds
2017-08-31 10:15:00 1.000000
2017-08-31 10:20:00 1.049107
2017-08-31 10:25:00 1.098214
...
2017-09-07 10:05:00 99.901786
2017-09-07 10:10:00 99.950893
2017-09-07 10:15:00 100.000000
I create a new index by combining the weekday and time i.e.
y
dayIndex
4 - 10:15 1.000000
4 - 10:20 1.049107
4 - 10:25 1.098214
...
4 - 10:05 99.901786
4 - 10:10 99.950893
4 - 10:15 100.000000
The plot of this data is the following: The plot is correct as the labels reflect the data in the dataframe. However, when zooming in, the labels do not seem correct as they no longer correspond to their original values: What is causing this behavior?
Here is the code to reproduce this:
import datetime
import numpy as np
import pandas as pd
dtnow = datetime.datetime.now()
dindex = pd.date_range(dtnow , dtnow + datetime.timedelta(7), freq='5T')
data = np.linspace(1,100, num=len(dindex))
df = pd.DataFrame({'ds': dindex, 'y': data})
df = df.set_index('ds')
df = df.resample('5T').mean()
df['dayIndex'] = df.index.strftime('%w - %H:%M')
df= df.set_index('dayIndex')
df.plot()
Upvotes: 2
Views: 1533
Reputation: 339052
"What is causing this behavior?"
The formatter of an axes of a pandas dates plot is a matplotlib.ticker.FixedFormatter
(see e.g.
print plt.gca().xaxis.get_major_formatter()
). "Fixed" means that it formats the i
th tick (if shown) with some constant string.
When zooming or panning, you shift the tick locations, but not the format strings.
In short: A pandas date plot may not be the best choice for interactive plots.
Solution
A solution is usually to use matplotlib formatters directly. This requires the dates to be datetime
objects (which can be ensured using df.index.to_pydatetime()
).
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates
dtnow = datetime.datetime.now()
dindex = pd.date_range(dtnow , dtnow + datetime.timedelta(7), freq='110T')
data = np.linspace(1,100, num=len(dindex))
df = pd.DataFrame({'ds': dindex, 'y': data})
df = df.set_index('ds')
df.index.to_pydatetime()
df.plot(marker="o")
plt.gca().xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%w - %H:%M'))
plt.show()
Upvotes: 2