Reputation: 1349
I want to plot a dataframe using matplotlib.
Why do I get an error here when plotting the dataframe?
The ds column should only contain datetime values. This is at least what I expect.
import pandas as pd
import numpy as np
import datetime
import matplotlib.pyplot as plt
np.random.seed(42)
start = datetime.datetime(2000, 1, 1, 0, 0, 1)
ds = start
value = 10.0
df = pd.DataFrame(columns=["ds", "y"])
for runner in range(5):
    df.loc[len(df)] = [ds, value]
    value = value * (1 + np.random.normal(0, 0.01)) + 1
    ds = ds + datetime.timedelta(minutes=1)
df.head()
                   ds          y
0 2000-01-01 00:00:01  10.000000
1 2000-01-01 00:01:01  11.049671
2 2000-01-01 00:02:01  12.034394
3 2000-01-01 00:03:01  13.112339
4 2000-01-01 00:04:01  14.312044
plt.plot(df, "-o", markersize=2)
plt.show()
The end of the stacktrace shows:
`File "/home/user/anaconda3/lib/python3.6/site-packages/matplotlib/dates.py", line 1026, in viewlim_to_dt
.format(vmin))
ValueError: view limit minimum -36495.50013946759 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units`
Upvotes: 1
Views: 403
Reputation: 339340
It's not really clear what `plot` is supposed to plot when you supply a single argument that is a multi-column dataframe. So it interprets this as plotting each column as a function of the dataframe index. Your first column contains datetimes, your second column contains floats. I don't think it makes sense to plot them on the same scale. Instead you probably want to use the first column as the x axis and the second as the y axis values.
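To see the single-argument behaviour in isolation, here is a minimal sketch with a made-up, purely numeric frame. The column names a and b and the index values are only for illustration and are not from the question, and the Agg backend is an assumption so that the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical two-column frame with a non-default index
demo = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=[5, 6, 7])

# A single positional argument: every column becomes its own line,
# and the index supplies the x values for all of them
lines = plt.plot(demo, "-o", markersize=2)

print(len(lines))                  # one Line2D per column
print(list(lines[0].get_xdata()))  # x values taken from the index
print(list(lines[1].get_ydata()))  # y values of column "b"
```

Applied to the frame in the question, the same rule puts the datetimes and the floats on one shared y axis, which is why using ds for the x axis is the way out.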
You could do so by making the first column the index (effectively plotting a single column):
plt.plot(df.set_index("ds"), "-o", markersize=2)
Or you could supply each column as one of the first two arguments of `plot`:
plt.plot(df.ds, df.y, "-o", markersize=2)
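For completeness, a self-contained sketch of this variant. It rebuilds the data from the question, except that the timestamps come from pd.date_range instead of the row-by-row loop. That is my substitution, made so that ds is a proper datetime column from the start. The Agg backend is likewise an assumption so the snippet runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(42)

# Same recurrence as in the question: 5 values, starting at 10.0
values = [10.0]
for _ in range(4):
    values.append(values[-1] * (1 + np.random.normal(0, 0.01)) + 1)

# One timestamp per minute, starting at 2000-01-01 00:00:01
df = pd.DataFrame({
    "ds": pd.date_range("2000-01-01 00:00:01", periods=5, freq="min"),
    "y": values,
})

fig, ax = plt.subplots()
# x and y are passed explicitly, so only the x axis gets datetime units
(line,) = ax.plot(df["ds"], df["y"], "-o", markersize=2)
fig.autofmt_xdate()  # tilt the tick labels so the timestamps do not overlap
fig.canvas.draw()    # render once; use plt.show() in an interactive session
```

Because ds is only ever used for x, the float column never meets the date converter and the view-limit error should not come up.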
You can also use pandas directly for plotting, resulting in a slightly different x axis formatting
df.set_index("ds").plot(marker="o", markersize=2)
Upvotes: 2