Reputation: 142
I have a pandas dataframe which contains some sar output that I would like to plot in matplotlib. Sample data is below.
>>> cpu_data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 70 entries, 0 to 207
Data columns (total 8 columns):
00:00:01 70 non-null datetime64[ns]
CPU 70 non-null object
%user 70 non-null float64
%nice 70 non-null float64
%system 70 non-null float64
%iowait 70 non-null float64
%steal 70 non-null float64
%idle 70 non-null float64
dtypes: float64(6), object(2)
memory usage: 4.4+ KB
>>> cpu_data
00:00:01 CPU %user %nice %system %iowait %steal %idle
0 00:10:01 all 0.30 0.00 0.30 0.06 0.0 99.34
3 00:20:01 all 0.09 0.00 0.13 0.00 0.0 99.78
6 00:30:01 all 0.07 0.00 0.11 0.00 0.0 99.81
9 00:40:01 all 0.08 0.00 0.11 0.00 0.0 99.80
12 00:50:01 all 0.08 0.00 0.13 0.00 0.0 99.79
15 01:00:04 all 0.09 0.00 0.13 0.00 0.0 99.77
18 01:10:01 all 0.27 0.00 0.28 0.00 0.0 99.46
21 01:20:01 all 0.09 0.00 0.11 0.00 0.0 99.79
24 01:30:04 all 0.12 0.00 0.13 0.01 0.0 99.74
27 01:40:01 all 0.08 0.00 0.11 0.01 0.0 99.80
30 01:50:01 all 0.09 0.00 0.13 0.01 0.0 99.77
I want to plot using the timestamps as the x-axis. I have written the following code.
import pandas as pd
import os
import matplotlib.pyplot as plt
import matplotlib.dates as md
import dateutil.parser
cpu_data[cpu_data.columns[0]] = [dateutil.parser.parse(s) for s in cpu_data[cpu_data.columns[0]]]
plt.subplots_adjust(bottom=0.2)
plt.xticks( rotation=25 )
ax=plt.gca()
ax.xaxis_date()
xfmt = md.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(xfmt)
cpu_data.plot(ax=ax)
plt.show()
But I get the following error:
ValueError: view limit minimum -5.1000000000000005 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
This doesn't make any sense to me, because I manually converted all of the timestamp strings to datetime objects:
cpu_data[cpu_data.columns[0]] = [dateutil.parser.parse(s) for s in cpu_data[cpu_data.columns[0]]]
But they don't appear to be the correct data type:
2018-09-30 00:10:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-09-30 00:20:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-09-30 00:30:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-09-30 00:40:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-09-30 00:50:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-09-30 01:00:01 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
I have no idea how to fix this. I have tried manually setting the x-axis to start on a datetime value using plt.xlim(cpu_data[cpu_data.columns[0]].iloc[0]), but this produces the same error. I am really lost here. Any guidance would be appreciated. I can provide more information if it would help.
EDIT:
I think the dates are not the correct data type (as indicated by the error). It seems that pandas keeps converting the data in the time column (column 0) to objects of type pandas._libs.tslibs.timestamps.Timestamp. I think they should be datetime objects, which is what matplotlib seems to be complaining about.
Upvotes: 5
Views: 7314
Reputation: 142
For those interested, this is how I ended up plotting the data using matplotlib directly:
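# Plot cpu
# Assumes matplotlib.pyplot is imported as plt and matplotlib.dates as md (as in the question).
# `dates` is presumably the datetime values from the first column; `remote_host` is defined elsewhere.
plt.figure(1)
plt.subplots_adjust(bottom=0.2)
plt.xticks(rotation=25)
ax = plt.gca()
ax.xaxis_date()
xfmt = md.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(xfmt)
plt.title(f'CPU usage on {remote_host}')
# Pass the dates explicitly as the x values and the metric columns as y
lines = plt.plot(dates, cpu_data[cpu_data.columns[2:]])
ax.legend(lines, [str(col) for col in cpu_data.columns[2:]])
plt.show()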
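In case it helps anyone reproduce this, here is a minimal self-contained sketch of the same approach. It is not the exact script I ran: the DataFrame is rebuilt by hand from the first three sample rows in the question, the date 2018-09-30 is an assumption taken from the printed timestamps, and the names time_col and value_cols are made up for illustration. The Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as md
import pandas as pd

# Hand-built stand-in for the sar output (first three sample rows from the question).
cpu_data = pd.DataFrame({
    "00:00:01": ["00:10:01", "00:20:01", "00:30:01"],
    "CPU": ["all", "all", "all"],
    "%user": [0.30, 0.09, 0.07],
    "%nice": [0.00, 0.00, 0.00],
    "%system": [0.30, 0.13, 0.11],
    "%iowait": [0.06, 0.00, 0.00],
    "%steal": [0.0, 0.0, 0.0],
    "%idle": [99.34, 99.78, 99.81],
})

time_col = cpu_data.columns[0]          # hypothetical helper name
value_cols = list(cpu_data.columns[2:])  # hypothetical helper name

# Assumption: the sar times belong to 2018-09-30; prepend that date before parsing.
dates = pd.to_datetime("2018-09-30 " + cpu_data[time_col])

fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2)

# Pass the datetimes explicitly as x; each metric column becomes one line.
lines = ax.plot(dates.to_numpy(), cpu_data[value_cols].to_numpy())

# Set the formatter after plotting so it replaces the default date formatter.
ax.xaxis.set_major_formatter(md.DateFormatter("%H:%M:%S"))
ax.tick_params(axis="x", labelrotation=25)
ax.legend(lines, value_cols)

fig.canvas.draw()  # render the figure; use plt.show() in an interactive session
```

The main difference from the code in the question is that the timestamps are handed to matplotlib explicitly as the x values, rather than relying on cpu_data.plot(ax=ax) to pick an x-axis.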
Upvotes: 2