Reputation: 1151
I have a log which records my home ADSL speeds. Log entries are in the following format, where the fields are datetime;level;downspeed;upspeed;loss;testhost:
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
(A full sample file to use with the code below can be downloaded from https://www.dropbox.com/s/tfmj9ozxe5millx/test.log?dl=0)
I wish to plot a matplotlib figure with the download speeds on the left axis, the upload speeds (which are in a smaller and lower range of values) on the right axis, and the shortened datetimes under the x tick marks, possibly at a 45 degree angle.
"""Plots the adsl-log generated log."""
import matplotlib.pyplot as plt
# import matplotlib.dates as mdates
import pandas as pd
# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv("test.log", sep=';', names=[
'datetime', 'severity', 'down', 'up', 'loss', 'server'])
# we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)
# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])
# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]
speeds_df.info() # this shows datetime column is really a datetime64 value now
# now let's plot
fig, ax = plt.subplots()
y1 = speeds_df.plot(ax=ax, x='datetime', y='down', grid=True, label="DL", legend=True, linewidth=2,ylim=(100,225))
y2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, ylim=(100,225))
plt.show()
I am now obtaining the plot I need but would appreciate some clarification about the roles of the ax, y1 and y2 axes in the above code.
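For the shortened, rotated datetime labels mentioned above, the following is only a minimal sketch using plain matplotlib rather than DataFrame.plot. It assumes the three sample log lines shown earlier; the format string '%d/%m %H:%M' and the colors are arbitrary choices, not something the log or the original code prescribes.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# the three sample entries from the question
SAMPLE = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=';',
                 names=['datetime', 'severity', 'down', 'up', 'loss', 'server'],
                 parse_dates=['datetime'])

fig, ax = plt.subplots()
ax.plot(df['datetime'], df['down'], color='tab:red', label='DL')

# twinx() creates a second Axes sharing the x-axis, with its own y scale on the right
ax2 = ax.twinx()
ax2.plot(df['datetime'], df['up'], color='tab:blue', label='UL')

# shortened datetime labels (assumed format), rotated by 45 degrees
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d/%m %H:%M'))
ax.tick_params(axis='x', labelrotation=45)
fig.tight_layout()
```

Call plt.show() (with an interactive backend) or fig.savefig(...) to see the result.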
Upvotes: 1
Views: 3734
Reputation: 107567
First, assigning the y1 and y2 objects is unnecessary because you never use them later on. Also, legend=True is the default.
Per the matplotlib.pyplot.subplots docs, the returned ax is:
ax : axes.Axes object or array of Axes objects
Per pandas.DataFrame.plot, the ax argument is:
ax : matplotlib axes object, default None
Therefore, you first initialize the axes object (plt.subplots defaults to nrows=1 and ncols=1, so you get a single Axes rather than an array of them), and then hand it to the pandas plots through ax=ax. Normally, two plots drawn on the same ax would share one y-axis, but since you employ a secondary y-axis, the second plot is overlaid on the first with its own scale on the right:
# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(figsize=(8,4))
# ASSIGN AXES OBJECTS ACCORDINGLY
speeds_df.plot(ax=axs, x='datetime', y='down', grid=True, label="DL", linewidth=2, ylim=(100,225))
speeds_df.plot(ax=axs, x='datetime', y='up', secondary_y=True, label="UL", linewidth=2, ylim=(100,225))
plt.show()
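To make the roles of ax, y1 and y2 concrete, here is a small sketch that inspects what the two plot calls return. It assumes the three sample log lines from the question and a reasonably recent pandas; the right_ax attribute is the one the pandas visualization docs use to reach the secondary axis.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

SAMPLE = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=';',
                 names=['datetime', 'severity', 'down', 'up', 'loss', 'server'],
                 parse_dates=['datetime'])

fig, ax = plt.subplots()
y1 = df.plot(ax=ax, x='datetime', y='down', label='DL')
y2 = df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label='UL')

print(y1 is ax)           # the first call hands back the Axes it was given
print(y2 is ax)           # the second call hands back a different Axes (the right-hand one)
print(ax.right_ax is y2)  # pandas links the secondary axis to the primary one
```

So ax (and y1, which is the same object) is the left-hand axis, while y2 is the twin axis pandas creates for the right-hand scale. Keeping a reference to y2 is only useful if you want to style that right-hand axis later.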
To illustrate how axes objects can be extended, see below with multiple (non-overlaid) plots.
Example of multiple subplots using nrows=2:
# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(nrows=2, figsize=(8,4))
# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
plt.subplots_adjust(hspace = 1)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)
plt.show()
Example of multiple plots using ncols=2:
# INITIALIZE FIG DIMENSION AND AXES OBJECTS
fig, axs = plt.subplots(ncols=2, figsize=(12,4))
# ASSIGN AXES OBJECTS WITH INDEXING AND NO Y LIMITS
speeds_df.plot(ax=axs[0], x='datetime', y='down', grid=True, label="DL", linewidth=2)
speeds_df.plot(ax=axs[1], x='datetime', y='up', label="UL", linewidth=2)
plt.show()
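The indexing with axs[0] and axs[1] above works because plt.subplots returns a NumPy array of Axes as soon as more than one subplot is requested. A short sketch of the shapes involved (the variable names are only illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

fig1, ax_single = plt.subplots()                 # defaults: nrows=1, ncols=1
fig2, axs_rows = plt.subplots(nrows=2)           # 1-D array, index with axs_rows[i]
fig3, axs_grid = plt.subplots(nrows=2, ncols=2)  # 2-D array, index with axs_grid[r, c]
fig4, axs_raw = plt.subplots(squeeze=False)      # always 2-D, even for a single subplot

print(isinstance(ax_single, Axes))  # a bare Axes object, not an array
print(axs_rows.shape)
print(axs_grid.shape)
print(axs_raw.shape)
```

Passing squeeze=False can be handy when the number of subplots is computed at run time and the code should always be able to index the result.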
You can even use subplots=True after setting the date/time field as index:
# INITIALIZE FIG DIMENSION AND ONE AXES OBJECT PER COLUMN
fig, axs = plt.subplots(nrows=2, figsize=(8,4))
# ASSIGN AXES OBJECTS PLOTTING ALL COLUMNS (ONE SUBPLOT EACH)
speeds_df.set_index('datetime').plot(ax=axs, subplots=True, grid=True, linewidth=2)
plt.show()
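As a variation, pandas can also create the figure and axes itself when no ax is passed. A minimal sketch, again assuming the three sample log lines from the question:

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

SAMPLE = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=';',
                 names=['datetime', 'severity', 'down', 'up', 'loss', 'server'],
                 parse_dates=['datetime'])
speeds = df.set_index('datetime')[['down', 'up']]

# with subplots=True pandas returns one Axes per plotted column
axes = speeds.plot(subplots=True, grid=True, linewidth=2, figsize=(8, 4))
fig = axes[0].get_figure()  # the figure pandas created for all subplots

print(len(axes))
```

Each element of axes can then be styled individually, for example axes[1].set_ylabel('UL').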
Upvotes: 3
Reputation: 1151
So thanks to @Parfait I hope I understood things correctly. Here is the working code:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
###### Prepare the data to plot
# set field delimiter and set column names which will also cause reading from row 1
data = pd.read_csv('test.log', sep=';', names=[
'datetime', 'severity', 'down', 'up', 'loss', 'server'])
# we need to filter out ERROR records (with 0 speeds)
indexNames = data[data['severity'] == 'ERROR'].index
data.drop(indexNames, inplace=True)
# convert datetime pandas object to datetime64
data['datetime'] = pd.to_datetime(data['datetime'])
# use a dataframe with just the data I need; cleaner
speeds_df = data[['datetime', 'down', 'up']]
# now plot the graph
fig, ax = plt.subplots()
color = 'tab:green'
ax.set_xlabel('thislabeldoesnotworkbutcolordoes', color=color)
ax.tick_params(axis='x', labelcolor=color)
color = 'tab:red'
speeds_df.plot(ax=ax, x='datetime', y='down', label="DL", legend=True, linewidth=2, color=color)
ax.set_ylabel('DL', color=color)
ax.tick_params(axis='y', labelcolor=color)
color = 'tab:blue'
ax2 = speeds_df.plot(ax=ax, x='datetime', y='up', secondary_y=True, label="UL", legend=True, linewidth=2, color=color)
ax2.set_ylabel('UL', color=color)
ax2.tick_params(axis='y', labelcolor=color)
# using ylim in the plot command params does not work the same
# cannot show a grid since the two scales are different
ax.set_ylim(10, 225)
ax2.set_ylim(15, 50)
plt.show()
What I still don't get is:
a) why the x-axis label only seems to honour the color but not the string value :(
b) why the ylim=(n,m) parameter in the DataFrame plot call does not work as expected, so that I have to use ax.set_ylim on each axis instead
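A sketch that may help with both points, assuming the three sample log lines from the question. On a): DataFrame.plot sets the x label itself from the x column name, so a label text set before plotting gets replaced while the color given earlier stays; setting the label after the plot calls avoids that. On b): this sketch makes no claim about how pandas routes ylim internally; it simply sets the limits on each Axes explicitly, which is unambiguous. The label text 'time of test' is just a placeholder.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

SAMPLE = """\
2020-01-06 18:09:45;INFO;211.5;29.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-06 18:14:39;WARNING;209.9;28.1;0;host:spd-pub-rm-01-01.fastwebnet.it
2020-01-08 10:51:27;INFO;211.6;29.4;0;host:spd-pub-rm-01-01.fastwebnet.it
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=';',
                 names=['datetime', 'severity', 'down', 'up', 'loss', 'server'],
                 parse_dates=['datetime'])

fig, ax = plt.subplots()
ax.set_xlabel('set before plotting', color='tab:green')
df.plot(ax=ax, x='datetime', y='down', label='DL', color='tab:red')
label_after_plot = ax.get_xlabel()  # pandas has replaced the text with the x column name

ax2 = df.plot(ax=ax, x='datetime', y='up', secondary_y=True,
              label='UL', color='tab:blue')

# set the label text once all plot calls are done, so nothing overwrites it
ax.set_xlabel('time of test', color='tab:green')

# set each y range on the Axes it belongs to
ax.set_ylim(10, 225)   # left axis: download speeds
ax2.set_ylim(15, 50)   # right axis: upload speeds

print(label_after_plot)
```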
Upvotes: 0