Reputation: 133
I have a line chart showing the number of incidents per year, with a second line showing the average number of incidents. 2020 was particularly busy, so I'd like to add lines at one and two standard deviations from the mean, to visualise how many standard deviations above the average 2020 was.
Below is my code, but it fails to plot:
y = year_group['Count']
x = year_group['year']
y_mean = [np.mean(y)]*len(x)
y_std = np.std(y)
fig,ax = plt.subplots()
data_line = ax.plot(x,y, label='Data', marker='o')
mean_line = ax.plot(x,y_mean, label='Mean', linestyle='--')
std_line = ax.plot(x,y_std,label = 'Standard deviation')
plt.show()
This is the error:
ValueError: x and y must have same first dimension, but have shapes (8,) and (1,)
I understand this error but I wanted to know how to achieve my goals. This is my current output:
Upvotes: 1
Views: 1379
Reputation: 908
You can use hlines. Here's some sample code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
y = np.random.randint(40, 60, 61)
x = pd.date_range(start='2021-03-01', end='2021-04-30', freq='D')
y_mean = np.mean(y)
y_std = np.std(y)
fig,ax = plt.subplots()
data_line = ax.plot(x,y, label='Data', marker='o')
mean_line = ax.hlines(y=y_mean, xmin=x[0], xmax=x[-1], colors='green', linestyles='--', lw=2, label='Mean')
std_lines = ax.hlines(y=[y_mean-y_std, y_mean+y_std], xmin=x[0], xmax=x[-1], colors='red', linestyles='--', lw=2, label='Std Dev')
ax.legend()  # show the labels set above
plt.show()
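Edit: Expanding a bit, your code is missing two pieces. First, the standard deviation must be repeated to match the length of x, as you did for the mean:
y_std = [np.std(y)]*len(x)
Second, the lines to plot are the mean plus or minus n standard deviations (n = 1, 2), not the standard deviation itself:
y_upper = [np.mean(y) + n*np.std(y)]*len(x)
y_lower = [np.mean(y) - n*np.std(y)]*len(x)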
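To tie this back to the yearly chart in the question, here is a minimal sketch that draws the mean with lines at ±1 and ±2 standard deviations and computes how many standard deviations 2020 sits above the mean. The `years` and `counts` arrays are made-up placeholders standing in for `year_group['year']` and `year_group['Count']`, and the helper names `std_levels` and `plot_with_std_lines` are hypothetical. It uses `ax.axhline`, which spans the full width of the axes, instead of `hlines`; either should work here.

```python
import numpy as np
import matplotlib.pyplot as plt


def std_levels(y, n_levels=(1, 2)):
    """Return the mean, the std, and {n: (mean - n*std, mean + n*std)}."""
    y = np.asarray(y, dtype=float)
    mean, std = y.mean(), y.std()  # population std (ddof=0), same default as np.std
    return mean, std, {n: (mean - n * std, mean + n * std) for n in n_levels}


def plot_with_std_lines(x, y, n_levels=(1, 2)):
    """Plot the data, the mean, and horizontal lines at mean ± n*std."""
    mean, std, levels = std_levels(y, n_levels)
    fig, ax = plt.subplots()
    ax.plot(x, y, label="Data", marker="o")
    ax.axhline(mean, color="green", linestyle="--", label="Mean")
    for n, (lower, upper) in levels.items():
        # Label only the upper line so the legend gets one entry per level;
        # lines further from the mean are drawn fainter.
        ax.axhline(upper, color="red", linestyle=":", alpha=1 / n,
                   label=f"Mean ± {n} std")
        ax.axhline(lower, color="red", linestyle=":", alpha=1 / n)
    ax.legend()
    return fig, ax


# Placeholder data (NOT the asker's real numbers): eight years, one busy year.
years = np.arange(2014, 2022)
counts = np.array([12, 15, 11, 14, 13, 16, 30, 14])

fig, ax = plot_with_std_lines(years, counts)

# Number of standard deviations 2020 lies above the mean (its z-score).
z_2020 = (counts[years == 2020][0] - counts.mean()) / counts.std()
```

Call `plt.show()` afterwards to display the figure. If a sample standard deviation is preferred, `np.std(y, ddof=1)` could be swapped in for `y.std()`.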
Upvotes: 2