Reputation: 692
I am trying to add a trendline to a bar plot created with Plotly.
Code:
import plotly.express as px
fig = px.bar(count, x="date", y="count",trendline="ols")
fig.update_layout(
xaxis_title="Date",
yaxis_title = "Count"
)
fig.show()
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-129-8b01de219d3c> in <module>
----> 1 fig = px.bar(count, x="date", y="count",trendline="ols")
2
3 fig.update_layout(
4 xaxis_title="Date",
5 yaxis_title = "Count"
TypeError: bar() got an unexpected keyword argument 'trendline'
Here is the data
How can I add a trendline successfully to this plot?
Upvotes: 4
Views: 15816
Reputation: 61084
px.bar has no trendline argument. Since you're trying trendline="ols", I'm guessing you'd like to create a linear trendline. And looking at your data, a linear trendline might not be the best description of your dataset:
So you'll have to add a trendline yourself. You can still have your bar chart using go.Bar, but maybe consider displaying the trendline as a line and not more bars.
A closer look at scikit-learn or statsmodels should be well worth your while regarding non-linear trends. One simple approach is to estimate a log-linear trend after recoding your dataset. You'll see that this captures the exponential increase of your variable better than a simple linear trend does:
But is that good enough? I'll leave that decision up to you. And as I've said, you should take a closer look at the linked resources.
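Spelled out, the log-linear approach fits a straight line to the logarithm of the counts and then transforms the fitted values back to the original scale. The notation here is only for illustration and is my own assumption: y_t is the count on day t, and a and b are the intercept and slope of the fitted line.

```latex
\log y_t = a + b\,t + \varepsilon_t
\quad\Longrightarrow\quad
\hat{y}_t = \exp\!\left(\hat{a} + \hat{b}\,t\right)
```

A constant slope on the log scale corresponds to a constant growth rate on the original scale, which is why this form can follow an exponential increase.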
Code for plot 1:
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go
import pandas as pd
import numpy as np
# data
df=pd.DataFrame({'date': {0: '12.10.2019',
1: '13.10.2019',
2: '14.10.2019',
3: '15.10.2019',
4: '16.10.2019',
5: '17.10.2019',
6: '18.10.2019',
7: '19.10.2019',
8: '20.10.2019',
9: '21.10.2019',
10: '22.10.2019',
11: '23.10.2019',
12: '24.10.2019',
13: '25.10.2019',
14: '26.10.2019',
15: '27.10.2019',
16: '28.10.2019',
17: '29.10.2019',
18: '30.10.2019',
19: '31.10.2019',
20: '01.11.2019',
21: '02.11.2019',
22: '03.11.2019',
23: '04.11.2019',
24: '05.11.2019',
25: '06.11.2019',
26: '07.11.2019',
27: '08.11.2019',
28: '09.11.2019',
29: '10.11.2019',
30: '11.11.2019',
31: '12.11.2019',
32: '13.11.2019',
33: '14.11.2019',
34: '15.11.2019',
35: '16.11.2019',
36: '17.11.2019',
37: '18.11.2019',
38: '19.11.2019',
39: '20.11.2019',
40: '21.11.2019',
41: '22.11.2019',
42: '23.11.2019',
43: '24.11.2019',
44: '25.11.2019',
45: '26.11.2019',
46: '27.11.2019',
47: '28.11.2019',
48: '29.11.2019',
49: '30.11.2019',
50: '01.12.2019',
51: '02.12.2019',
52: '03.12.2019',
53: '04.12.2019',
54: '05.12.2019',
55: '06.12.2019',
56: '07.12.2019',
57: '08.12.2019',
58: '09.12.2019',
59: '10.12.2019',
60: '11.12.2019',
61: '12.12.2019',
62: '13.12.2019',
63: '14.12.2019',
64: '15.12.2019',
65: '16.12.2019',
66: '17.12.2019',
67: '18.12.2019',
68: '19.12.2019',
69: '20.12.2019',
70: '21.12.2019',
71: '22.12.2019',
72: '23.12.2019',
73: '24.12.2019',
74: '25.12.2019',
75: '26.12.2019',
76: '27.12.2019',
77: '28.12.2019',
78: '29.12.2019',
79: '30.12.2019',
80: '31.12.2019',
81: '01.01.2020',
82: '02.01.2020',
83: '03.01.2020',
84: '04.01.2020',
85: '05.01.2020',
86: '06.01.2020',
87: '07.01.2020',
88: '08.01.2020',
89: '09.01.2020',
90: '10.01.2020',
91: '11.01.2020',
92: '12.01.2020',
93: '13.01.2020',
94: '14.01.2020',
95: '15.01.2020',
96: '16.01.2020',
97: '17.01.2020',
98: '18.01.2020',
99: '19.01.2020',
100: '20.01.2020',
101: '21.01.2020',
102: '22.01.2020',
103: '23.01.2020',
104: '24.01.2020',
105: '25.01.2020',
106: '26.01.2020',
107: '27.01.2020',
108: '28.01.2020',
109: '29.01.2020',
110: '30.01.2020',
111: '31.01.2020'},
'count': {0: 19,
1: 12,
2: 13,
3: 18,
4: 13,
5: 19,
6: 15,
7: 14,
8: 12,
9: 6,
10: 15,
11: 15,
12: 12,
13: 17,
14: 13,
15: 14,
16: 11,
17: 11,
18: 11,
19: 9,
20: 14,
21: 15,
22: 11,
23: 13,
24: 14,
25: 14,
26: 16,
27: 16,
28: 17,
29: 13,
30: 14,
31: 14,
32: 12,
33: 6,
34: 14,
35: 12,
36: 16,
37: 15,
38: 19,
39: 18,
40: 17,
41: 17,
42: 17,
43: 17,
44: 19,
45: 15,
46: 20,
47: 21,
48: 19,
49: 18,
50: 22,
51: 21,
52: 21,
53: 18,
54: 21,
55: 23,
56: 22,
57: 17,
58: 25,
59: 28,
60: 24,
61: 26,
62: 23,
63: 23,
64: 22,
65: 26,
66: 25,
67: 24,
68: 24,
69: 24,
70: 24,
71: 27,
72: 26,
73: 28,
74: 28,
75: 29,
76: 34,
77: 31,
78: 38,
79: 37,
80: 34,
81: 45,
82: 43,
83: 44,
84: 49,
85: 47,
86: 54,
87: 49,
88: 57,
89: 62,
90: 65,
91: 55,
92: 67,
93: 69,
94: 72,
95: 45,
96: 89,
97: 87,
98: 90,
99: 121,
100: 140,
101: 173,
102: 163,
103: 171,
104: 183,
105: 165,
106: 189,
107: 201,
108: 230,
109: 290,
110: 311,
111: 321}})
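Y=df['count']
# use the integer row index (0, 1, 2, ...) as the regressor
X=df.index
# regression; np.vstack(X) turns the index into the (n, 1) column array scikit-learn expects
reg = LinearRegression().fit(np.vstack(X), Y)
df['bestfit'] = reg.predict(np.vstack(X))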
# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Bar(name='X vs Y', x=X, y=Y.values))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))
# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')
fig.show()
Code for plot 2:
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go
import pandas as pd
import numpy as np
# data
df=pd.DataFrame({'date': {0: '12.10.2019',
1: '13.10.2019',
2: '14.10.2019',
3: '15.10.2019',
4: '16.10.2019',
5: '17.10.2019',
6: '18.10.2019',
7: '19.10.2019',
8: '20.10.2019',
9: '21.10.2019',
10: '22.10.2019',
11: '23.10.2019',
12: '24.10.2019',
13: '25.10.2019',
14: '26.10.2019',
15: '27.10.2019',
16: '28.10.2019',
17: '29.10.2019',
18: '30.10.2019',
19: '31.10.2019',
20: '01.11.2019',
21: '02.11.2019',
22: '03.11.2019',
23: '04.11.2019',
24: '05.11.2019',
25: '06.11.2019',
26: '07.11.2019',
27: '08.11.2019',
28: '09.11.2019',
29: '10.11.2019',
30: '11.11.2019',
31: '12.11.2019',
32: '13.11.2019',
33: '14.11.2019',
34: '15.11.2019',
35: '16.11.2019',
36: '17.11.2019',
37: '18.11.2019',
38: '19.11.2019',
39: '20.11.2019',
40: '21.11.2019',
41: '22.11.2019',
42: '23.11.2019',
43: '24.11.2019',
44: '25.11.2019',
45: '26.11.2019',
46: '27.11.2019',
47: '28.11.2019',
48: '29.11.2019',
49: '30.11.2019',
50: '01.12.2019',
51: '02.12.2019',
52: '03.12.2019',
53: '04.12.2019',
54: '05.12.2019',
55: '06.12.2019',
56: '07.12.2019',
57: '08.12.2019',
58: '09.12.2019',
59: '10.12.2019',
60: '11.12.2019',
61: '12.12.2019',
62: '13.12.2019',
63: '14.12.2019',
64: '15.12.2019',
65: '16.12.2019',
66: '17.12.2019',
67: '18.12.2019',
68: '19.12.2019',
69: '20.12.2019',
70: '21.12.2019',
71: '22.12.2019',
72: '23.12.2019',
73: '24.12.2019',
74: '25.12.2019',
75: '26.12.2019',
76: '27.12.2019',
77: '28.12.2019',
78: '29.12.2019',
79: '30.12.2019',
80: '31.12.2019',
81: '01.01.2020',
82: '02.01.2020',
83: '03.01.2020',
84: '04.01.2020',
85: '05.01.2020',
86: '06.01.2020',
87: '07.01.2020',
88: '08.01.2020',
89: '09.01.2020',
90: '10.01.2020',
91: '11.01.2020',
92: '12.01.2020',
93: '13.01.2020',
94: '14.01.2020',
95: '15.01.2020',
96: '16.01.2020',
97: '17.01.2020',
98: '18.01.2020',
99: '19.01.2020',
100: '20.01.2020',
101: '21.01.2020',
102: '22.01.2020',
103: '23.01.2020',
104: '24.01.2020',
105: '25.01.2020',
106: '26.01.2020',
107: '27.01.2020',
108: '28.01.2020',
109: '29.01.2020',
110: '30.01.2020',
111: '31.01.2020'},
'count': {0: 19,
1: 12,
2: 13,
3: 18,
4: 13,
5: 19,
6: 15,
7: 14,
8: 12,
9: 6,
10: 15,
11: 15,
12: 12,
13: 17,
14: 13,
15: 14,
16: 11,
17: 11,
18: 11,
19: 9,
20: 14,
21: 15,
22: 11,
23: 13,
24: 14,
25: 14,
26: 16,
27: 16,
28: 17,
29: 13,
30: 14,
31: 14,
32: 12,
33: 6,
34: 14,
35: 12,
36: 16,
37: 15,
38: 19,
39: 18,
40: 17,
41: 17,
42: 17,
43: 17,
44: 19,
45: 15,
46: 20,
47: 21,
48: 19,
49: 18,
50: 22,
51: 21,
52: 21,
53: 18,
54: 21,
55: 23,
56: 22,
57: 17,
58: 25,
59: 28,
60: 24,
61: 26,
62: 23,
63: 23,
64: 22,
65: 26,
66: 25,
67: 24,
68: 24,
69: 24,
70: 24,
71: 27,
72: 26,
73: 28,
74: 28,
75: 29,
76: 34,
77: 31,
78: 38,
79: 37,
80: 34,
81: 45,
82: 43,
83: 44,
84: 49,
85: 47,
86: 54,
87: 49,
88: 57,
89: 62,
90: 65,
91: 55,
92: 67,
93: 69,
94: 72,
95: 45,
96: 89,
97: 87,
98: 90,
99: 121,
100: 140,
101: 173,
102: 163,
103: 171,
104: 183,
105: 165,
106: 189,
107: 201,
108: 230,
109: 290,
110: 311,
111: 321}})
# log regression: fit a straight line to log(count)
df_log=pd.DataFrame({'X':df.index,
'Y': np.log(df['count'])})
df_log.set_index('X', inplace = True)
reg = LinearRegression().fit(np.vstack(df_log.index), df_log['Y'])
df_log['bestfit'] = reg.predict(np.vstack(df_log.index))
# back-transform the fitted values to the original scale
df_new=pd.DataFrame({'X':df.index,
'Y':df['count'],
'trend':np.exp(df_log['bestfit'])})
df_new.set_index('X', inplace=True)
# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Bar(name='X vs Y', x=df_new.index, y=df['count']))
fig.add_trace(go.Scatter(name='line of best fit', x=df_new.index, y=df_new['trend'], mode='lines'))
# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')
fig.show()
Upvotes: 8
Reputation: 368
Bar plots do not support trendline in plotly.express. I think the cleanest way to do it would be to fit a separate linear regression model using sklearn (Plotly: How to plot a regression line using plotly?).
Another (maybe easier) way would be to get the trendline from a scatter plot and add it to the bar chart.
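import plotly.express as px
import plotly.graph_objects as go
df = px.data.iris()  # example data; replace with your own DataFrame
fig = px.bar(df, x="sepal_width", y="sepal_length")
# helper scatter figure, only used to compute the OLS trendline (needs statsmodels)
help_fig = px.scatter(df, x="sepal_width", y="sepal_length", trendline="ols")
x_trend = help_fig["data"][1]['x']
y_trend = help_fig["data"][1]['y']
fig.add_trace(go.Scatter(x=x_trend, y=y_trend, mode="lines", name="trendline"))
fig.show()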
I hope I could help you,
Best wishes
Sören
Upvotes: 2