Reputation: 197
I'm trying to plot a line graph with Plotly in Streamlit, with the data grouped by month.
I have this data:
id date grade
A 2020-01-01 100
B 2020-01-11 200
C 2020-01-21 500
D 2020-01-21 300
E 2020-02-01 100
F 2020-02-01 200
I want to plot the mean grade per month. This is what I have so far:
df["date"] = pd.to_datetime(df["date"])  # dates are in YYYY-MM-DD form
df = df.set_index("date")
df_grouped = df.groupby(
by=[df.index.year, df.index.month]
)
df_grouped.grade.mean().plot()
The problem is that I don't know how to do the same thing (or something close to it) using Plotly in Streamlit.
Any ideas?
Upvotes: 2
Views: 1137
Reputation: 61184
You haven't specified exactly what kind of chart you'd like, but I'm guessing a go.Bar() will do, and we can change that. For grouping and aggregating I would use the following approach:
df['date'] = pd.to_datetime(df["date"])
df['months'] = df['date'].dt.month_name()
# Average only the numeric 'grade' column; 'id' holds strings and cannot be averaged
df_months = df.groupby('months')['grade'].mean().reset_index()
new_order = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
df_months['months'] = pd.Categorical(df_months['months'], categories=new_order, ordered=True)
df_months = df_months.sort_values('months')
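The last three lines are what keep the months in calendar order. As a quick illustration (a minimal sketch with made-up values; the `demo` frame and its `value` column are hypothetical and not part of the question's data), sorting month names as plain strings is alphabetical, while an ordered categorical sorts by calendar position:

```python
import pandas as pd

month_order = ['January', 'February', 'March', 'April', 'May', 'June',
               'July', 'August', 'September', 'October', 'November', 'December']

# Hypothetical values, used only to show the sort order
demo = pd.DataFrame({'months': ['February', 'January', 'March'],
                     'value': [2, 1, 3]})

# Plain strings sort alphabetically, so February comes before January
alphabetical = demo.sort_values('months')['months'].tolist()

# An ordered categorical sorts by position in month_order instead
demo['months'] = pd.Categorical(demo['months'], categories=month_order, ordered=True)
calendar = demo.sort_values('months')['months'].tolist()

print(alphabetical)
print(calendar)
```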
Why so complicated? Because I'm assuming you'd like to have month names on your x-axis. After grouping and aggregating values by month name, the months come back in alphabetical order rather than calendar order. The somewhat laborious approach above makes sure that doesn't happen, so that you end up with a bar chart with the months in the correct order:
import pandas as pd
import plotly.graph_objects as go
df = pd.DataFrame({'id': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E', 5: 'F'},
'date': {0: '2020-01-01',
1: '2020-01-11',
2: '2020-01-21',
3: '2020-01-21',
4: '2020-02-01',
5: '2020-02-01'},
'grade': {0: 100, 1: 200, 2: 500, 3: 300, 4: 100, 5: 200}})
# Parse the date strings and derive a month-name column
df['date'] = pd.to_datetime(df["date"])
df['months'] = df['date'].dt.month_name()
# Average only the numeric 'grade' column; 'id' holds strings and cannot be averaged
df_months = df.groupby('months')['grade'].mean().reset_index()
# Restore calendar order, since groupby sorts month names alphabetically
new_order = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
df_months['months'] = pd.Categorical(df_months['months'], categories=new_order, ordered=True)
df_months = df_months.sort_values('months')
fig = go.Figure(go.Bar(x=df_months.months, y=df_months.grade))
fig.show()
Upvotes: 1