Reputation: 71
I have a table in a pandas DataFrame (Python), and I want to plot one column over a time column, with one labeled line per value of another column.
For example:
fruit | fruit_count | datestamp
apple | 20 | 03-2018
kiwi | 10 | 03-2018
mango | 35 | 03-2018
apple | 16 | 04-2018
kiwi | 18 | 04-2018
mango | 40 | 04-2018
. . .
. . .
apple | 50 | 03-2020
kiwi | 70 | 03-2020
mango | 120 | 03-2020
Basically it would be one plot where the x-axis is the datestamp (03-2018, 04-2018, ..., 03-2020), with 3 lines, one each for apple, kiwi, and mango, and 3 corresponding labels.
Currently, I do it by first extracting the unique fruit names from the dataframe
fruits = list(set(fruit_df['fruit'].tolist()))
and then I loop through and plot each one
for fruit in fruits:
    fruit_df[fruit_df['fruit'] == fruit].plot(x='datestamp', y='fruit_count')
Is there a better way to do this, ideally in one line, that plots everything on one graph instead of 3 separate ones?
Upvotes: 2
Views: 6796
Reputation: 1267
If the combination of datestamp and fruit is non-unique:
fruit_df.groupby(['datestamp', 'fruit'])['fruit_count'].sum().unstack().plot(kind='bar')
If the combination is unique, this should also work:
fruit_df.pivot(index='datestamp', columns='fruit', values='fruit_count').plot(kind='bar')
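To illustrate why the unique/non-unique distinction matters, here is a minimal sketch on made-up data (the rows below are hypothetical, not from the question) in which apple appears twice for 03-2018. As far as I know, pivot raises a ValueError on such duplicates, while groupby with sum collapses them before unstack reshapes the result.

```python
import pandas as pd

# Hypothetical sample: ("03-2018", "apple") appears twice
fruit_df = pd.DataFrame({
    "fruit": ["apple", "apple", "kiwi", "apple", "kiwi"],
    "fruit_count": [20, 5, 10, 16, 18],
    "datestamp": ["03-2018", "03-2018", "03-2018", "04-2018", "04-2018"],
})

# pivot cannot reshape when an index/column pair is repeated
try:
    fruit_df.pivot(index="datestamp", columns="fruit", values="fruit_count")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# groupby + sum merges the duplicates; unstack then gives one column per fruit
wide = fruit_df.groupby(["datestamp", "fruit"])["fruit_count"].sum().unstack()
```

Calling wide.plot() (or wide.plot(kind='bar')) on the result draws one series per fruit on a single set of axes. Summing is only one way to resolve duplicates; whether it is the right one depends on what the repeated rows mean in your data.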
Upvotes: 0
Reputation: 59569
You have a few options. If you really want a one-line solution you'll want seaborn, or to reshape your data using pivot.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
N = 20
df = pd.DataFrame({'fruit': ['apple', 'kiwi', 'mango']*N,
                   'date_stamp': np.repeat(pd.date_range('2010-01-01', freq='1M', periods=N), 3),
                   'fruit_count': np.random.randint(1,100, N*3)})
With seaborn, you use hue to specify the groups.
sns.lineplot(data=df, hue='fruit', x='date_stamp', y='fruit_count')
Similar to your current implementation, but you can use groupby to split into the sub-frames and draw them all on the same axes.
fig, ax = plt.subplots()
for fruit, gp in df.groupby('fruit'):
    gp.plot(x='date_stamp', y='fruit_count', ax=ax, label=fruit)
Or pivot before plotting, so that you only need a single plot call:
df.pivot(index='date_stamp', columns='fruit', values='fruit_count').plot()
The axes and labeling differ slightly between methods. This is the groupby output.
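One thing to watch with the question's data: the sample above uses real dates, but the question's datestamp column looks like MM-YYYY strings. A minimal sketch, assuming that format and using made-up rows, shows that such strings sort lexicographically (03-2020 lands before 04-2018), so it is probably worth parsing them before pivoting.

```python
import pandas as pd

# Hypothetical rows in the question's MM-YYYY string format
fruit_df = pd.DataFrame({
    "fruit": ["apple", "kiwi"] * 3,
    "fruit_count": [20, 10, 16, 18, 50, 70],
    "datestamp": ["03-2018", "03-2018", "04-2018", "04-2018", "03-2020", "03-2020"],
})

# As plain strings, "03-2020" sorts before "04-2018"
string_order = sorted(fruit_df["datestamp"].unique())

# Parse the strings into timestamps so the index is chronological
fruit_df["datestamp"] = pd.to_datetime(fruit_df["datestamp"], format="%m-%Y")

# One column per fruit, indexed by date
wide = fruit_df.pivot(index="datestamp", columns="fruit", values="fruit_count")
```

After this, wide.plot() should put the months in chronological order on the x-axis, with one labeled line per fruit.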
Upvotes: 2