Florent Georges

Reputation: 2327

Several lines on the same diagram with Pandas plot() grouping

I have a CSV with 3 data sets, each corresponding to a line to plot. I use the by argument of Pandas plot() to group the entries for the 3 lines. This generates 3 separate diagrams, but I would like to plot all 3 lines on the same diagram.

The CSV:

shop,timestamp,sales
north,2023-01-01,235
north,2023-01-02,147
north,2023-01-03,387
north,2023-01-04,367
north,2023-01-05,197
south,2023-01-01,235
south,2023-01-02,98
south,2023-01-03,435
south,2023-01-04,246
south,2023-01-05,273
east,2023-01-01,197
east,2023-01-02,389
east,2023-01-03,87
east,2023-01-04,179
east,2023-01-05,298

The code (tested in Jupyter Lab):

import pandas as pd

csv = pd.read_csv('./tmp/sample.csv')
csv.timestamp = pd.to_datetime(csv.timestamp)

csv.plot(x='timestamp', by='shop')

This gives the following:

[image: result, 3 separate diagrams]

Any idea how to render all 3 on a single diagram?

Upvotes: 0

Views: 58

Answers (3)

semmyk-research

Reputation: 389

Plot every group onto the same axes using the ax keyword: df_csv.groupby('shop').plot(x='timestamp', ax=plt.gca())

Working code below.

## load libraries
import pandas as pd
import matplotlib.pyplot as plt

## load dataset
df_csv = pd.read_csv('datasets/SO_shop_timestamp_sale.csv')

## check dataset
df_csv.head(3)
df_csv.describe()
df_csv.shape

## ensure data type
df_csv.timestamp = pd.to_datetime(df_csv.timestamp)
df_csv.sales = pd.to_numeric(df_csv.sales)

## Pandas plot of sales against timestamp grouped by shop; passing `ax` draws every group on the same axes.
df_csv.groupby('shop').plot(x='timestamp', ax=plt.gca())

## Pandas KDE (density) plot of sales grouped by shop, again drawn on shared axes via `ax`.
## Run this in a separate cell (or after plt.figure()) so it does not overlay the line plot above.
df_csv.groupby('shop').plot(x='timestamp', kind='kde', ax=plt.gca())

[image: sales over timestamp, grouped by shop]

[image: plot of shops (east, north, south)]
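
As a variation on the groupby approach, here is a minimal sketch that reshapes the data so each shop becomes its own column, which lets a single plot() call draw one labelled line per shop. It assumes the same column names as the question (shop, timestamp, sales); the inline CSV_TEXT string and the variable names df and wide are illustrative only, and the data is an abbreviated copy of the question's CSV.

```python
import io

import pandas as pd

# Abbreviated copy of the question's CSV (first two days per shop), inlined for illustration.
CSV_TEXT = """shop,timestamp,sales
north,2023-01-01,235
north,2023-01-02,147
south,2023-01-01,235
south,2023-01-02,98
east,2023-01-01,197
east,2023-01-02,389
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["timestamp"])

# Reshape from long to wide: one row per timestamp, one column per shop.
wide = df.pivot(index="timestamp", columns="shop", values="sales")

# A wide DataFrame plots every column as a separate line on the same axes,
# and the legend entries are the shop names.
ax = wide.plot(title="Sales")
```

With this layout the legend shows the shop names, whereas plotting each group's sales column separately labels every line "sales".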

Upvotes: 0

semmyk-research

Reputation: 389

[Seaborn alternative to the native pandas DataFrame.plot answer]
This is posted as a separate answer for clarity, rather than lumping the two together.
Seaborn plots the sales per shop (designated by the hue) against the timestamp (with one tick per day). The code reuses df_csv from my other answer.

## import seaborn
import seaborn as sns
## date locators and formatters
import matplotlib.dates as mdates


## plot timestamp on the horizontal axis (ticks set to days below), sales on the vertical axis
## with hue set to shop, seaborn draws one line of sales per shop
ax = sns.lineplot(data=df_csv, x='timestamp', y='sales', hue='shop')

## put one major tick per day. Do this AFTER `ax` has been created
ax.xaxis.set_major_locator(locator=mdates.DayLocator())
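
The locator above only decides where the ticks go. If the tick labels should also be shortened, a formatter can be set on the same axis. The following is a minimal sketch using plain matplotlib so that it runs on its own; the north_sales list is copied from the question's CSV, and the "%d %b" format string is just an assumed choice of layout.

```python
import datetime as dt

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# The five days and the "north" sales figures from the question's CSV.
days = [dt.date(2023, 1, d) for d in range(1, 6)]
north_sales = [235, 147, 387, 367, 197]

fig, ax = plt.subplots()
ax.plot(days, north_sales, label="north")

# One major tick per day, labelled as day and abbreviated month.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d %b"))

# Rotate the date labels so they do not overlap.
fig.autofmt_xdate()
ax.legend()
```

The same two set_major_* calls should apply unchanged to the ax returned by sns.lineplot, since it is a regular matplotlib Axes.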


Upvotes: 1

Corralien

Reputation: 120509

You can create the axes manually and plot each group on them:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
for name, df in csv.groupby('shop'):
    df.plot(x='timestamp', y='sales', label=name, ax=ax)
ax.set_title('Sales')
plt.show()


Upvotes: 1
