Elliot

Reputation: 189

Pandas Multiple Time Series Plots Single Data Frame

Given the following pandas DataFrame, what is the idiomatic way to display a time series plot with discrete time buckets for each name category?

Column  Dtype
time    datetime64
name    object
value   float64

My current solution requires a loop that extracts a DataFrame and computes a max aggregation for each name category.

import pandas as pd
import matplotlib.pyplot as plt

raw_df = pd.read_csv('...')
raw_df['time'] = raw_df['time'].astype('datetime64[ns]')

names = raw_df['name'].unique()
names.sort()

fig, ax = plt.subplots()

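for name in names:
    # Select the value column so each line is a Series labelled with its name
    series = raw_df.query(f'name == "{name}"').set_index('time')['value'].resample('10min').max()
    series.plot(ax=ax, label=name)

ax.legend()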

Upvotes: 0

Views: 471

Answers (1)

user11989081

Reputation: 8654

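You could pivot the names into columns, resample, and plot each column as its own subplot. Replace '10Y' with the bucket size you need, e.g. '10T' for the 10-minute buckets in the question: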

df.pivot(index='time', values='value', columns='name').resample('10Y').max().plot(subplots=True)

Example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'name': np.append(np.repeat('a', 50), np.repeat('b', 50)),
    'time': pd.to_datetime(2 * pd.date_range(start='1950-01-01', periods=50, freq='Y').values.tolist()),
    'value': np.append(np.cumsum(np.random.lognormal(0, 1, 50)), np.cumsum(np.random.lognormal(0, 1, 50)))
})

df.pivot(index='time', values='value', columns='name').resample('10Y').max().plot(subplots=True)
plt.show()


See also this answer.
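
One caveat with the pivot approach: `pivot` raises a ValueError if the same (time, name) pair occurs more than once. If that can happen in your data, a possible alternative is to group by name and a time bucket directly. The following is only a sketch under assumptions: the sample data is made up for illustration, and the column names `time`, `name`, and `value` are taken from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data: two names, one reading per minute for an hour
rng = np.random.default_rng(0)
times = pd.date_range('2021-01-01', periods=60, freq='min')
raw_df = pd.DataFrame({
    'time': times.tolist() * 2,
    'name': np.repeat(['a', 'b'], len(times)),
    'value': rng.lognormal(0, 1, 2 * len(times)).cumsum(),
})

# Bucket into 10-minute bins per name, take the max of each bucket,
# then move the names into columns (one column per name)
wide = (
    raw_df
    .groupby(['name', pd.Grouper(key='time', freq='10min')])['value']
    .max()
    .unstack('name')
)

# One subplot per name; use wide.plot() instead to draw all names on one axes
axes = wide.plot(subplots=True)
```

Because the aggregation happens before the reshape, duplicate timestamps within a name are collapsed by `max()` instead of causing an error.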

Upvotes: 1
