Reputation: 189
Given the following pandas DataFrame, what is the idiomatic way to display a time series plot with discrete time buckets for each name category?
| Name | Type |
|---|---|
| time | datetime64 |
| name | object |
| value | float64 |
My current solution requires a loop that extracts a DataFrame and computes a max aggregation for each name category:
import pandas as pd
import matplotlib.pyplot as plt

raw_df = pd.read_csv('...')
raw_df['time'] = raw_df['time'].astype('datetime64[ns]')

names = raw_df['name'].unique()
names.sort()

fig, ax = plt.subplots()
for name in names:
    # One 10-minute max series per name, all drawn on the same axes
    df = raw_df.query(f'name == "{name}"').set_index('time').resample('10T').max()
    df.plot(ax=ax, label=name)
Upvotes: 0
Views: 471
Reputation: 8654
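You could pivot the frame so that each name becomes its own column, then resample and aggregate all columns at once: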
df.pivot(index='time', values='value', columns='name').resample('10Y').max().plot(subplots=True)
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'name': np.append(np.repeat('a', 50), np.repeat('b', 50)),
    'time': pd.to_datetime(2 * pd.date_range(start='1950-01-01', periods=50, freq='Y').values.tolist()),
    'value': np.append(np.cumsum(np.random.lognormal(0, 1, 50)), np.cumsum(np.random.lognormal(0, 1, 50)))
})
df.pivot(index='time', values='value', columns='name').resample('10Y').max().plot(subplots=True)
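One caveat: `pivot` raises a `ValueError` if the same (time, name) pair appears more than once. If that can happen in your data, a possible alternative is to group by name and a time bucket in one step with `pd.Grouper`, then `unstack` the names into columns. The following is only a sketch under assumptions: the sample data is made up, and the `10min` bucket size is taken from the question rather than from the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two names, one reading per minute for an hour.
rng = np.random.default_rng(0)
times = pd.date_range("2024-01-01 00:00", periods=60, freq="min")
df = pd.DataFrame({
    "time": np.tile(times, 2),
    "name": np.repeat(["a", "b"], len(times)),
    "value": rng.lognormal(0, 1, 2 * len(times)),
})

# Group by name and 10-minute bucket, take the max per group,
# then move the names into columns (one column per name).
bucketed = (
    df.groupby(["name", pd.Grouper(key="time", freq="10min")])["value"]
      .max()
      .unstack("name")
)

# bucketed.plot()  # one line per name on a single Axes (requires matplotlib)
```

Here `bucketed` has one row per 10-minute bucket and one column per name, so a plain `.plot()` should put all names on the same axes, while `.plot(subplots=True)` gives one panel per name as above.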
See also this answer.
Upvotes: 1