Emre

Reputation: 6226

Faceted plots of a multi-indexed DataFrame

How can I plot the time series for each of the three channels (inapp, email, push), with hue varying by 'enabled', using pandas and seaborn? Note that the columns are a MultiIndex. I want the plots to share the y axis and to have a single common legend indicating the value of 'enabled'.

|---------|---------------|--------------|---------------|
| channel | inapp         | email        | push          |
| enabled | true  | false | false | true | false | true  |
|---------|-------|-------|-------|------|-------|-------|
| 0       | 0     | 80    | 28    | 0    | 5     | 0     |
| 1       | 2     | 80    | 28    | 3    | 5     | 233   |
| 2       | 4     | 80    | 28    | 7    | 5     | 587   |
| 3       | 5     | 80    | 28    | 12   | 5     | 882   |
| 4       | 7     | 86    | 28    | 16   | 5     | 1292  |
|---------|-------|-------|-------|------|-------|-------|

Upvotes: 2

Views: 1678

Answers (2)

andrew_reece

Reputation: 21264

Here's another way, using Paul H's .stack() approach (although I also couldn't figure it out with FacetGrid):

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

enabled = [True, False]
channel = ['inapp', 'email', 'push']
values = [0,2,4,5,7,80,80,80,80,86,28,28,28,28,28,
          0,3,7,12,16,5,5,5,5,5,0,233,587,882,1292]
values = np.array(values).reshape((5,6), order='F')

columns = pd.MultiIndex.from_product([channel,enabled], names=("channel","enabled"))
df = pd.DataFrame(values, columns=columns)

fig, ax = plt.subplots(1,3,sharey=True)

for i, (key, group) in enumerate(df.stack(level='channel').reset_index(level=1).groupby('channel')):
    group.plot(label=key, title=key, ax=ax[i])

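UPDATE:
Here's a more compact version, using unstack() and catplot() (called factorplot() before seaborn 0.9).
The rename line is only there for plot clarity; it can be removed if you refer to the default column names 'level_2' and 0 instead.

import seaborn as sns

# With a flat row index, unstack() moves every column level into the index,
# giving a Series indexed by (channel, enabled, row number).
df = (df.unstack()
        .reset_index()
        .rename(columns={'level_2':'time',0:'value'})
)
sns.catplot(data=df, x='time', y='value', hue='enabled', col='channel', kind='point')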

timeseries plot

Upvotes: 1

andrew_reece

Reputation: 21264

Seaborn may not be necessary.
Here's code that builds the data frame you've specified:

import numpy as np
import pandas as pd

enabled = [True, False]
channel = ['inapp', 'email', 'push']
values = [0,2,4,5,7,80,80,80,80,86,28,28,28,28,28,
          0,3,7,12,16,5,5,5,5,5,0,233,587,882,1292]
values = np.array(values).reshape((5,6), order='F')

columns = pd.MultiIndex.from_product([channel,enabled], 
                                     names=("channel","enabled"))
df = pd.DataFrame(values, columns=columns)

channel inapp       email        push      
enabled True  False True  False True  False
0           0    80    28     0     5     0
1           2    80    28     3     5   233
2           4    80    28     7     5   587
3           5    80    28    12     5   882
4           7    86    28    16     5  1292  

Assuming the time axis you're referring to is the index (values 0-4), and that creating the subplots with pyplot is acceptable, the following code should meet your specifications:

from matplotlib import pyplot as plt  

fig, ax = plt.subplots(1, 3, sharey=True)
for i, col in enumerate(channel):
    df.T.xs(col).T.plot(ax=ax[i], xticks=df.index, title=col)

panel plot

Granted, the transposition is a bit gymnastic. There may be a Pandas-fu way to achieve the same effect using groupby(), but I played around with it a bit and couldn't figure out a solution. Hope this helps.
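The transposition can probably be avoided: indexing MultiIndex columns with a top-level key, as in df[col], returns the sub-frame for that channel with only the 'enabled' level left as columns. A minimal sketch along those lines follows; the single figure-level legend built with fig.legend() is an addition here to cover the "common legend" request, not part of the code above.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Rebuild the frame used in this answer.
enabled = [True, False]
channel = ['inapp', 'email', 'push']
values = [0,2,4,5,7,80,80,80,80,86,28,28,28,28,28,
          0,3,7,12,16,5,5,5,5,5,0,233,587,882,1292]
values = np.array(values).reshape((5, 6), order='F')
columns = pd.MultiIndex.from_product([channel, enabled],
                                     names=("channel", "enabled"))
df = pd.DataFrame(values, columns=columns)

fig, ax = plt.subplots(1, 3, sharey=True)
for i, col in enumerate(channel):
    # df[col] selects one channel; its columns are the 'enabled' values.
    df[col].plot(ax=ax[i], xticks=df.index, title=col, legend=False)

# One legend for the whole figure, taken from the first panel's lines.
handles, labels = ax[0].get_legend_handles_labels()
fig.legend(handles, labels, title='enabled', loc='upper right')
```

df.xs(col, axis=1, level='channel') selects the same sub-frame and makes the level being sliced explicit.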

Upvotes: 1
