Eli S

Reputation: 1463

Time series plot of categorical or binary variables in pandas or matplotlib

I have data representing a time series of categorical variables. I want to display the category transitions below a traditional line plot of a related continuous time series, to give context as time evolves, and I'd like to know the best way to do this. My attempt uses Rectangle patches. The result looks a bit odd and, more importantly, the x-axis tick labels don't render as dates.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
from pandas.plotting import register_matplotlib_converters
import matplotlib.dates as mdates
register_matplotlib_converters()

t0 = pd.DatetimeIndex(["2017-06-01 00:00","2017-06-17 00:00","2017-07-03 00:00","2017-08-02 00:00","2017-08-09 00:00","2017-09-01 00:00"])
t1 = pd.DatetimeIndex(["2017-06-01 00:00","2017-08-15 00:00","2017-09-01 00:00"])
df0 = pd.DataFrame({"cat":[0,2,1,2,0,1]},index = t0)
df1 =  pd.DataFrame({"op":[0,1,0]},index=t1)

# Create new plot
fig,ax = plt.subplots(1,figsize=(8,3))

data_layout = {
  "cat" : {0: ('bisque','Low'),
           1: ('lightseagreen','Medium'),
           2: ('rebeccapurple','High')},
  "op" : {0: ('darkturquoise','Open'),
           1: ('tomato','Close')}
           }


vars =("cat","op")


dfs = [df0,df1]

all_ticks = []
leg = []

for j,(v,d) in enumerate(zip(vars,dfs)):
    dvals = d[v][:].astype("d")
    normal = mpl.colors.Normalize(vmin=0, vmax=2.)    

    colors = plt.cm.Set1(0.75*normal(dvals.to_numpy()))  # as_matrix() was removed in pandas 1.0
    handles = []
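    for i in range(len(d)-1):  # one rectangle per interval between consecutive timestamps
        s = d[v].index.to_pydatetime()
        level = d[v].iloc[i]  # positional lookup; d[v][i] is ambiguous on a DatetimeIndex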
        base = d[v].index[i]
        w = s[i+1] - s[i]
        patch=mpl.patches.Rectangle((base,float(j)),width=w,color=data_layout[v][level][0],height=1,fill=True)
        ax.add_patch(patch)

    for lev in data_layout[v]:    
        print(data_layout[v][lev])
        handles.append(mpl.patches.Patch(color=data_layout[v][lev][0],label=data_layout[v][lev][1]))
    all_ticks.append(j+0.5)

    leg.append( plt.legend(handles=handles,loc = (3-3*j+1)))

plt.axhline(y=1.,linewidth=3,color="gray")
plt.xlim(pd.Timestamp(2017,6,1).to_pydatetime(),pd.Timestamp(2017,9,1).to_pydatetime())
plt.ylim(0,2)
ax.add_artist(leg[0])  # two legends on one axis
ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')   # This fails
plt.yticks(all_ticks,vars)
plt.show()

This produces a plot with no dates on the x axis and with jittery lines (image: mock up). How do I fix this? Is there a better way entirely?

Upvotes: 0

Views: 1598

Answers (1)

Francesco Zambolin

Reputation: 601

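To display dates on the x-axis, replace the failing line in your code with the one below. (ax.format_xdata only controls how the x coordinate is shown in the interactive cursor readout; it does not affect the tick labels.)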

ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

I don't remember what the result should look like, though. Can you show us the end result again?
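
On the "is there a better way" part of the question, one possible alternative (a sketch, not necessarily what either poster had in mind) is to draw each variable as a row of horizontal bars with ax.broken_barh instead of adding Rectangle patches by hand. The version below reuses the question's timestamps and colors and assumes, as the question's loop does, that each category holds until the next timestamp and that the last timestamp only closes the final interval. The helper name category_spans and the dict layout are made up for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

t0 = pd.DatetimeIndex(["2017-06-01", "2017-06-17", "2017-07-03",
                       "2017-08-02", "2017-08-09", "2017-09-01"])
t1 = pd.DatetimeIndex(["2017-06-01", "2017-08-15", "2017-09-01"])

# Same data and colors as in the question, keyed by variable name.
series = {
    "cat": pd.Series([0, 2, 1, 2, 0, 1], index=t0),
    "op": pd.Series([0, 1, 0], index=t1),
}
palette = {
    "cat": {0: "bisque", 1: "lightseagreen", 2: "rebeccapurple"},
    "op": {0: "darkturquoise", 1: "tomato"},
}


def category_spans(s):
    """Hypothetical helper: (start, width) pairs in matplotlib date numbers,
    one pair per interval between consecutive timestamps."""
    x = mdates.date2num(s.index.to_pydatetime())
    return list(zip(x[:-1], x[1:] - x[:-1]))


fig, ax = plt.subplots(figsize=(8, 3))
for row, (name, s) in enumerate(series.items()):
    spans = category_spans(s)
    # The last value has no following timestamp, so it gets no bar.
    colors = [palette[name][int(level)] for level in s.iloc[:-1]]
    ax.broken_barh(spans, (row, 1), facecolors=colors)

ax.xaxis_date()  # treat the x values as dates
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax.set_yticks([0.5, 1.5])
ax.set_yticklabels(list(series))
ax.set_ylim(0, 2)
fig.autofmt_xdate()  # rotate the date labels so they don't overlap
```

Converting the timestamps with mdates.date2num up front means the bars are plain floats, and ax.xaxis_date() plus the DateFormatter then take care of the tick labels. Call plt.show() or fig.savefig(...) at the end to see the result.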

Upvotes: 1
