Travis Black

Reputation: 715

seaborn, how to plot by columns, without including missing data

I have a dataframe with the columns 'time_period', 'teamID', and 'winrate'.

I am trying to plot the relationship between teamID and winrate, for each time period.

I do this with:

g = sns.catplot(x="teamID", y="winrate", kind="bar", col='time_period', 
col_wrap=1, data=df, height=5, aspect=2.5)

This works great, but here is the problem: each time period does not necessarily contain the same teams, yet each output graph still uses every team in the dataframe as a label on the X axis. For each period, my plot has a bunch of barless positions on the X axis because those teams don't exist within that period.

Is there a way to make each period's plot only show the teams that are applicable to that period?

Thank you.

Upvotes: 0

Views: 290

Answers (1)

Stef

Reputation: 30589

I don't think this is possible. Catplot was designed to compare data across categories, so displaying the gaps is the intended behavior. Instead, you could make regular subplots like this:

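import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'teamID': [1,2,3,2,3,4], 'time_period': [2018,2018,2018,2019,2019,2019], 'winrate': [.8, .7, .9, .85, .8, .95]})
grp = df.groupby('time_period')  # define the groups before counting them
f, ax = plt.subplots(1, len(grp))  # one subplot per time period
for i, (period, sub) in enumerate(grp):
    # each subplot only receives the teams present in that period
    sns.barplot(x=sub.teamID.to_list(), y=sub.winrate.to_list(), ax=ax[i])
    ax[i].set_title(period)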


Please note that the same color corresponds to different teams across subplots, so you may want to pass e.g. color=sns.color_palette()[0] to draw all bars in the same color.
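As a rough sketch of the single-color suggestion, the loop could look something like the following. The names bar_color and teams_per_period are only illustrative, and squeeze=False and the Agg backend are my own assumptions (so that the code also runs with a single time period and without a display); they are not required by seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    'teamID': [1, 2, 3, 2, 3, 4],
    'time_period': [2018, 2018, 2018, 2019, 2019, 2019],
    'winrate': [.8, .7, .9, .85, .8, .95],
})

grp = df.groupby('time_period')
# squeeze=False always returns a 2D array of axes, even for a single period
fig, axes = plt.subplots(1, len(grp), squeeze=False)
bar_color = sns.color_palette()[0]  # first color of the current palette

teams_per_period = {}  # illustrative bookkeeping: which teams went into which subplot
for ax, (period, sub) in zip(axes.ravel(), grp):
    teams = sub['teamID'].to_list()
    # only the teams of this period are passed, so no empty positions appear
    sns.barplot(x=teams, y=sub['winrate'].to_list(), color=bar_color, ax=ax)
    ax.set_title(str(period))
    teams_per_period[int(period)] = teams

fig.savefig("winrate_by_period.png")
```

With one color for every bar, a reader is less likely to assume that the same color means the same team in different subplots.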

Upvotes: 1
