Rolo

Reputation: 45

Matplotlib / Seaborn Countplot with different Categories in one Plot

I have two series with different lengths and different sets of values, and I want to plot how often each value (a name) occurs in each series. I want a grey countplot for series 1 and a red countplot for series 2, drawn on top of each other. However, because series 2 does not contain 'nancy', the plot also cuts off the 'nancy' bar from series 1. How do I get a full overlay of the two series, including a bar for 'nancy'?

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ser1 = pd.Series( ['tom','tom','bob','bob','nancy'])
ser2 = pd.Series( ['tom','bob'])

fig = plt.figure()
sns.countplot(x=ser1, color='grey')
sns.countplot(x=ser2, color='red')
plt.show()


Edit: Changing the data to the following causes the problem again. How do I make Matplotlib recognize that the two series share the same categorical values?

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ser1 = pd.Series( ['tom','tom','bob','bob','nancy','zulu'])
ser2 = pd.Series( ['tom','nancy'])

ser1 = ser1.astype('category')
ser2 = ser2.astype('category')

fig = plt.figure()
ax = sns.countplot(x=ser2, color='red', zorder=2)
sns.countplot(x=ser1, color='grey')

plt.show()

Upvotes: 2

Views: 3222

Answers (1)

ImportanceOfBeingErnest

Reputation: 339122

You may store the axis settings (ticks, tick labels and limits) of the first plot and restore them after drawing the second plot.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ser1 = pd.Series( ['tom','tom','bob','bob','nancy','zulu'])
ser2 = pd.Series( ['tom','bob'])

fig = plt.figure()
ax = sns.countplot(x=ser1, color='grey')
ticks = ax.get_xticks()
ticklabels = ax.get_xticklabels()
lim = ax.get_xlim()

sns.countplot(x=ser2, color='red')
ax.set_xlim(lim)
ax.set_xticks(ticks)
ax.set_xticklabels(ticklabels)
plt.show()


The other option is to draw the second plot first but give it a higher zorder, so that its bars appear in front of the plot drawn later.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ser1 = pd.Series( ['tom','tom','bob','bob','nancy','zulu'])
ser2 = pd.Series( ['tom','bob'])

fig = plt.figure()
ax = sns.countplot(x=ser2, color='red', zorder=2)
sns.countplot(x=ser1, color='grey')

plt.show()

In the more general case you need to use the order argument, so that both plots place the same category at the same position.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

ser1 = pd.Series( ['tom','tom','bob','bob','nancy', 'nancy' ,'zulu'])
ser2 = pd.Series( ['tom','nancy'])
# Series.append was removed in pandas 2.0; pd.concat gives the same combined series
order = pd.concat([ser1, ser2]).unique()

fig = plt.figure()
ax = sns.countplot(x=ser2, color='red', order=order, zorder=2)
sns.countplot(x=ser1, color='grey', order=order)

plt.show()
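
Regarding the edit in the question: calling astype('category') on each series separately gives each one its own category list, so the two plots still disagree about positions. A minimal sketch of one way around this, assuming a shared pd.CategoricalDtype built from both series (the names shared, cat1, cat2 and plot_overlay are only illustrative), could look like this. seaborn should then take the bar order from the categories, including labels that a series does not contain.

```python
import pandas as pd

ser1 = pd.Series(['tom', 'tom', 'bob', 'bob', 'nancy', 'nancy', 'zulu'])
ser2 = pd.Series(['tom', 'nancy'])

# One dtype holding every label from both series, in first-seen order.
shared = pd.CategoricalDtype(categories=list(pd.concat([ser1, ser2]).unique()))

# Both series now carry the same categories, even the ones they never use.
cat1 = ser1.astype(shared)
cat2 = ser2.astype(shared)


def plot_overlay(first, second):
    """Draw grey counts for `first` and red counts for `second` in front."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots()
    sns.countplot(x=first, color='grey', ax=ax)
    sns.countplot(x=second, color='red', ax=ax)
    return ax
```

Calling plot_overlay(cat1, cat2) followed by plt.show() draws the overlay. Because the grey plot is drawn first here, no zorder is needed.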

In case you would rather use matplotlib's categoricals to create the plot, this would look as follows:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

ser1 = pd.Series( ['tom','tom','bob','bob','nancy', 'nancy' ,'zulu'])
ser2 = pd.Series( ['tom','nancy'])

u1, counts1 = np.unique(ser1.values, return_counts=True)
u2, counts2 = np.unique(ser2.values, return_counts=True)

fig, ax = plt.subplots()
ax.bar(u1,counts1, color='grey')
ax.bar(u2,counts2, color='red')

plt.show()
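
Note that np.unique returns the labels sorted alphabetically, and each bar call only sees the labels of its own series. If you want to control the order and make the missing labels explicit, a possible variant (a sketch, not the only way; counts1, counts2 and plot_counts are illustrative names) is to count with pandas and reindex both results to one shared label list, filling absent labels with 0.

```python
import pandas as pd

ser1 = pd.Series(['tom', 'tom', 'bob', 'bob', 'nancy', 'nancy', 'zulu'])
ser2 = pd.Series(['tom', 'nancy'])

# Shared label list in first-seen order across both series.
order = list(pd.concat([ser1, ser2]).unique())

# Count each series, then align to the shared labels; missing labels get 0.
counts1 = ser1.value_counts().reindex(order, fill_value=0)
counts2 = ser2.value_counts().reindex(order, fill_value=0)


def plot_counts():
    """Draw the aligned counts as overlaid bars, red in front of grey."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(order, counts1.tolist(), color='grey')
    ax.bar(order, counts2.tolist(), color='red')
    return ax
```

Calling plot_counts() followed by plt.show() then gives one slot per label, with a zero-height red bar where series 2 has no entries.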

Upvotes: 5
