Eric O

Reputation: 181

Center Seaborn Colorbar Labels

I'm looking at discrete data, but the colorbars in seaborn look like they're only set up for continuous variables.

My code produces the chart I want, but the labels on the colorbar don't line up with their respective colors.

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

df = pd.DataFrame(np.random.randint(0,6,size=(52, 7)))
colors = {0:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}
cmap = mpl.colors.ListedColormap(list(colors.values()))
fig, ax = plt.subplots(figsize=(3,8))
ax = sns.heatmap(df, annot=True, cmap=cmap, linewidths=.05)
plt.show()

I've tried adding cbar_kws={"anchor": (0.5, 0.5)} as an argument to the sns.heatmap call, but it throws an error saying it doesn't like the anchor keyword, even though the seaborn documentation says I should be able to use arguments from pyplot.colorbar. Never mind, see comment below.

I'm not sure how to center that label. Am I missing something obvious?

I'd also like to change the labels from # to Group-# if there's an easy solution to that.

Thank you!

Upvotes: 1

Views: 1080

Answers (1)

ImportanceOfBeingErnest

Reputation: 339765

You probably want to use a BoundaryNorm. Without one, matplotlib does not know which value should correspond to which color; the default linear normalization simply spreads the listed colors evenly over the data range. A BoundaryNorm supplies this information by specifying the bin edges of the colors. Since the values in this example are the integers 0, 1, 2, 3, 4, 5, the bin edges are best chosen as -0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, so that each value lies at the center of its bin.
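To see the bin edges at work in isolation, here is a minimal sketch without any plotting. It assumes the same six integer categories as in the question; the variable names are only illustrative.

```python
import numpy as np
import matplotlib.colors as mcolors

# Six categories 0..5 need seven bin edges: -0.5, 0.5, ..., 5.5
bounds = np.arange(7) - 0.5
norm = mcolors.BoundaryNorm(bounds, ncolors=6)

# Each integer falls in the middle of its own bin,
# so it maps to the color index of the same number.
values = np.array([0, 1, 2, 3, 4, 5])
indices = norm(values)
print(indices.tolist())  # [0, 1, 2, 3, 4, 5]
```

Because each bin is centered on its integer, a colorbar tick placed at that integer also sits in the middle of the matching color patch.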

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline

df = pd.DataFrame(np.random.randint(0,6,size=(52, 7)))
colors = {0:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}

# one color per integer value 0..N-1, listed in key order
colorrange = range(len(colors))
colorlist = [colors[i] for i in colorrange]
cmap = mpl.colors.ListedColormap(colorlist)
# N+1 bin edges at -0.5, 0.5, ..., N-0.5 so each integer is centered in its bin
bounds = np.arange(len(colors) + 1) - 0.5
norm = mpl.colors.BoundaryNorm(bounds, len(colorrange))

fig, ax = plt.subplots(figsize=(4.2,8))
fig.subplots_adjust(right=0.8)
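ax = sns.heatmap(df, annot=True, cmap=cmap, norm=norm, linewidths=.05,
                 # pin the ticks to the group values so each label is centered on its color
                 cbar_kws={'ticks': list(colorrange), 'format': 'Group-%g'})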
plt.show()


Code for when the data range does not start at 0, but is still N successive integers:

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline

df = pd.DataFrame(np.random.randint(1,7,size=(52, 7)))
colors = {6:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}

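# sort the keys so the colors are listed in ascending value order
colorrange = sorted(colors)
colorlist = [colors[i] for i in colorrange]
cmap = mpl.colors.ListedColormap(colorlist)
# bin edges half a unit below each value, plus one closing edge above the maximum
bounds = np.array(colorrange + [max(colorrange) + 1]) - 0.5
norm = mpl.colors.BoundaryNorm(bounds, len(colorrange))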

fig, ax = plt.subplots(figsize=(4.2,8))
fig.subplots_adjust(right=0.8)
ax = sns.heatmap(df, annot=True, cmap=cmap, norm=norm, linewidths=.05,
                 # pin the ticks to the group values so each label is centered on its color
                 cbar_kws={'ticks': colorrange, 'format': 'Group-%g'})
plt.show()
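Both variants above could be folded into one small helper. The following is only a sketch: `discrete_cmap_norm` is a hypothetical name, not part of matplotlib or seaborn, and it assumes the dictionary is non-empty and its keys are consecutive integers.

```python
import numpy as np
import matplotlib.colors as mcolors


def discrete_cmap_norm(colors):
    """Hypothetical helper: build (cmap, norm, ticks) from an {int: color} dict.

    Assumes the keys are consecutive integers (they need not start at 0).
    """
    keys = sorted(colors)
    if keys != list(range(keys[0], keys[-1] + 1)):
        raise ValueError("keys must be consecutive integers")
    cmap = mcolors.ListedColormap([colors[k] for k in keys])
    # Edges half a unit below each key, plus one closing edge above the last key
    bounds = np.arange(keys[0], keys[-1] + 2) - 0.5
    norm = mcolors.BoundaryNorm(bounds, cmap.N)
    return cmap, norm, keys


colors = {6: '#05926f', 1: '#99cc33', 2: '#F26419',
          3: '#F6AE2D', 4: '#06AED5', 5: '#3baa5d'}
cmap, norm, ticks = discrete_cmap_norm(colors)
print(ticks)  # [1, 2, 3, 4, 5, 6]
```

The returned objects can then be passed along as sns.heatmap(df, cmap=cmap, norm=norm, cbar_kws={'ticks': ticks, 'format': 'Group-%g'}), which should give one centered label per color.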

Upvotes: 1
