Reputation: 181
I'm looking at discrete data, but the colorbars in seaborn look like they're only set up for continuous variables.
My code produces the chart I want, but the labels on the colorbar don't line up with their respective colors.
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.DataFrame(np.random.randint(0,6,size=(52, 7)))
colors = {0:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}
cmap = mpl.colors.ListedColormap(list(colors.values()))
fig, ax = plt.subplots(figsize=(3,8))
ax = sns.heatmap(df, annot=True, cmap=cmap, linewidths=.05)
plt.show()
I've tried adding cbar_kws={"anchor": (0.5, 0.5)} as an argument to the sns.heatmap call, but it throws an error saying it doesn't like the anchor keyword. The seaborn documentation says I should be able to use arguments from pyplot.colorbar, but no luck. (Edit: never mind, see comment below.)
I'm not sure how to center that label. Am I missing something obvious?
I'd also like to change the labels from # to Group-#, if there's an easy solution to that.
Thank you!
Upvotes: 1
Views: 1080
Reputation: 339765
You probably want to use a boundary norm. By default, matplotlib does not know which value should correspond to which color. A BoundaryNorm supplies this information by specifying the bin edges of the colors. Since the values in this example are the integers 0, 1, 2, 3, 4, 5, the bin edges are best chosen as -0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, so that each value lies at the center of its bin.
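To make the mapping concrete, here is a minimal sketch (not part of the plotting code, and with variable names chosen only for illustration) of how a BoundaryNorm with these edges assigns each value to a color slot:

```python
import numpy as np
from matplotlib.colors import BoundaryNorm

values = np.arange(6)                      # the data values 0..5
bounds = np.arange(len(values) + 1) - 0.5  # edges -0.5, 0.5, ..., 5.5
norm = BoundaryNorm(bounds, ncolors=len(values))

color_index = norm(values)  # color slot for each integer value
near_two = norm(2.4)        # anything in [1.5, 2.5) lands in the same slot as 2
```

Each integer gets its own slot, and because the edges sit halfway between the integers, the integer is at the middle of its color band on the colorbar.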
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline
df = pd.DataFrame(np.random.randint(0,6,size=(52, 7)))
colors = {0:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}
colorrange = range(len(colors))
colorlist = [colors[i] for i in colorrange]
cmap = mpl.colors.ListedColormap(colorlist)
# Bin edges at -0.5, 0.5, ..., 5.5 so that each integer sits mid-bin
bounds = np.arange(len(colors) + 1) - 0.5
norm = mpl.colors.BoundaryNorm(bounds, len(colorrange))
fig, ax = plt.subplots(figsize=(4.2,8))
fig.subplots_adjust(right=0.8)
# Passing the ticks explicitly pins the labels to the bin centers
ax = sns.heatmap(df, annot=True, cmap=cmap, norm=norm, linewidths=.05,
                 cbar_kws={'format': 'Group-%g', 'ticks': list(colorrange)})
plt.show()
Code for when the data range does not start at 0, but is still N successive integers:
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline
df = pd.DataFrame(np.random.randint(1,7,size=(52, 7)))
colors = {6:'#05926f', 1:'#99cc33', 2:'#F26419', 3:'#F6AE2D', 4:'#06AED5', 5:'#3baa5d'}
colorrange = sorted(colors)
colorlist = [colors[i] for i in colorrange]
cmap = mpl.colors.ListedColormap(colorlist)
# Bin edges at 0.5, 1.5, ..., 6.5 so that each integer sits mid-bin
bounds = np.array(colorrange + [max(colorrange) + 1]) - 0.5
norm = mpl.colors.BoundaryNorm(bounds, len(colorrange))
fig, ax = plt.subplots(figsize=(4.2,8))
fig.subplots_adjust(right=0.8)
# Passing the ticks explicitly pins the labels to the bin centers
ax = sns.heatmap(df, annot=True, cmap=cmap, norm=norm, linewidths=.05,
                 cbar_kws={'format': 'Group-%g', 'ticks': colorrange})
plt.show()
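Both variants follow the same recipe, so they could be folded into one function. The following is only a sketch: discrete_cmap_norm is a hypothetical helper name, not a matplotlib or seaborn function, and it assumes the dictionary keys are consecutive integers.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap


def discrete_cmap_norm(colors):
    """Return (cmap, norm, ticks) for a {integer value: color} mapping.

    Hypothetical helper; assumes the keys are consecutive integers.
    """
    values = sorted(colors)
    if values != list(range(values[0], values[-1] + 1)):
        raise ValueError("keys must be consecutive integers")
    cmap = ListedColormap([colors[v] for v in values])
    # One more edge than colors, shifted by 0.5 so each value is mid-bin
    bounds = np.arange(values[0], values[-1] + 2) - 0.5
    norm = BoundaryNorm(bounds, cmap.N)
    return cmap, norm, values


colors = {6: '#05926f', 1: '#99cc33', 2: '#F26419',
          3: '#F6AE2D', 4: '#06AED5', 5: '#3baa5d'}
cmap, norm, ticks = discrete_cmap_norm(colors)
```

The returned cmap and norm can be passed to sns.heatmap as above, and ticks can go into cbar_kws (for example cbar_kws={'ticks': ticks, 'format': 'Group-%g'}) so that the labels are placed at the bin centers.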
Upvotes: 1