Reputation: 306
I'm getting something weird with the legend in a seaborn jointplot. I want to plot some quantity y as a function of a quantity x for 8 different datasets. Each dataset has only two columns, x and y, and a different number of rows. First, I concatenate all rows of all datasets using numpy:
y = np.concatenate(((data1[:,1]), (data2[:,1]), (data3[:,1]), (data4[:,1]),(data5[:,1]), (data6[:,1]), (data7[:,1]), (data8[:,1])), axis=0)
x = np.concatenate(((data1[:,0]), (data2[:,0]), (data3[:,0]), (data4[:,0]), (data5[:,0]), (data6[:,0]), (data7[:,0]), (data8[:,0])), axis=0)
Then I create the array of values that I will use for the "hue" parameter of the jointplot, which distinguishes the datasets by color and in the legend. I do this by assigning each dataset a number from 1 to 8, repeated for every row that dataset contributes to the combined data:
indexes = np.concatenate((np.ones(len(data1[:,0])), 2*np.ones(len(data2[:,0])), 3*np.ones(len(data3[:,0])), 4*np.ones(len(data4[:,0])), 5*np.ones(len(data5[:,0])), 6*np.ones(len(data6[:,0])), 7*np.ones(len(data7[:,0])), 8*np.ones(len(data8[:,0]))), axis=0)
Then I create the dataset:
all_together = np.column_stack((x, y, indexes))
df = pd.DataFrame(all_together, columns = ['x','y','Dataset'])
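The same frame can be built more compactly with np.repeat. This is only a sketch: the list `datasets` and its random contents are hypothetical stand-ins for data1 to data8, which are not shown here. Unlike np.column_stack, this keeps the Dataset column as integers rather than floats.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for data1..data8: two-column arrays with different row counts.
rng = np.random.default_rng(0)
datasets = [rng.normal(size=(n, 2)) for n in (5, 7, 3, 6, 4, 8, 2, 9)]

# Stack all rows, then repeat each dataset number once per row of that dataset.
xy = np.concatenate(datasets, axis=0)
indexes = np.repeat(np.arange(1, len(datasets) + 1), [len(d) for d in datasets])

df = pd.DataFrame({'x': xy[:, 0], 'y': xy[:, 1], 'Dataset': indexes})
print(df.shape)
```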
So now I can create the jointplot. This is simply done by:
g = sns.jointplot(y="y", x="x", data=df, hue="Dataset", palette='turbo')
handles, labels = g.ax_joint.get_legend_handles_labels()
g.ax_joint.legend(handles=handles, labels=['data1', 'data2', 'data3', 'data4', 'data5', 'data6', 'data7', 'data8'], fontsize=10)
At this point the problem is: all points appear to be plotted (at least I think so), but the legend only shows data1, data2, data3, data4 and data5. I don't understand why it does not show the other three labels, and without them the plot is difficult to read. I have checked that the combined dataset df has the correct shape. Any ideas?
Upvotes: 0
Views: 640
Reputation: 80279
You can add legend='full' to obtain a full legend. By default, sns.jointplot uses sns.scatterplot for the central plot. Keyword parameters that jointplot doesn't use itself are passed on to scatterplot. The legend parameter can be "auto", "brief", "full", or False.
From the docs:
If “brief”, numeric hue and size variables will be represented with a sample of evenly spaced values. If “full”, every group will get an entry in the legend. If “auto”, choose between brief or full representation based on number of levels. If False, no legend data is added and no legend is drawn.
The following code is tested with seaborn 0.11.2:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
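# 8 groups of 25 points each, labelled 1..8
N = 200
k = np.repeat(np.arange(1, 9), N // 8)
# Place each group around a different point on a circle of radius 5
df = pd.DataFrame({'x': 5 * np.cos(2 * k * np.pi / 8) + np.random.randn(N),
                   'y': 5 * np.sin(2 * k * np.pi / 8) + np.random.randn(N),
                   'Dataset': k})
# legend='full' is passed through to scatterplot, so all 8 levels get a legend entry
g = sns.jointplot(y="y", x="x", data=df, hue="Dataset", palette='turbo', legend='full')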
plt.show()
Upvotes: 2