Simon

Reputation: 13

Creating a custom legend in plotnine

I'm having trouble customising the legend in plotnine beyond what's possible through aes().

I have the following code:

import pandas as pd
from plotnine import *

data1 = {'dilution': [2.000000, 2.477121, 2.954243, 3.431364, 3.908485, 4.385606, 4.862728, 5.339849, 5.816970, 2.000000, 2.477121, 2.954243, 3.431364, 3.908485, 4.385606, 4.862728, 5.339849, 5.816970],
'variable': ["mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 1", "mouse 2", "mouse 2", "mouse 2", "mouse 2", "mouse 2", "mouse 2", "mouse 2", "mouse 2", "mouse 2"],
'value': [547.180708, 495.883622, 439.109089, 277.819313, 115.926188, 42.041189, 15.276367, 11.696537, 2.280014, 269.398164, 233.667531, 215.410352, 169.512070, 102.877518, 36.860550, 13.960504, 4.891481, -3.465304]}
df1 = pd.DataFrame.from_dict(data1)
data2 = {'dilution': [2.0, 2.0, 2.0],
'value': [-7.873768, -3.926121, 4.170833] }
df2 = pd.DataFrame.from_dict(data2)

data3 = {'dilution': [3.90309, 3.90309],
'value': [756.715198, 540.613828],
'variable': ["mouse 1", "mouse 2"]}
df3 = pd.DataFrame.from_dict(data3)

g = (ggplot(df1)
+ geom_line(aes(x='dilution', y='value', color='variable'), data=df1, size=1.0)
+ geom_point(aes(x='dilution', y='value', color='variable'), data=df1, size=1.0)
+ geom_point(aes(x='dilution', y='value'), data=df2, size=3.0)
+ geom_point(aes(x='dilution', y='value', color='variable'), data=df3, size=2.0, shape='s')
+ scale_x_continuous()
)
print(g)

which produces the following graph:

[Image: example plotnine plot with black data points]

As you can see, the data points from df2 do not appear in the legend. I would like a single black point in the legend to represent all the points from df2. I can get them into the legend if I change data2 as follows:

data2 = {'dilution': [2.0, 2.0, 2.0],
'value': [-7.873768, -3.926121, 4.170833],
'type': ['test', 'test', 'test']}

and then map it to the aesthetics as follows: geom_point(aes(x='dilution', y='value', color='type'), data=df2, size=3.0)

but then the points are no longer black, and I can't seem to change the points back to black again. Adding in a color='black' argument doesn't work:

[Image: example plot with coloured legend points]

Is there a better solution to keeping all the data points of df2 black while only appearing once in the legend as a black point?

Secondly, is there a way of adding into the legend a single black square to represent all the datapoints from df3?

Upvotes: 1

Views: 4068

Answers (1)

has2k1

Reputation: 2375

The legend is automatic. The only way you can influence it is by changing the data, the aes mapping or the scale parameters. The problem is that you are creating layers with different mappings and yet expecting them to share a legend.

Is there a better solution to keeping all the data points of df2 black while only appearing once in the legend as a black point?

The solution is to manipulate the data into a single coherent whole, or make sure that the different dataframes have similar columns that are mapped to the same aesthetics (you seem to have done this already with the second df2). Then if you want to control the colours in the legend, you should use a manual scale.

+ scale_color_manual(['red', 'cyan', 'black'])

Secondly, is there a way of adding into the legend a single black square to represent all the datapoints from df3?

There is no way to do this.

The key takeaway is that the legend is a guide to understanding the data. If you feel the urge to manipulate which items show up in it, the data has most likely not been organised properly. Also, if you want to label "special" points, use annotations.

Upvotes: 2
