I am using plotnine to make a plot with multiple lines in it. The pandas dataframe looks like this:
df
TIMESTAMP TEMP RANK TIME
0 2011-06-01 00:00:00 24.3 1.0 0.000000
1 2011-06-01 00:05:00 24.5 1.0 0.083333
2 2011-06-01 00:10:00 24.2 1.0 0.166667
3 2011-06-01 00:15:00 24.1 1.0 0.250000
4 2011-06-01 00:20:00 24.2 1.0 0.333333
5 2011-06-01 00:25:00 24.3 1.0 0.416667
6 2011-06-01 00:30:00 24.4 1.0 0.500000
7 2011-06-01 00:35:00 24.5 1.0 0.583333
8 2011-06-01 00:40:00 24.4 1.0 0.666667
9 2011-06-01 00:45:00 24.4 1.0 0.750000
10 2011-07-01 00:00:00 24.3 2.0 0.000000
11 2011-07-01 00:05:00 24.5 2.0 0.083333
12 2011-07-01 00:10:00 24.2 2.0 0.166667
13 2011-07-01 00:15:00 24.1 2.0 0.250000
14 2011-07-01 00:20:00 24.2 2.0 0.333333
15 2011-07-01 00:00:00 24.3 2.0 0.000000
16 2011-08-01 00:05:00 24.5 3.0 0.083333
17 2011-08-01 00:10:00 24.2 3.0 0.166667
18 2011-08-01 00:15:00 24.1 3.0 0.250000
19 2011-08-01 00:20:00 24.2 3.0 0.333333
20 2011-08-01 00:25:00 24.4 3.0 0.416667
I want to plot TIME on the x-axis and TEMP on the y-axis. I also want to draw different lines based on the rank.
Here is how I am doing that:
(
    ggplot()
    + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
    + scale_x_continuous(breaks=[4*x for x in range(7)])
)
How do I change the legend for the ranks on the right? I want it to be discrete, so that each color represents one rank/date.
I don't know how to change this. I tried scale_fill_continuous and scale_fill_discrete, without success:
(
    ggplot()
    + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
    + scale_x_continuous(breaks=[4*x for x in range(7)])
    + scale_fill_discrete(breaks=[x for x in range(1, 11)])
)
I get UserWarning: Cannot generate legend for the 'fill' aesthetic. Make sure you have mapped a variable to it
"variable to it".format(output))
I get the same warning if I use scale_fill_continuous(breaks=[x for x in range(1, 11)]).
I also tried scale_fill_manual(values=['blue', 'red', 'green', 'orange', 'purple', 'pink', 'black', 'yellow', 'cyan', 'magenta']), but I am not sure how to get it to work.
EDIT # 1
I understand now that this happens because my RANK variable is of type float64 and needs to be some other data type, but the question is which one. If I convert it to categorical, I get the error:
TypeError: Unordered Categoricals can only compare equality or not
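The TypeError above can be reproduced without any plotting. The following is a minimal sketch, assuming the error comes from the df['RANK']<11 filter being applied after the conversion. The series `ranks` and the threshold 2.0 are made up for illustration; the threshold is deliberately one of the existing categories.

```python
import pandas as pd

# Hypothetical stand-in for the RANK column (float64, as in the question).
ranks = pd.Series([1.0, 1.0, 2.0, 3.0], name="RANK")

# A plain (unordered) categorical supports == and != but not <, >, <=, >=.
unordered = ranks.astype("category")
try:
    unordered < 2.0
    unordered_error = None
except TypeError as exc:
    unordered_error = str(exc)

# An ordered categorical allows order comparisons against one of its categories.
ordered = ranks.astype(pd.CategoricalDtype(ordered=True))
mask = ordered < 2.0

print(unordered_error)
print(mask.tolist())
```

Filtering on the numeric column first and converting afterwards avoids the comparison altogether.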
Upvotes: 1
Views: 777
Reputation: 2525
Okay, so I figured out the solution to the problem. As noted in the question, the column that I am using to group and color the geom_line() is float64. That is why the legend is continuous.
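So, to fix this, I converted RANK to an ordered categorical:
d.RANK = d.RANK.astype(pd.CategoricalDtype(ordered=True))
(On older pandas versions this was written d.RANK.astype('category', ordered=True); newer versions no longer accept the ordered keyword there.)
This also fixed the error noted under Edit 1. Alternatively,
d.RANK = d.RANK.astype('str')
works as well.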