Akshay Gaur

Reputation: 2525

How do I change the legend on the right of the ggplot from continuous to discrete?

I am using plotnine to make a plot with multiple lines in it. The pandas dataframe looks like this:

df

     TIMESTAMP              TEMP    RANK   TIME
0    2011-06-01 00:00:00    24.3    1.0    0.000000
1    2011-06-01 00:05:00    24.5    1.0    0.083333
2    2011-06-01 00:10:00    24.2    1.0    0.166667
3    2011-06-01 00:15:00    24.1    1.0    0.250000
4    2011-06-01 00:20:00    24.2    1.0    0.333333
5    2011-06-01 00:25:00    24.3    1.0    0.416667
6    2011-06-01 00:30:00    24.4    1.0    0.500000
7    2011-06-01 00:35:00    24.5    1.0    0.583333
8    2011-06-01 00:40:00    24.4    1.0    0.666667
9    2011-06-01 00:45:00    24.4    1.0    0.750000
10   2011-07-01 00:00:00    24.3    2.0    0.000000
11   2011-07-01 00:05:00    24.5    2.0    0.083333
12   2011-07-01 00:10:00    24.2    2.0    0.166667
13   2011-07-01 00:15:00    24.1    2.0    0.250000
14   2011-07-01 00:20:00    24.2    2.0    0.333333
15   2011-07-01 00:00:00    24.3    2.0    0.000000
16   2011-08-01 00:05:00    24.5    3.0    0.083333
17   2011-08-01 00:10:00    24.2    3.0    0.166667
18   2011-08-01 00:15:00    24.1    3.0    0.250000
19   2011-08-01 00:20:00    24.2    3.0    0.333333
20   2011-08-01 00:25:00    24.4    3.0    0.416667

I want to plot TIME on the x-axis and TEMP on the y-axis. I also want to draw different lines based on the rank.

Here is how I am doing that:

(ggplot()
 + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
 + scale_x_continuous(breaks=[4*x for x in range(7)]))


How do I change the legend for the ranks on the right? I want it to be discrete, so that each color represents one rank/date.

I don't know how to change this. I tried using scale_fill_continuous or scale_fill_discrete but was unsuccessful:

(ggplot()
 + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
 + scale_x_continuous(breaks=[4*x for x in range(7)])
 + scale_fill_discrete(breaks=[x for x in range(1, 11)]))

I get this warning:

UserWarning: Cannot generate legend for the 'fill' aesthetic. Make sure you have mapped a variable to it

I get the same warning if I use scale_fill_continuous(breaks=[x for x in range(1, 11)]).

I also tried scale_fill_manual(values=['blue', 'red', 'green', 'orange', 'purple', 'pink', 'black', 'yellow', 'cyan', 'magenta']) but I am not sure how to get it to work.

EDIT # 1

I understand now that this happens because my RANK variable is of type float64 and needs to be some other data type. The question is: which one? If I convert it to categorical, I get this error:

TypeError: Unordered Categoricals can only compare equality or not

Upvotes: 1

Views: 777

Answers (1)

Akshay Gaur

Reputation: 2525

Okay, so I figured out the solution to the problem. As noted in the question, the column that I am using to group and color the geom_line() is a float64. That is why the legend is continuous.

So, to fix this, I did the following:

df['RANK'] = df['RANK'].astype(pd.CategoricalDtype(ordered=True))

This also fixed the error noted under Edit 1.
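To illustrate why the ordered flag matters, here is a minimal pandas-only sketch. The small `ranks` series is made-up stand-in data, not the real dataframe. It shows that an unordered categorical rejects a `<` comparison like the `df['RANK']<11` filter in the question, while an ordered one accepts it when the value being compared is one of the categories.

```python
import pandas as pd

# Made-up stand-in for the RANK column (assumption: float ranks, as in the question).
ranks = pd.Series([1.0, 1.0, 2.0, 2.0, 3.0], name="RANK")

# Unordered categorical: equality checks work, ordering comparisons do not.
unordered = ranks.astype("category")
try:
    unordered < 2.0
    unordered_error = None
except TypeError as exc:
    unordered_error = str(exc)

# Ordered categorical: "<" works against a value that is an existing category.
ordered = ranks.astype(pd.CategoricalDtype(ordered=True))
mask = ordered < 2.0

print(unordered_error)
print(mask.tolist())
```

As far as I can tell, comparing an ordered categorical against a value that is not one of its categories can still raise a TypeError, so it is probably safer to apply a numeric filter such as `< 11` before converting the column.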

df['RANK'] = df['RANK'].astype(str) works as well.

Upvotes: 1
