Reputation: 1257
I'm trying to plot a scatter plot with Matplotlib, but I'm having trouble setting the colors.
Here is my code:
colors = [(141, 0, 248, 0.4) if x >= 150 and x < 200 else
(0, 244, 248, 0.4) if x >= 200 and x < 400 else
(255, 255, 0, 0.7) if x >= 400 and x < 600 else
(255, 140, 0, 0.8) if x >= 600 else (255, 0, 0, 0.8) for x in MyData.Qty]
print(len(colors))
ax1.scatter(MyData.Date, MyData.Rate, s=20, c=colors, marker='_')
Basically, I have a column called Qty in my dataframe, and the color is chosen according to its value. For example, if Qty is greater than some threshold, the color will be red, and so on.
The previous code will give me the following error:
'c' argument has 2460 elements, which is inconsistent with 'x' and 'y' with size 615.
I have no idea why that happens, because the following code works without any problem:
colors = ['red' if x >= 150 and x < 200 else
'yellow' if x >= 200 and x < 400 else
'green' if x >= 400 and x < 600 else
'blue' if x >= 600 else 'purple' for x in MyData.Qty]
Here is a sample of my data:
Date Rate Qty
0 18 140 207.435145
0 18 141 155.019884
0 18 178 1222.215201
0 18 230 256.010358
0 19 9450 1211.310384
The following will work too:
colors = [(1,1,0,0.8) if x>1000 else (1,0,0,0.4) for x in MyData.Qty]
Upvotes: 0
Views: 1398
Reputation: 8790
Someone commented (and then deleted the comment) pointing to the documentation; here is the part they were referring to (from plt.scatter):
Note that c should not be a single numeric RGB or RGBA sequence because that is indistinguishable from an array of values to be colormapped. If you want to specify the same RGB or RGBA value for all points, use a 2-D array with a single row. Otherwise, value-matching will have precedence in case of a size matching with x and y.
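To make the documented case concrete, here is a minimal sketch of passing one RGBA value for all points as a 2-D array with a single row. The data values are made up for illustration, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data, not taken from the question
x = [1, 2, 3, 4]
y = [10, 20, 15, 30]

# One RGBA color (all channels in 0-1) wrapped in a 2-D array with a single row
single_color = np.array([[1.0, 0.0, 0.0, 0.4]])

fig, ax = plt.subplots()
sc = ax.scatter(x, y, s=20, c=single_color)
```

With four points, a flat c=[1.0, 0.0, 0.0, 0.4] would match the size of x and y and be treated as four values to colormap, which is the ambiguity the note describes.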
But in addition, it seems from here that matplotlib expects the RGB values to be in the range 0 to 1, rather than 0 to 255. So I just added two lines to a) explicitly convert colors to a 2D numpy array and b) divide the RGB values by 255 (leaving the alpha value untouched).
import matplotlib.pyplot as plt
import numpy as np
fig1, ax1 = plt.subplots()
colors = [(141, 0, 248, 0.4) if x >= 150 and x < 200 else
(0, 244, 248, 0.4) if x >= 200 and x < 400 else
(255, 255, 0, 0.7) if x >= 400 and x < 600 else
(255, 140, 0, 0.8) if x >= 600 else (255, 0, 0, 0.8) for x in MyData['Qty']]
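# Addition: convert colors to a 2D float array so the in-place division is valid
colors = np.array(colors, dtype=float)
# Scale the RGB columns from 0-255 to 0-1, leaving the alpha column untouched
colors[:, :3] /= 255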
ax1.scatter(MyData['Date'], MyData["Rate"], s=20, c=colors, marker='_')
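As a variation on the snippet above, the scaling can also be done per tuple before the list is built, without NumPy. This is only a sketch: scale_rgba and color_for_qty are hypothetical helper names, and sample_qty is made-up data standing in for MyData['Qty'].

```python
def scale_rgba(rgba):
    """Scale the R, G, B channels from 0-255 to 0-1; keep alpha as given."""
    r, g, b, a = rgba
    return (r / 255, g / 255, b / 255, a)


def color_for_qty(qty):
    """Return the 0-1 scaled RGBA tuple for a Qty value (same thresholds as above)."""
    if 150 <= qty < 200:
        return scale_rgba((141, 0, 248, 0.4))
    if 200 <= qty < 400:
        return scale_rgba((0, 244, 248, 0.4))
    if 400 <= qty < 600:
        return scale_rgba((255, 255, 0, 0.7))
    if qty >= 600:
        return scale_rgba((255, 140, 0, 0.8))
    # Fallback for values below 150
    return scale_rgba((255, 0, 0, 0.8))


# Made-up Qty values standing in for MyData['Qty']
sample_qty = [120.0, 155.0, 207.4, 450.0, 1222.2]
colors = [color_for_qty(q) for q in sample_qty]
```

The resulting list could then be passed as c=colors to scatter, in the same way as the list of 0 to 1 tuples that already works in the question.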
If you remove the scaling (but still convert to a 2D array), you get the same error as in your question. So my guess is that when matplotlib doesn't recognize the values as RGB(A) scaled to 0 to 1, it falls back to interpreting the flattened array as one value per point, which is where the 4x element count comes from (2460 = 4 × 615).
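One way to check the range requirement outside of scatter is matplotlib.colors.to_rgba_array, the public color conversion function. Whether scatter goes through exactly this path internally is an assumption here; is_valid_rgba is a hypothetical helper name, and the two sample colors are taken from the list above.

```python
import numpy as np
from matplotlib import colors as mcolors


def is_valid_rgba(arr):
    """Return True if matplotlib accepts arr as an array of RGBA colors."""
    try:
        mcolors.to_rgba_array(arr)
        return True
    except ValueError:
        return False


# Two colors from the question, still on the 0-255 scale
raw = np.array([(141, 0, 248, 0.4), (255, 0, 0, 0.8)], dtype=float)

# Same colors with the RGB columns scaled to 0-1, alpha untouched
scaled = raw.copy()
scaled[:, :3] /= 255

print(is_valid_rgba(raw))
print(is_valid_rgba(scaled))
```

The unscaled array should be rejected because its values fall outside the 0 to 1 range, while the scaled one should pass, which is consistent with the behaviour described above.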
Upvotes: 2