Reputation: 3494
Problem:
I have two columns of data (x and y points), and a third column with labels (values 0 or 1). I want to plot x and y on a scatter plot, color the points according to whether the label is 0 or 1, and have a colorbar on the right of the plot.
Here is my data: https://www.dropbox.com/s/ffta3wgrl2vvcpw/data.csv?dl=0
Note: I know that since there are only two labels I will only get two colors despite using a colorbar; but this dataset is just used as an example here.
What I've done so far
import matplotlib.pyplot as plt
import csv
import matplotlib as m

#read in the data
with open('data.csv', 'rb') as infile:
    data=[]
    r = csv.reader(infile)
    for row in r:
        data.append(row)
col1, col2, col3 = [el for el in zip(*data)]

#I'd like to have a colormap going from red to green:
cdict = {
    'red'  : ( (0.0, 0.25, 0), (0.5, 1, 1), (1., 0.0, 1.)),
    'green': ( (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1., 1.0, 1.0)),
    'blue' : ( (0.0, 0.0, 0.0), (1, 0.0, 0.0), (1., 0.0, 0.0))}
cm = m.colors.LinearSegmentedColormap('my_colormap', cdict)

# I got the following line from an example I saw; it works for me,
# but I don't really know how it works as an input to colorbar,
# and would like to know.
formatter = plt.FuncFormatter(lambda i, *args: ['0', '1'][int(i)])

plt.figure()
plt.scatter(col1, col2, c=col3)
plt.colorbar(ticks=[0, 1], format=formatter, cmap=cm)
The above code doesn't work because of the call to plt.colorbar.
How can I make it work (what is missing), and is this the best way to do it?
I also can't make sense of the documentation for the ticks parameter. What exactly is it?
Documentation: http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure.colorbar
Upvotes: 1
Views: 672
Reputation: 69203
You need to pass col3 to scatter as an array of floats, not a tuple and not ints.
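To see why the conversion matters, here is a minimal sketch using only the standard library. It assumes a three-column layout like the one described in the question; the SAMPLE text and the read_columns helper are made up for illustration and are not taken from the real data.csv. The point is that csv.reader hands back every field as a string, so the columns have to be converted before plotting.

```python
import csv
import io

# Hypothetical three-column sample standing in for data.csv (x, y, label).
SAMPLE = "0.5,1.2,0\n1.5,0.7,1\n2.5,2.9,1\n"


def read_columns(text):
    """Parse CSV text and return three lists of floats: x, y and labels."""
    rows = list(csv.reader(io.StringIO(text)))
    # zip(*rows) transposes rows into columns; every field is still a str here.
    raw_x, raw_y, raw_labels = zip(*rows)
    return (
        [float(v) for v in raw_x],
        [float(v) for v in raw_y],
        [float(v) for v in raw_labels],
    )


x, y, labels = read_columns(SAMPLE)
```

With the values converted like this, labels can be passed as c= and is mapped through the colormap as numbers.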
So, this should work:
import csv
import matplotlib as m
import matplotlib.pyplot as plt
import numpy as np

# Read in the data. On Python 3, csv needs a text-mode file opened with
# newline=''; the original 'rb' mode only works on Python 2.
with open('data.csv', newline='') as infile:
    data = [row for row in csv.reader(infile)]
col1, col2, col3 = zip(*data)

# Colormap going from red (at 0) to green (at 1).
cdict = {
    'red'  : ((0.0, 1.0, 1.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)),
    'green': ((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 1.0, 1.0)),
    'blue' : ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))}
cm = m.colors.LinearSegmentedColormap('my_colormap', cdict)

# Called once per tick with the tick value; returns the label text for it.
formatter = plt.FuncFormatter(lambda i, *args: ['0', '1'][int(i)])

plt.figure()
# csv.reader yields strings, so convert every column to floats before plotting.
plt.scatter(np.asarray(col1, dtype=np.float32), np.asarray(col2, dtype=np.float32),
            c=np.asarray(col3, dtype=np.float32), lw=0, cmap=cm)
plt.colorbar(ticks=[0, 1], format=formatter, cmap=cm)
plt.show()
As for ticks, you are passing a list of the positions where you want ticks on the colorbar. So, in your example, you have a tick at 0 and a tick at 1.
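A small sketch of how ticks and format work together, under a few assumptions: the sample points are made up, the built-in "RdYlGn" colormap stands in for the custom one, and label_for_tick and TICK_LABELS are hypothetical names. A FuncFormatter wraps a function that receives the tick value and its position and returns the text to draw beside that tick.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Hypothetical sample points with 0/1 labels.
x = [0.5, 1.5, 2.5, 3.5]
y = [1.0, 2.0, 1.5, 3.0]
labels = [0.0, 1.0, 1.0, 0.0]

# Text to show for each tick value (hypothetical mapping).
TICK_LABELS = {0: "0", 1: "1"}


def label_for_tick(value, position):
    """Return the text drawn next to the tick located at `value`."""
    return TICK_LABELS.get(int(round(value)), "")


fig, ax = plt.subplots()
points = ax.scatter(x, y, c=labels, cmap="RdYlGn", vmin=0, vmax=1, lw=0)
# ticks: data values at which tick marks are placed on the colorbar.
# format: how each of those values is turned into its label.
cbar = fig.colorbar(points, ax=ax, ticks=[0, 1],
                    format=FuncFormatter(label_for_tick))
fig.canvas.draw()  # render once; use fig.savefig(...) to write an image
```

Passing the scatter's return value to colorbar makes it explicit which plot the colorbar describes, instead of relying on the current image.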
I've also fixed your cmap so that it runs from red to green. You need to pass the cmap to scatter as well, otherwise the points are drawn with the default colormap.
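To check what a cdict actually produces, it can help to evaluate the colormap at its endpoints. This sketch assumes the corrected dictionary from the answer (the name red_to_green is made up). Each entry is a tuple (x, y0, y1): x is the position in the range 0 to 1, and y0 and y1 are the channel values just below and just above that position.

```python
from matplotlib.colors import LinearSegmentedColormap

# Each tuple is (x, y0, y1): position, channel value below x, channel value above x.
cdict = {
    'red':   ((0.0, 1.0, 1.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)),
    'green': ((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 1.0, 1.0)),
    'blue':  ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
}
red_to_green = LinearSegmentedColormap('red_to_green', cdict)

# Calling a colormap with a float in [0, 1] returns an (r, g, b, a) tuple.
low = red_to_green(0.0)   # color used for label 0
high = red_to_green(1.0)  # color used for label 1
print(low, high)
```

With this dictionary the low end is pure red and the high end is pure green, which is what the two labels map to.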
Upvotes: 1