gem9911

Reputation: 1

How to change both the shape and colour of an individual scatter point in Python Matplotlib?

I am currently trying to import some data from a table into Python to plot one variable against another. I also want to group the data points by two of the other variables in the same table.

One of the variables (the one I want to assign colour to) has only 3 options. The other (the one I want to assign shape to) has only 5. I can easily group the data by either one. The issue comes with plotting, as not all of the shape groups contain all 3 options of the "colour" variable. I can get the scatter plot to show shapes or colours easily; it is when I combine them that I have an issue.

At the moment I can make it so that the colour is plotted, but there are two markers for each data point: one that is the correct shape, and another that is just a standard point. If I remove what is causing the double points, however, the colours are not correct.

This is my current code (with example data). I have given the colour variable letters, but the real data is just as simple:

import matplotlib.pyplot as plt
import numpy as np

r = np.array([600, 2000, 980, 1770, 920, 1100, 220])
t = np.array([2.7, 12.67, 10.54, 1.3, 16.1, 0.92, 13.56])
spectra_type = np.array(['A', 'A', 'B', 'A', 'C', 'B', 'A'])
spectra_num = np.array([{'A': 0, 'B': 1, 'C': 2}[i] for i in spectra_type])

i = np.array(['Shape1','Shape2','Shape3','Shape4','Shape5','Shape2','Shape4'])
shape1 = np.where(i=='Shape1')[0]
shape2 = np.where(i=='Shape2')[0]
shape3 = np.where(i=='Shape3')[0]
shape4 = np.where(i=='Shape4')[0]
shape5 = np.where(i=='Shape5')[0]


plt.figure('fig 1')
plt.xlabel('x')
plt.ylabel('y')

plt.scatter(t[shape1], r[shape1], c=spectra_num[shape1], marker='D', label='Shape1')
plt.scatter(t[shape2], r[shape2], c=spectra_num[shape2], marker='^', label='Shape2')
plt.scatter(t[shape3], r[shape3], c=spectra_num[shape3], marker='o', label='Shape3')
plt.scatter(t[shape4], r[shape4], c=spectra_num[shape4], marker='s', label='Shape4')
plt.scatter(t[shape5], r[shape5], c=spectra_num[shape5], marker='*', label='Shape5')

first_legend = plt.legend(loc='upper left')
plt.gca().add_artist(first_legend)

scatter = plt.scatter(t, r, c=spectra_num)
plt.legend(handles=scatter.legend_elements()[0], labels=['A', 'B', 'C'], title='Colour')

This gives me the following graph. As you can see, the shapes are all there but are overlaid with another "regular" marker.

Example plot from data

Any advice would be much appreciated!

Upvotes: 0

Views: 66

Answers (3)

gboffi

Reputation: 25093

My take

Everything is pretty standard, except how I compute the handles for the legend, and how I place the legend outside of the Axes using the 'outside ...' values of the loc keyword argument of Figure.legend(), a feature new in Matplotlib 3.7.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
np.random.seed(20250227)

N = 80 # no. of points
N1 = 3 # no. of different properties in category 1
N2 = 5 # no. of different properties in category 2

names1 = 'Ear Eye Nose'.split()
names2 = 'Africa Asia Europe N.America S.America'.split()

d1 = dict(zip(range(N1), ['C'+str(i) for i in range(N1)]))
markers = list(Line2D.filled_markers)
np.random.shuffle(markers)
markers = markers[:N2]
d2 = dict(zip(range(N2), markers))

# fake data
x, y = np.random.rand(2, N)
cat1 = np.random.randint(N1, size=N)
cat2 = np.random.randint(N2, size=N)

fig = plt.figure(figsize=(6, 6), layout='constrained')
for c2 in range(N2):
    marker = markers[c2]
    x2 = [xx for xx, cc in zip(x, cat2) if cc==c2]
    y2 = [yy for yy, cc in zip(y, cat2) if cc==c2]
    colors = [d1[color] for color, cc in zip(cat1, cat2) if cc==c2]    
    plt.scatter(x2, y2,
                color=colors,
                marker=marker,
                )
plt.gca().set_aspect(1)
plt.xlim((-0.05, 1.05))
plt.ylim((-0.05, 1.05))
handles = [Line2D([], [],
                  color=d1[c1],
                  marker=d2[c2],
                  lw=0,
                  label=f'({names1[c1]}, {names2[c2]})'
                 )
           for c2 in range(N2) for c1 in range(N1)]
fig.legend(handles=handles, ncols=5, loc='outside upper center', fontsize='x-small',
           title='Cat1 is mapped to different colors, Cat2 to different shapes')
plt.show()
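
A possible variation on the above, offered as a rough sketch rather than a drop-in replacement: NumPy boolean masks can stand in for the per-category list comprehensions, and two small figure legends (one per category) can replace the single 15-entry legend. The seed, the fixed marker list and the handle names below are assumptions made for illustration, and the 'outside ...' legend locations again assume Matplotlib 3.7 or later.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

rng = np.random.default_rng(0)  # arbitrary seed, fake data only

names1 = ['Ear', 'Eye', 'Nose']
names2 = ['Africa', 'Asia', 'Europe', 'N.America', 'S.America']
colors = np.array(['C0', 'C1', 'C2'])   # one colour per category-1 value
markers = ['D', '^', 'o', 's', '*']     # one marker per category-2 value (illustrative choice)

N = 80
x, y = rng.random((2, N))
cat1 = rng.integers(len(names1), size=N)
cat2 = rng.integers(len(names2), size=N)

fig, ax = plt.subplots(figsize=(6, 6), layout='constrained')
for c2, marker in enumerate(markers):
    mask = cat2 == c2                    # points that share this marker
    ax.scatter(x[mask], y[mask],
               color=colors[cat1[mask]].tolist(),  # per-point colour from category 1
               marker=marker)

# Proxy artists: nothing is drawn in the Axes, they only feed the legends.
color_handles = [Line2D([], [], color=c, marker='o', lw=0, label=n)
                 for n, c in zip(names1, colors)]
marker_handles = [Line2D([], [], color='black', marker=m, lw=0, label=n)
                  for n, m in zip(names2, markers)]

fig.legend(handles=color_handles, loc='outside upper left',
           ncols=3, fontsize='x-small', title='Cat1 (color)')
fig.legend(handles=marker_handles, loc='outside upper right',
           ncols=5, fontsize='x-small', title='Cat2 (marker)')
# plt.show()  # uncomment when running interactively
```

With eight legend entries instead of fifteen, each category is read off its own legend; whether that is clearer than listing every (colour, marker) pair probably depends on how many combinations actually occur in the data.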

Upvotes: 1

Matt Pitkin

Reputation: 6482

I would recommend that you use a package like seaborn, in particular, the scatterplot function, which will simplify things for you a lot. By putting the data into a dictionary, your example can be reduced to:

import seaborn as sns

shape_markers = {
    "Shape1": "D",
    "Shape2": "^",
    "Shape3": "o",
    "Shape4": "s",
    "Shape5": "*",
}

colours = {
    "A": "C0",
    "B": "C1",
    "C": "C2",
}

data = {
    "r": [600, 2000, 980, 1770, 920, 1100, 220],
    "t": [2.7, 12.67, 10.54, 1.3, 16.1, 0.92, 13.56],
    "spectra": ["A", "A", "B", "A", "C", "B", "A"],
    "shape": ["Shape1", "Shape2", "Shape3", "Shape4", "Shape5", "Shape2", "Shape4"],
}

ax = sns.scatterplot(
    data,
    x="t",
    y="r",
    hue="spectra",
    palette=colours,
    style="shape",
    markers=shape_markers,
)

ax.figure.show()


Upvotes: 0

pippo1980

Reputation: 3096

The only way I found that keeps your code required modifying part of the input.

I guess you could have done the same the other way round:

import matplotlib.pyplot as plt
import numpy as np

r = np.array([600, 2000, 980, 1770, 920, 1100, 220])
t = np.array([2.7, 12.67, 10.54, 1.3, 16.1, 0.92, 13.56])

spectra_type = np.array(['red', 'red', 'blue', 'red', 'yellow', 'blue', 'red'])
spectra_num = np.array([{'red': 0, 'blue': 1, 'yellow': 2}[i] for i in spectra_type])

print(spectra_num)

i = np.array(['Shape1','Shape2','Shape3','Shape4','Shape5','Shape2','Shape4'])
shape1 = np.where(i=='Shape1')[0]
shape2 = np.where(i=='Shape2')[0]
shape3 = np.where(i=='Shape3')[0]
shape4 = np.where(i=='Shape4')[0]
shape5 = np.where(i=='Shape5')[0]

print(shape1, type(shape1))

print(spectra_num[shape1])
print(spectra_num[shape2])
print(spectra_num[shape3])
print(spectra_num[shape4])
print(spectra_num[shape5])


plt.figure('fig 1')
plt.xlabel('x')
plt.ylabel('y')

plt.scatter(t[shape1], r[shape1], c=spectra_type[shape1], marker='D', label='Shape1')
plt.scatter(t[shape2], r[shape2], c=spectra_type[shape2], marker='^', label='Shape2')
plt.scatter(t[shape3], r[shape3], c=spectra_type[shape3], marker='o', label='Shape3')
plt.scatter(t[shape4], r[shape4], c=spectra_type[shape4], marker='s', label='Shape4')
plt.scatter(t[shape5], r[shape5], c=spectra_type[shape5], marker='*', label='Shape5')

first_legend = plt.legend(loc='upper center')
# make every shape handle black so the shape legend does not imply a colour
for handle in first_legend.legend_handles:
    handle.set_facecolor('black')


plt.gca().add_artist(first_legend)

red = plt.Circle((0, 0), 0.1, color='red')
blue = plt.Circle((0, 0), 0.1, color='blue')
yellow = plt.Circle((0, 0), 0.1, color='yellow')

plt.legend(handles=[red, blue, yellow], labels=['red', 'blue', 'yellow'], title='Colour')


# scatter = plt.scatter(t, r, c=spectra_num)
# plt.legend(handles=scatter.legend_elements()[0], labels=['A', 'B', 'C'], title='Colour')


I guess there is more than one way to do it, but most if not all of them cannot be achieved by superposing two different sets of scatter plots (the five plt.scatter calls plus the final scatter = ... call).
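
For what it is worth, here is a minimal sketch of one such way that keeps the numeric codes from the question. My understanding is that each plt.scatter call rescales its own c values unless told otherwise, so a group holding a single code ends up at the bottom of the colormap; passing one shared norm to every call should avoid that. The names codes, markers, norm and colour_handles are introduced here for illustration, and viridis is spelled out only because it is Matplotlib's default colormap.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize
from matplotlib.lines import Line2D

r = np.array([600, 2000, 980, 1770, 920, 1100, 220])
t = np.array([2.7, 12.67, 10.54, 1.3, 16.1, 0.92, 13.56])
spectra_type = np.array(['A', 'A', 'B', 'A', 'C', 'B', 'A'])
shape = np.array(['Shape1', 'Shape2', 'Shape3', 'Shape4', 'Shape5', 'Shape2', 'Shape4'])

codes = {'A': 0, 'B': 1, 'C': 2}
spectra_num = np.array([codes[s] for s in spectra_type])
markers = {'Shape1': 'D', 'Shape2': '^', 'Shape3': 'o', 'Shape4': 's', 'Shape5': '*'}

cmap = plt.get_cmap('viridis')
# One colour scale for every group, even groups that lack some codes.
norm = Normalize(vmin=0, vmax=len(codes) - 1)

fig, ax = plt.subplots()
ax.set_xlabel('x')
ax.set_ylabel('y')

for name, marker in markers.items():
    mask = shape == name  # boolean mask selecting one shape group
    ax.scatter(t[mask], r[mask], c=spectra_num[mask],
               cmap=cmap, norm=norm, marker=marker, label=name)

first_legend = ax.legend(loc='upper left', title='Shape')
ax.add_artist(first_legend)  # keep the shape legend when adding the second one

# Proxy handles for the colour legend, so no extra scatter is drawn on top.
colour_handles = [Line2D([], [], color=cmap(norm(code)), marker='o',
                         linestyle='none', label=letter)
                  for letter, code in codes.items()]
ax.legend(handles=colour_handles, loc='upper right', title='Colour')
# plt.show()  # uncomment when running interactively
```

The shape legend here takes each handle's colour from the first point of its group; setting the handles to black, as in the code above, would remove that ambiguity.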

Maybe someone more knowledgeable will step in

Upvotes: 0
