FatHippo

Reputation: 197

How do I label a specific point in a scatter plot with a unique ID?

I am creating an interactive graph for a layout that looks a lot like this:

Sample layout

Each point has a unique ID and is usually part of a group. Each group has its own color, so I use multiple scatter plots to create the entire layout. I need the following to occur when I click on a single point:

  1. On mouse click, retrieve the ID of the selected point.
  2. Plug the ID into a black box function that returns a list of nearby* IDs.
  3. Highlight the points of the IDs in the returned list.

    *It is possible for some of the IDs to be from different groups/plots.

How do I:

  1. Associate each point with an ID and return the ID when the point is clicked?
  2. Highlight other points in the layout when all I know is their IDs?
  3. Re-position individual points while maintaining their respective groups, i.e. swap positions with points that belong to different groups/plots?

I used pyqtgraph before switching over to matplotlib, so I first thought of creating a dictionary of IDs and their point objects. After experimenting with pick_event, it seems to me that the concept of point objects does not exist in matplotlib. From what I've learned so far, each point is represented by an index, and only its PathCollection can return information about it, e.g. its coordinates. I also learned that the color of a specific point is modified through its PathCollection, whereas in pyqtgraph I can do it through a point object, e.g. point.setBrush('#000000').

Upvotes: 1

Views: 1952

Answers (1)

ImportanceOfBeingErnest

Reputation: 339102

I am still convinced that using a single scatter plot would be the much better option. There is nothing in the question that would contradict that.

You can merge all your data into a single DataFrame with the columns group, id, x, y and color. The part of the code below labeled "create some dataset" creates such a DataFrame:

   group    id  x  y       color
0      1  AEBB  0  0   palegreen
1      3  DCEB  1  0        plum
2      0  EBCC  2  0  sandybrown
3      0  BEBE  3  0  sandybrown
4      3  BEBB  4  0        plum

Note that each group has its own color. One can then create a scatter from it, using the colors from the color column.

A pick event is registered as in this previous question. Once a point that is not already black is clicked, the id of the selected point is read from the DataFrame. From that id, other ids are generated via the "blackbox function", and for each id obtained that way the index of the corresponding point in the DataFrame is determined. Because there is a single scatter, this index is directly the index of the point in the scatter (PathCollection), so we can paint it black. Clicking a point that is already black restores its original color.
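The id-to-index lookup described above can be sketched in isolation. This is a minimal sketch, not part of the full code: the helper name `indices_for_ids` and the five-row table are made up for illustration, and it assumes the DataFrame rows are in the same order as the points passed to the scatter.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the table shown above.
df = pd.DataFrame({
    "id": ["AEBB", "DCEB", "EBCC", "BEBE", "BEBB"],
    "group": [1, 3, 0, 0, 3],
    "x": [0, 1, 2, 3, 4],
    "y": [0, 0, 0, 0, 0],
})


def indices_for_ids(table, ids):
    """Return the positional indices of all rows whose id is in `ids`.

    Ids that do not occur in the table are silently ignored.
    """
    mask = table["id"].isin(ids).to_numpy()
    return np.flatnonzero(mask)


# "ZZZZ" does not exist in the table, so only two positions come back.
print(indices_for_ids(df, ["EBCC", "BEBB", "ZZZZ"]))
```

Because the result is positional, it can be used directly to index the color or offset arrays of the single PathCollection.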

import numpy as np; np.random.seed(1)
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors

### create some dataset
x,y = np.meshgrid(np.arange(20), np.arange(20))
group = np.random.randint(0,4,size=20*20)
l = np.array(np.meshgrid(list("ABCDE"),list("ABCDE"),
                     list("ABCDE"),list("ABCDE"))).T.reshape(-1,4)
ide = np.random.choice(list(map("".join, l)), size=20*20, replace=False)
df = pd.DataFrame({"id" : ide, "group" : group ,
                   "x"  : x.flatten(), "y"  : y.flatten() }) 
colors = ["sandybrown", "palegreen", "paleturquoise", "plum"]
# map each group number to its color name
df["color"] = df["group"].map(dict(zip(range(4), colors)))
print(df.head())

### plot a single scatter plot from the table above
fig, ax = plt.subplots()
scatter = ax.scatter(df.x,df.y, facecolors=df.color, s=64, picker=4)


def getOtherIDsfromID(ID):
    """ blackbox function: create a list of other IDs from one ID """
    l = [np.random.permutation(list(ID)) for i in range(5)]
    return list(set(map("".join, l)))


def select_point(event):
    if event.mouseevent.button == 1:
        facecolor = scatter._facecolors[event.ind,:]
        if (facecolor == np.array([[0, 0, 0, 1]])).all():
            c = df.color.values[event.ind][0]
            c = matplotlib.colors.to_rgba(c)
            scatter._facecolors[event.ind,:] = c
        else:
            ID = df.id.values[event.ind][0]
            oIDs = getOtherIDsfromID(ID)
            # for each ID obtained, make the respective point black.
            rows = df.loc[df.id.isin([ID] + oIDs)]
            for i, row in rows.iterrows():
                scatter._facecolors[i,:] = (0, 0, 0, 1)
            tx = "You selected id {}.\n".format(ID)
            tx += "Points with other ids {} will be affected as well"
            tx = tx.format(oIDs)
            print(tx)

        fig.canvas.draw_idle()

fig.canvas.mpl_connect('pick_event', select_point)

plt.show()
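The code above writes to the private attribute `scatter._facecolors`. If you would rather stay with public accessors, the same recoloring can be sketched with `get_facecolor` and `set_facecolor`. The helper name `paint` and the three-point scatter are assumptions for illustration only, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np

fig, ax = plt.subplots()
base_colors = ["sandybrown", "palegreen", "plum"]
scatter = ax.scatter([0, 1, 2], [0, 0, 0], c=base_colors, s=64)


def paint(collection, indices, color):
    """Set the face color of the points at `indices` through the public API."""
    # Work on a copy of the (N, 4) RGBA array, then hand it back to the collection.
    rgba = np.array(collection.get_facecolor(), copy=True)
    rgba[indices] = mcolors.to_rgba(color)
    collection.set_facecolor(rgba)


paint(scatter, [0, 2], "black")
```

In an interactive session, call `fig.canvas.draw_idle()` afterwards so the change becomes visible.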

In the image below, the point with id DAEE has been clicked on, and the other ids ['EDEA', 'DEEA', 'EDAE', 'DEAE'] have been returned by the blackbox function. Not all of those ids exist in the dataset, so only the two other points whose ids do exist are colored as well.

(Image: the scatter plot after clicking the point with id DAEE; the selected points are drawn in black.)
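The third point of the question, re-positioning, is not covered by the code above. A minimal sketch of one way to do it with a single scatter is to swap the coordinates in the DataFrame and pass the updated positions to `set_offsets`. The helper name `swap_positions` and the small table are made up for illustration, and the sketch assumes the DataFrame has a default integer index that matches the point order in the scatter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical miniature dataset with the same columns as above.
df = pd.DataFrame({
    "id": ["AEBB", "DCEB", "EBCC"],
    "x": [0, 1, 2],
    "y": [0, 0, 0],
    "color": ["palegreen", "plum", "sandybrown"],
})

fig, ax = plt.subplots()
scatter = ax.scatter(df.x, df.y, c=df.color.tolist(), s=64)


def swap_positions(table, collection, id_a, id_b):
    """Swap the coordinates of two points identified by their ids.

    Ids, groups and colors stay attached to their rows; only x and y move.
    """
    i = table.index[table["id"] == id_a][0]
    j = table.index[table["id"] == id_b][0]
    # Read rows in the order (j, i) and write them back in the order (i, j).
    table.loc[[i, j], ["x", "y"]] = table.loc[[j, i], ["x", "y"]].to_numpy()
    # Hand the updated (N, 2) coordinate array to the PathCollection.
    collection.set_offsets(table[["x", "y"]].to_numpy())


swap_positions(df, scatter, "AEBB", "EBCC")
```

Since each point keeps its row, and therefore its color, the group membership is preserved while the positions change. In an interactive session, follow the call with `fig.canvas.draw_idle()`.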

Upvotes: 1
