El_que_no_duda

Reputation: 417

Plot dataframe using markers based on another dataframe

I am trying to plot df2 as a regular line plot, using the values from df1 to select the marker for each point.

import pandas as pd

DATAdict = {
    'A': [ "foo",         "miau",  "ciao" ],
    'B': [ "miau",         "haha",  "ciao"],
    }

df1 = pd.DataFrame.from_dict(DATAdict, orient='columns')

DATAdict = {
    'A': [ 0.79, 0.86, 0.88, ],
    'B': [ 0.89, 0.89, 0.90, ],
    }

df2 = pd.DataFrame.from_dict(DATAdict, orient='columns')

# Collect every distinct label in df1 and assign each one a marker
unique = pd.unique(df1.values.ravel('K'))

marker_list = ["s", "o", "+", "x", "P"]
markers = dict(zip(unique, marker_list))

The idea is to plot column A using range(0, 3) for the x axis and 0.79, 0.86, etc. for the y axis, and then select the marker of each point based on foo, miau and ciao.

Upvotes: 1

Views: 518

Answers (2)

Mr. T

Reputation: 12410

As far as I know, matplotlib cannot vary the marker within a single plot or scatter call, so I don't see a way around plotting the scatter markers separately:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np


DATAdict = {
    'A': [ "foo",         "miau",  "ciao" ],
    'B': [ "miau",         "haha",  "ciao"],
    }

df1 = pd.DataFrame.from_dict(DATAdict, orient='columns')

DATAdict = {
    'A': [ 0.79, 0.86, 0.88, ],
    'B': [ 0.89, 0.89, 0.90, ],
    }

df2 = pd.DataFrame.from_dict(DATAdict, orient='columns')


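marker_list = ["s", "o", "+", "x", "P"]
# Map each distinct label in df1 to a marker
marker_dict = dict(zip(pd.unique(df1.values.ravel()), marker_list))

# Draw the lines first, then add one scatter point per cell
ax = df2.plot()
row, col = df1.shape
for x, y, m, c in zip(np.repeat(np.arange(row), col), df2.values.flat, df1.values.flat, np.tile(np.arange(col), row)):
    # At the first column of each row, restart the color cycle so markers match their line colors
    if not c:
        ax.set_prop_cycle(None)
    ax.scatter(x, y, marker=marker_dict[m])
plt.show()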

[Image: sample output]
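
A possible variation on the loop above, offered as a sketch rather than a definitive version: group the rows of each column by label, so scatter is called once per distinct marker in a column instead of once per point, and take the marker color from the line itself instead of resetting the color cycle. The Agg backend and the output filename markers.png are my own assumptions for running this as a script.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df1 = pd.DataFrame({"A": ["foo", "miau", "ciao"], "B": ["miau", "haha", "ciao"]})
df2 = pd.DataFrame({"A": [0.79, 0.86, 0.88], "B": [0.89, 0.89, 0.90]})

marker_list = ["s", "o", "+", "x", "P"]
# Map each distinct label in df1 to a marker (row-major order of first appearance)
marker_dict = dict(zip(pd.unique(df1.values.ravel()), marker_list))

fig, ax = plt.subplots()
for col in df2.columns:
    # Draw the connecting line first and reuse its color for the markers
    (line,) = ax.plot(df2.index, df2[col], label=col)
    # groups maps each label to the row positions where it occurs in this column
    for label, idx in df1.groupby(col).groups.items():
        ax.scatter(idx, df2.loc[idx, col], marker=marker_dict[label],
                   color=line.get_color(), zorder=3)
ax.legend()
fig.savefig("markers.png")  # hypothetical output filename
```

With the sample data every label occurs once per column, so this saves nothing here; the grouping only pays off when labels repeat within a column.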

Upvotes: 0

tdy

Reputation: 41327

The simplest way is to use sns.scatterplot with the style param:

style : vector or key in data

Grouping variable that will produce points with different markers. Can have a numeric dtype but will always be treated as categorical.

I don't know of a way to connect the points with sns.scatterplot itself, but we can overlay the line using the returned Axes handle:

import seaborn as sns
ax = sns.scatterplot(data=df2.reset_index(), x='index', y='A', style=df1['A'], s=100)
ax.plot(df2.index, df2['A'])

Note that seaborn chooses the style markers automatically. You can set them manually with the markers param, but the list cannot mix filled and unfilled markers:

marker_list = ['s', 'o', 'P', 'X'] # all "filled" type
ax = sns.scatterplot(data=df2.reset_index(), x='index', y='A', style=df1['A'], markers=marker_list, s=100)
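
If you are not sure which markers count as filled, matplotlib can tell you through MarkerStyle.is_filled(). A small sketch (the helper name split_by_fill is made up for illustration) that partitions the question's marker list, plus 'X', so that only one kind is passed to markers:

```python
from matplotlib.markers import MarkerStyle

def split_by_fill(markers):
    """Partition marker codes into (filled, unfilled) lists."""
    filled = [m for m in markers if MarkerStyle(m).is_filled()]
    unfilled = [m for m in markers if not MarkerStyle(m).is_filled()]
    return filled, unfilled

filled, unfilled = split_by_fill(["s", "o", "+", "x", "P", "X"])
print(filled)    # ['s', 'o', 'P', 'X']
print(unfilled)  # ['+', 'x']
```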

Upvotes: 2
