Reputation: 2031
I have calculated document distances, and am using MDS in sklearn to plot them with matplotlib. I want to plot them with seaborn (pairplot) but don't know how to translate the MDS data so that it is readable by seaborn.
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

# dist is the precomputed document-distance matrix; labels holds one name per document
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=1)
pos = mds.fit_transform(dist)
xs, ys = pos[:, 0], pos[:, 1]
names = list(labels)

# Define the plot
for x, y, name in zip(xs, ys, names):
    plt.scatter(x, y, color=color)  # color is not defined in this snippet
    plt.text(x, y, name)
plt.show()
Upvotes: 3
Views: 4029
Reputation: 944
As a complement to Diziet Asahi's response, here is a minimal example that creates a DataFrame and plots it:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
data = {'col1': [1, 1, 1, 1, 1, 1, 12, 3, 4, 5], 'col2': [1, 1, 1, 1, 1, 1, 12, 3, 4, 5]}
df = pd.DataFrame(data)
sns.violinplot(data=df, palette="Pastel1")
plt.show()
Result of this code: [image: violin plot of col1 and col2]
There are other ways to build a pandas DataFrame as well.
Upvotes: 1
Reputation: 40697
As stated in the documentation for pairplot(), this function expects a long-form dataframe where each column is a variable and each row is an observation.
The easiest approach is to use pandas to construct this dataframe (pairplot() checks that its data argument is a pandas DataFrame, so a bare numpy array has to be wrapped first).
A long-form dataframe has as many rows as there are observations, and each column is a variable. The power of seaborn is that it can use categorical columns to split the dataframe into different groups.
In your case the dataframe would probably look like:
          X         Y     label
0  0.094060  0.484758  Label_00
1  0.375537  0.150206  Label_00
2  0.215755  0.796629  Label_02
3  0.204077  0.921016  Label_01
4  0.673787  0.884718  Label_01
5  0.854112  0.044506  Label_00
6  0.225218  0.552961  Label_00
7  0.668262  0.482514  Label_00
8  0.935415  0.100438  Label_00
9  0.697016  0.633550  Label_01
(...)
You would then pass it to pairplot like so:
sns.pairplot(data=df, hue='label')
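For completeness, here is a rough sketch of how the MDS output from the question could be turned into such a dataframe. The helper name mds_to_frame and the stand-in coordinates are made up for illustration; in the question, pos would come from mds.fit_transform(dist) and labels from the asker's own data.

```python
import numpy as np
import pandas as pd


def mds_to_frame(pos, labels):
    """Wrap an (n_samples, 2) coordinate array and its labels in a DataFrame.

    Hypothetical helper, not part of seaborn or sklearn.
    """
    pos = np.asarray(pos)
    df = pd.DataFrame(pos, columns=["X", "Y"])  # one column per MDS component
    df["label"] = list(labels)                  # categorical column for hue
    return df


# Stand-in values; replace with pos = mds.fit_transform(dist) and your labels
pos = np.array([[0.09, 0.48], [0.38, 0.15], [0.22, 0.80], [0.20, 0.92]])
labels = ["Label_00", "Label_00", "Label_02", "Label_01"]

df = mds_to_frame(pos, labels)
print(df)
```

The resulting df has the X, Y and label columns shown above, so it should be usable directly in the pairplot call with hue='label'.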
Upvotes: 2