Let's say I have the following dataframe:
X Y Category
1 2 A
5 3 B
-1 1 C
7 0 A
1 2 B
...
I want to find a way to color code the output of df['X'] and df['Y'] depending on their category (df['Category']).
I have tried this so far:
cm = pd.unique(df['Category'])
plt.scatter(data['X'], data['Y'], c=cm)
but it's telling me
c of shape (37,) not acceptable as a color sequence for x with size 67725, y with size 67725
Upvotes: 1
Reputation: 153460
You can reshape your dataframe and use pandas plot.
df.set_index(['X','Category'])['Y'].unstack().plot(marker='o',linestyle='none')
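To see what the reshape does before plotting, here is a minimal sketch using only the five sample rows from the question (the real frame is much larger). The names wide and wide_agg are just illustrative. Note the assumption in the second half: if the same (X, Category) pair occurs more than once, unstack() raises a ValueError, and pivot_table with an aggregate is one possible workaround, not necessarily the right one for your data.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

# One row per X value, one column per category; NaN where a category
# has no point at that X. Each column is then plotted as its own series.
wide = df.set_index(["X", "Category"])["Y"].unstack()
print(wide)

# Assumption: duplicate (X, Category) pairs are possible in the full data.
# unstack() cannot reshape duplicates, so aggregate them first (mean here).
wide_agg = df.pivot_table(index="X", columns="Category", values="Y", aggfunc="mean")
```

Calling wide.plot(marker='o', linestyle='none') on either frame then draws one colored series per category.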
Or you can use seaborn:
import seaborn as sns
_ = sns.pointplot(x='X',y='Y', hue='Category', data=df, linestyles='none')
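As for the error in the question: pd.unique gives one entry per distinct category (37 of them), while c needs one value per point (67725). If you want to stay with plain matplotlib, a rough sketch is to turn each row's category into an integer code and pass those codes as c. This again uses the sample rows from the question; the "tab10" colormap is my own choice and only has 10 distinct colors, so with 37 categories you would probably want a different one.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

# factorize() returns one integer code per row plus the distinct labels,
# so codes has the same length as df['X'] and df['Y'].
codes, labels = pd.factorize(df["Category"])

fig, ax = plt.subplots()
points = ax.scatter(df["X"], df["Y"], c=codes, cmap="tab10")

# num=None asks for one legend handle per distinct code, in code order,
# which matches the order of labels returned by factorize().
handles, _ = points.legend_elements(num=None)
ax.legend(handles, labels, title="Category")
```

Finish with plt.show() or fig.savefig(...) depending on where you run it.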
Upvotes: 1
Reputation: 76297
It is much simpler to do this using a higher-level library such as seaborn, specifically through seaborn.lmplot:
import seaborn as sns
sns.lmplot(x='X', y='Y', hue='Category', data=df)
and let it take care of the details.
See Plotting With Categorical Data for seaborn's other options for plotting categorical data.
Upvotes: 3