user287474

Reputation: 325

Color Code Dataframe in Scatter Plot

Let's say I have the following dataframe:

X    Y     Category
1    2         A 
5    3         B 
-1   1         C 
7    0         A 
1    2         B 
...

I want to draw a scatter plot of df['X'] against df['Y'] in which each point is colored according to its category (df['Category']).

I have tried this so far:

cm = pd.unique(df['Category'])
plt.scatter(df['X'], df['Y'], c=cm)

but it's telling me

c of shape (37,) not acceptable as a color sequence for x with size 67725, y with size 67725

Upvotes: 1

Views: 80

Answers (2)

Scott Boston

Reputation: 153460

You can reshape your dataframe and use pandas plot:

df.set_index(['X','Category'])['Y'].unstack().plot(marker='o',linestyle='none')

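Note that the unstack approach may raise a ValueError about duplicate index entries if the same (X, Category) pair occurs more than once, which seems likely with tens of thousands of rows. A minimal sketch of an alternative, assuming plain matplotlib is acceptable, is to group by category and draw one scatter per group. The sample values are the ones from the question; the names fig, ax, name and group are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

fig, ax = plt.subplots()
# One scatter call per category: each call takes the next color from
# matplotlib's color cycle, and the label feeds the legend.
for name, group in df.groupby("Category"):
    ax.scatter(group["X"], group["Y"], label=name)
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.legend(title="Category")
```

This does not care about duplicate rows, at the cost of one scatter call per category.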

Or you can use seaborn:

import seaborn as sns
_ = sns.pointplot(x='X',y='Y', hue='Category', data=df, linestyles='none')


Upvotes: 1

Ami Tavory

Reputation: 76297

It is much simpler to do this using a higher-level library such as seaborn, specifically through seaborn.lmplot:

import seaborn as sns

sns.lmplot(x='X', y='Y', hue='Category', data=df)

and let it take care of the details.

See the seaborn tutorial Plotting With Categorical Data for its other options for plotting categorical data.
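As for why the original attempt fails: pd.unique(df['Category']) returns one entry per distinct category (37 in the question), while plt.scatter expects c to hold one entry per point (67725). A rough sketch of the matplotlib-only fix, assuming you want to stay with plt.scatter, is to turn each row's category into an integer code with pd.factorize and pass those codes as c. The colormap choice ("tab10") and the variable names below are assumptions for illustration, not part of the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "X": [1, 5, -1, 7, 1],
    "Y": [2, 3, 1, 0, 2],
    "Category": ["A", "B", "C", "A", "B"],
})

# factorize returns one integer code per row plus the distinct labels,
# in order of first appearance.
codes, categories = pd.factorize(df["Category"])

fig, ax = plt.subplots()
# c now has the same length as X and Y, which is what scatter requires.
points = ax.scatter(df["X"], df["Y"], c=codes, cmap="tab10")

# Build one legend handle per distinct code and relabel with the category names.
handles, _ = points.legend_elements(num=None)
ax.legend(handles, list(categories), title="Category")
```

With as many as 37 categories, a ten-color map like "tab10" cannot give every category its own color, so a larger colormap would probably be a better fit there.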

Upvotes: 3
