ron kolel

Reputation: 49

matplotlib legend based on a DataFrame column

I am trying to create a scatter plot from a DataFrame with 3 columns: 'a', 'b', 'c'.

  a  |  b  |  c
  2  | 0.8 |  k
  3  | 0.4 |  l
  4  | 0.2 |  k

I put the 'a' column on the x axis and the 'b' column on the y axis.

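import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv(csv_file)  # csv_file: path to the CSV holding the table above
fig, ax = plt.subplots()
ax.scatter(df['a'], df['b'])
plt.show()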

The 'c' column is categorical. I want to use it for the legend, so that each category is drawn in a different color.

How can I do that?

EDIT

I don't know in advance which labels appear in the 'c' column, or how many there are.

Upvotes: 0

Views: 97

Answers (2)

Quang Hoang

Reputation: 150735

If you are open to using another package, try seaborn:

import seaborn as sns
sns.scatterplot(data=df, x='a',y='b', hue='c')


Upvotes: 1

Bagutreko

Reputation: 11

You can use the c parameter of scatter, like this:

ax.scatter(df['a'],df['b'],c=df['c'])

Here is the documentation for scatter:

Note that c expects numbers or valid color values, so a column of arbitrary category strings has to be converted first. According to this answer to another question, "How to convert categorical data to numerical data?", you can use pd.factorize to create an integer column with one code per category:

df['new_column'] = pd.factorize(df['some_column'])[0]
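Putting the two pieces together, here is a minimal sketch of how the factorized codes could drive both the colors and the legend without knowing the categories in advance. The inline DataFrame just mirrors the sample table from the question, the "tab10" colormap is an arbitrary choice, and the sketch assumes 'c' has no missing values (pd.factorize codes those as -1, which would throw the legend labels off). legend_elements needs matplotlib 3.1 or newer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question; in practice use pd.read_csv(csv_file).
df = pd.DataFrame({"a": [2, 3, 4], "b": [0.8, 0.4, 0.2], "c": ["k", "l", "k"]})

# codes: one integer per row; labels: the distinct categories,
# in order of first appearance (so labels[i] belongs to code i).
codes, labels = pd.factorize(df["c"])

fig, ax = plt.subplots()
points = ax.scatter(df["a"], df["b"], c=codes, cmap="tab10")

# num=None asks for one legend handle per distinct code, in ascending
# order, which lines up with the order of `labels`.
handles, _ = points.legend_elements(num=None)
ax.legend(handles, list(labels), title="c")

# plt.show()  # uncomment to display the figure
```

The number of categories is never hard-coded here, so the same code should work for any CSV, although "tab10" only has ten distinct colors and a larger colormap would be needed beyond that.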

Upvotes: 0
