Reputation: 49
I am trying to create a scatter plot based on a dataframe with 3 columns: 'a', 'b', 'c'.
a | b | c
2 | 0.8 | k
3 | 0.4 | l
4 | 0.2 | k
I set the 'a' column as the x axis and the 'b' column as the y axis.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv(csv_file)
fig, ax = plt.subplots()
ax.scatter(df['a'], df['b'])
plt.show()
The 'c' column is a categorical column. I want to use it for the legend, so that each category is drawn in a different color.
How can I do that?
EDIT
I don't know the labels in the 'c' column in advance, or how many distinct labels there are.
Upvotes: 0
Views: 97
Reputation: 150735
If you are open to using another package, try seaborn:
import seaborn as sns
sns.scatterplot(data=df, x='a',y='b', hue='c')
Upvotes: 1
Reputation: 11
You can use the c parameter of scatter, like this:
ax.scatter(df['a'],df['b'],c=df['c'])
See the documentation for scatter (matplotlib.axes.Axes.scatter) for details.
Note that c expects numbers or valid color values, so the string labels in 'c' have to be converted first. According to this answer to another question, How to convert categorical data to numerical data?, you can use pd.factorize to create a column of int codes, one per category, like so: df['new_column'] = pd.factorize(df['some_column'])[0]
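Putting the two pieces together, here is a minimal sketch of how this could look with plain matplotlib, including a legend built from whatever labels happen to be in 'c'. The inline DataFrame is a stand-in for pd.read_csv(csv_file) using the rows from the question, and the names codes, labels, points and handles are only illustrative. It assumes legend_elements(num=None) returns one handle per distinct code in ascending order, which is how I understand the matplotlib docs.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for pd.read_csv(csv_file): the three rows from the question
df = pd.DataFrame({"a": [2, 3, 4], "b": [0.8, 0.4, 0.2], "c": ["k", "l", "k"]})

# codes: one integer per row; labels: the unique categories in order of first appearance
codes, labels = pd.factorize(df["c"])

fig, ax = plt.subplots()
# Integer codes are mapped to colors through the default colormap
points = ax.scatter(df["a"], df["b"], c=codes)

# num=None asks for one legend handle per distinct value in `codes`;
# handles come back in ascending code order, which matches the order of `labels`
handles, _ = points.legend_elements(num=None)
ax.legend(handles, list(labels), title="c")
```

Call plt.show() (or fig.savefig(...)) afterwards to see the result. Since the labels come from pd.factorize, you do not need to know them or their count in advance. With many categories the default colormap gives similar-looking neighbours, so you may want to pass a different cmap to scatter.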
Upvotes: 0