user3447653

Reputation: 4158

Color by a column name using Plotly

I have a dataframe in the following format:

 id     distance    value    is_match
 1      234         0.8      True 
 2      314         0.5      False
 3      904         0.1      False
 4      123         0.4      False
 5      287         0.9      True 

I tried plotting it using plotly. The x axis should show "distance", the y axis should show "value", and the circles should be colored by "is_match". I used the code below:

import plotly.express as px
px.scatter(df, x='distance', y='value', color='is_match')

But this does not color-code the points based on the "is_match" column.

Any leads would be appreciated.

Upvotes: 0

Views: 539

Answers (1)

Rob Raymond

Reputation: 31226

  • This works fine for me. I generated a much larger dataset from your sample, as per your comments.
  • When the number of points reaches 10**5, the second trace (True) dominates because it is drawn above the first trace (False).
import io
import pandas as pd
import numpy as np
import plotly.express as px

# Rebuild the sample dataframe from the question (whitespace-separated columns)
df = pd.read_csv(io.StringIO("""id     distance    value    is_match
 1      234         0.8      True 
 2      314         0.5      False
 3      904         0.1      False
 4      123         0.4      False
 5      287         0.9      True """), sep=r"\s+")

# Generate a much larger random dataset spanning the same ranges as the sample
ROWS = 10**4
df = pd.DataFrame({"distance":np.random.randint(df["distance"].min(), df["distance"].max(), ROWS),
             "value":np.random.uniform(df["value"].min(), df["value"].max(), ROWS),
             "is_match":np.random.randint(0,2,ROWS).astype(bool)})

# Plotly Express draws one trace per distinct is_match value
px.scatter(df, x='distance', y='value', color='is_match')


Upvotes: 1
