Reputation: 1780
I have a pandas DataFrame `df` in tidy format, like so:
date population country
Feb. 1 2000 99999 Canada
Feb. 1 2000 98765 Spain
Feb. 2 2000 99998 Canada
...
I would like to do a line plot using Bokeh, where each country gets its own line and color.
One way to do this seems to be to use the `legend` keyword to `line()` to give me a different line for each country:
source = ColumnDataSource(df)
plot = figure(...)
plot.line(x='date', y='population', source=source, legend='country')
Unfortunately, it seems like there's no straightforward way to select colors for each country this way.
And since there's a `multi_line()` plotting function, this seems like what I should be using. However, I don't know a simple way to do this. Something like the following can work:
plot.multi_line(xs=[df.loc[df['country']=='Canada', 'date'],
                    df.loc[df['country']=='Spain', 'date']],
                ys=[df.loc[df['country']=='Canada', 'population'],
                    df.loc[df['country']=='Spain', 'population']],
                line_color=['red', 'blue'])
This doesn't seem very elegant either, especially since I actually have many more countries than the two in my toy example above.
What's the proper way to do this with Bokeh?
Upvotes: 0
Views: 210
Reputation: 8297
Short and elegant:
from bokeh.palettes import Category10
from bokeh.plotting import figure

# One sub-frame per country; Category10 provides at most 10 colors
groups = df.groupby('country')
p = figure(x_axis_type = "datetime")
p.multi_line(xs = [group.date for name, group in groups],
             ys = [group.population for name, group in groups],
             line_color = Category10[10][0: len(groups)])
More elegant:
from bokeh.palettes import Category10
from bokeh.plotting import figure

groups = df.groupby('country')
data = {'date': [], 'population': [], 'legend': []}
# Collect one list of dates and one list of values per country
for name, group in groups:
    data['date'].append(group.date.tolist())
    data['population'].append(group.population.tolist())
    data['legend'].append(name)
data['color'] = Category10[10][0: len(groups)]
p = figure(x_axis_type = "datetime")
p.multi_line(xs = 'date',
             ys = 'population',
             line_color = 'color',
             legend = 'legend',  # on Bokeh 1.4 and later, use legend_field = 'legend' instead
             source = data)
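Since the question mentions many more countries than two, here is a rough sketch of the same idea wrapped in a reusable helper. The function names `build_multi_line_data` and `plot_countries`, the column keys `xs`, `ys`, `label` and `color`, and the palette cycling are my own additions for illustration, not part of Bokeh. It also assumes Bokeh 1.4 or later (for `legend_field`) and that the `date` column has already been parsed to datetimes.

```python
from itertools import cycle, islice

import pandas as pd


def build_multi_line_data(df, palette, group_col="country",
                          x_col="date", y_col="population"):
    """Collect one x-list and one y-list per group, plus a label and a color.

    Hypothetical helper: the dict keys are arbitrary names chosen here.
    """
    data = {"xs": [], "ys": [], "label": [], "color": []}
    for name, group in df.groupby(group_col, sort=True):
        data["xs"].append(group[x_col].tolist())
        data["ys"].append(group[y_col].tolist())
        data["label"].append(name)
    # Cycle through the palette so that having more groups than colors
    # does not fail (colors simply repeat in that case).
    data["color"] = list(islice(cycle(palette), len(data["label"])))
    return data


def plot_countries(df):
    """Draw one line per country (Bokeh is imported lazily; assumes >= 1.4)."""
    from bokeh.models import ColumnDataSource
    from bokeh.palettes import Category10
    from bokeh.plotting import figure

    data = build_multi_line_data(df, Category10[10])
    p = figure(x_axis_type="datetime")
    p.multi_line(xs="xs", ys="ys", line_color="color",
                 legend_field="label", source=ColumnDataSource(data))
    return p


# Toy data mirroring the question, with dates parsed for the datetime axis.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2000-02-01", "2000-02-01", "2000-02-02"]),
    "population": [99999, 98765, 99998],
    "country": ["Canada", "Spain", "Canada"],
})
toy_data = build_multi_line_data(toy, ["red", "blue"])
```

Calling `plot_countries(df)` and passing the result to `bokeh.io.show` should then give one colored line per country. Keeping the data preparation separate from the plotting call makes it easy to inspect the dict before handing it to Bokeh.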
Upvotes: 1