Maverick

Reputation: 799

Bokeh multiline plot

I am trying to plot RPI, CPI and CPIH on one chart with a HoverTool showing the value of each when you pan over a given area of the chart.

I initially tried adding each line separately using line(), which partly worked.

However, the HoverTool only works correctly when you hover over the individual lines.

I have tried using multi_line() like:

combined_inflation_metrics = 'combined_inflation_metrics.csv'
df_combined_inflation_metrics = pd.read_csv(combined_inflation_metrics)
combined_source = ColumnDataSource(df_combined_inflation_metrics)


l.multi_line(xs=['Date','Date','Date'],ys=['RPI', 'CPI', 'CPIH'], source=combined_source)
#l.multi_line(xs=[['Date'],['Date'],['Date']],ys=[['RPI'], ['CPI'], ['CPIH']], source=combined_source)

show(l)

However, this is throwing the following:

RuntimeError: 
Supplying a user-defined data source AND iterable values to glyph methods is
not possibe. Either:

Pass all data directly as literals:

    p.circe(x=a_list, y=an_array, ...)

Or, put all data in a ColumnDataSource and pass column names:

    source = ColumnDataSource(data=dict(x=a_list, y=an_array))
    p.circe(x='x', y='y', source=source, ...)

But I am not too sure why this is?

Update:

I figured out a workaround by adding all of the values to each of the data sources. It works, but it doesn't feel efficient, and I would still like to know how to do this properly.

Edit - Code request:

from bokeh.plotting import figure, output_file, show
from bokeh.models import NumeralTickFormatter, DatetimeTickFormatter, ColumnDataSource, HoverTool, CrosshairTool, SaveTool, PanTool
import pandas as pd
import os
os.chdir(r'path')

#output_file('Inflation.html', title='Inflation')

RPI = 'RPI.csv'
CPI = 'CPI.csv'
CPIH = 'CPIH.csv'

df_RPI = pd.read_csv(RPI)
df_CPI = pd.read_csv(CPI)
df_CPIH = pd.read_csv(CPIH)

def to_date_time(data_frame, data_series):
    data_frame[data_series] = data_frame[data_series].astype('datetime64[ns]')

to_date_time(df_RPI, 'Date')
to_date_time(df_CPI, 'Date')
to_date_time(df_CPIH, 'Date')

RPI_source = ColumnDataSource(df_RPI)
CPI_source = ColumnDataSource(df_CPI)
CPIH_source = ColumnDataSource(df_CPIH)

l = figure(title="Historic Inflation Metrics", logo=None)
l.plot_width = 1200


l.xaxis[0].formatter=DatetimeTickFormatter(
        days=["%d %B %Y"],
        months=["%d %B %Y"],
        years=["%d %B %Y"],
    )


glyph_1 = l.line('Date','RPI',source=RPI_source, legend='TYPE', color='red')
glyph_2 = l.line('Date','CPI',source=CPI_source, legend='TYPE', color='blue')
glyph_3 = l.line('Date','CPIH',source=CPIH_source, legend='TYPE', color='gold')


hover = HoverTool(renderers=[glyph_1],
                 tooltips=[     ("Date","@Date{%F}"),
                                ("RPI","@RPI"),
                                ("CPI","@CPI"),
                                ("CPIH","@CPIH")],
                          formatters={"Date": "datetime"},
                      mode='vline'
                 )
l.tools = [SaveTool(), PanTool(), hover, CrosshairTool()]

show(l)

Upvotes: 1

Views: 6128

Answers (1)

syntonym

Reputation: 7384

The hover tool looks up the data to show in the ColumnDataSource of the renderer it is attached to. Because you created a separate ColumnDataSource for each line and restricted the hover tool to glyph_1, it can only look up data in that glyph's data source, which contains RPI but not CPI or CPIH.

The general solution is to only create one ColumnDataSource and reuse that in each line:

df_RPI = pd.read_csv(RPI)
df_CPI = pd.read_csv(CPI)
df_CPIH = pd.read_csv(CPIH)

# Join the three frames on the shared 'Date' column
df = df_RPI.merge(df_CPI, on="Date")
df = df.merge(df_CPIH, on="Date")

source = ColumnDataSource(df)

l = figure(title="Historic Inflation Metrics", logo=None)

glyph_1 = l.line('Date','RPI',source=source, legend='RPI', color='red')
l.line('Date','CPI',source=source, legend='CPI', color='blue')
l.line('Date','CPIH',source=source, legend='CPIH', color='gold')

hover = HoverTool(renderers=[glyph_1],
                 tooltips=[     ("Date","@Date{%F}"),
                                ("RPI","@RPI"),
                                ("CPI","@CPI"),
                                ("CPIH","@CPIH")],
                          formatters={"Date": "datetime"},
                      mode='vline'
                 )

show(l)

This is of course only possible if all your dataframes can be merged into one, i.e. the measurement timepoints are the same. If they are not, I do not know of a good method to do what you want other than resampling/interpolating.
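If the dates do not line up, the interpolation route could look roughly like the sketch below. It is only an illustration under assumptions: the small inline dataframes are hypothetical stand-ins for the CSV files, and time-weighted linear interpolation is just one possible choice of how to fill the gaps.

```python
import pandas as pd

# Hypothetical sample data: the two series are sampled on different dates.
df_RPI = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-03-01"]),
    "RPI": [4.0, 3.0],
})
df_CPI = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-02-01", "2018-03-01"]),
    "CPI": [3.0, 2.7, 2.5],
})

# An outer merge keeps every date from both frames; missing values become NaN.
df = df_RPI.merge(df_CPI, on="Date", how="outer").sort_values("Date")

# Index by date so the interpolation is weighted by elapsed time,
# then fill the gaps and turn 'Date' back into a regular column.
df = df.set_index("Date").interpolate(method="time").reset_index()
```

The resulting df has one row per date with every column filled, so it can be passed to ColumnDataSource(df) in the same way as the merged frame above. Keep in mind that the filled-in values are estimates, not published figures, so the tooltip would show them without distinguishing them from real measurements.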

Upvotes: 3
