Reputation: 6549
I want to prepare a bokeh plot that uses a ColumnDataSource. The pandas DataFrame that is the source of the data has one column and a datetime index:
How do I specify that the x value should be the index? I tried just omitting it, hoping that would be the default, but it did not work:
There is an ugly solution where I just copy the index as a column in the dataframe, but I hope there is a more elegant solution:
Upvotes: 12
Views: 16248
Reputation: 138
You can get the index from the DataFrame with the usual syntax and pass it directly as the x values:
p.line(x=df.index.values, y=df['values_for_y'])
Upvotes: 2
Reputation: 2137
The issue is that you have to specify which column should be the "x" column. If you don't specify the "x" value, the default behavior in bokeh.plotting is to try to find a column called "x" in your ColumnDataSource (which doesn't exist).
One tricky thing here is that you're using a named index ('timestamp') in pandas. That name is carried over when you create a ColumnDataSource, so your source probably looks like:
ds = ColumnDataSource(df)
print(ds.data)
# output (the ts_n values would be the actual timestamps from the df):
# {'timestamp': [ts_1, ts_2, ts_3, ts_4, ts_5], 'avg': [0.9, 0.8, 0.7, 0.8, 0.9]}
It would work if you use:
p.line(source=ds, x='timestamp', y='avg')
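To make the point about the index name concrete, here is a minimal sketch. The data, the index name 'timestamp', and the column name 'avg' are made up to mirror the printout above, since the asker's actual DataFrame is not shown. It assumes a reasonably recent bokeh, where an unnamed index ends up in the source under the generic name 'index'.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical data: one value column and a named datetime index.
index = pd.date_range("2015-08-18", periods=5, freq="D", name="timestamp")
df = pd.DataFrame({"avg": [0.9, 0.8, 0.7, 0.8, 0.9]}, index=index)

# The index is copied into the source under its own name.
ds = ColumnDataSource(df)
named_columns = sorted(ds.data.keys())

# Without a name, the index column falls back to the generic name "index".
unnamed_columns = sorted(ColumnDataSource(df.rename_axis(None)).data.keys())

# Refer to the index column by name; df.index.name avoids hard-coding it.
p = figure(x_axis_type="datetime")
renderer = p.line(x=df.index.name, y="avg", source=ds)
```

Printing `ds.data.keys()` before plotting is a quick way to check which name the index actually received.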
Upvotes: 12
Reputation: 856
I usually reset the index, which turns the index into a regular column, and then plot the columns by name. This is similar to your ugly solution.
df.reset_index(inplace=True)
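As a small illustration of what the reset does, here is a sketch on made-up data (the index name 'timestamp' and column 'avg' are assumptions, not the asker's real frame). It uses the non-inplace form so the original frame is left untouched.

```python
import pandas as pd

# Hypothetical frame with a named datetime index.
index = pd.date_range("2015-08-18", periods=3, freq="D", name="timestamp")
df = pd.DataFrame({"avg": [0.9, 0.8, 0.7]}, index=index)

# reset_index() returns a new frame in which the old index is an ordinary
# column, placed first; df itself keeps its datetime index.
flat = df.reset_index()
flat_columns = list(flat.columns)
original_columns = list(df.columns)
```

After this, `flat` can be handed to a ColumnDataSource and both 'timestamp' and 'avg' are available as plain column names.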
Alternatively, you could plot just the column with pandas' own plotting, which goes through matplotlib and uses the index as the x axis by default. It is not a bokeh plot, so I am not sure it will work for you, but it is worth a try.
df["avg"].plot()
Alternatively, you could try the time series plot approach, detailed in the question below:
TimeSeries in Bokeh using a dataframe with index
Upvotes: 5