user4279562
user4279562

Reputation: 669

bokeh - plotting shapefile map using datashader

Initially, I created an interactive map of the UK Postcode area where an individual area is color represented based on its value (e.g. population in that post code area) as following.

from bokeh.plotting import figure
from bokeh.palettes import Viridis256 as palette
from bokeh.models import LinearColorMapper
from bokeh.models import ColumnDataSource
import geopandas as gpd

shp = 'file_path_to_the_downloaded_shapefile'
#read shape file into dataframe using geopandas
df = gpd.read_file(shp)

def expandMultiPolygons(row, geometry):
    if row[geometry].type = 'MultiPolygon':
       row[geometry] = [p for p in row[geometry]]
    return row
#Some rows were in MultiPolygons instead of Polygons.
#Expand MultiPolygons to multi rows of Polygons
df = df.apply(expandMultiPolygons, geometry='geometry', axis=1)
df = df.set_index('Area')['geometry'].apply(pd.Series).stack().reset_index()

#Visualize the polygons. To visualize different colors for different post areas, I added another column called 'value' which has some random integer value. 

p = figure()
color_mapper = LinearColorMapper(palette=palette)
source = ColumnDataSource(df)
p.patches('x', 'y', source=source,\
            fill_color={'field': 'value', 'transform': color_mapper},\
            fill_alpha=1.0, line_color="black", line_width=0.05)

where df is a dataframe of four columns : post code area, x-coordinate, y-coordinate, value (i.e. population).

The above code creates an interactive map on a web browser which is great but I noticed the interactivity is not very smooth in speed. If I zoom in or move the map, it renders slowly. The size of the dataframe is only 1106 rows, so I'm quite confused why it is so slow.

As one of the possible solutions, I came across with datashader (https://datashader.readthedocs.io/en/latest/) but I find the example script is quite complicated and most of them are with holoview package on Jupyter notebook but I want to create a dashboard using bokeh.

Does anyone advise me in incorporating datashader into the above bokeh script? Do I need a different function within datashader to create the shape map instead of using bokeh's patches function?

Any suggestion would be highly appreciated!!!

Upvotes: 1

Views: 2660

Answers (3)

user4279562
user4279562

Reputation: 669

I found a way to speed up the interactive visualization of the UK map as I move the slider.

I created individual image (in 2D) for a different value of slider first and updated the map using the 2D images instead of using bokeh's patches function.

Since the images are in array format, it is much faster to update the image while changing the values in the slider. one downside in this method is that I can no longer use hover function on the UK map.

I referred to the following url to convert polygon information into arrays: https://gist.github.com/brendancol/db030013e981c46acb2886060dde607e#file-rasterio_datashader_polygons-py-L35

Upvotes: 0

mc51
mc51

Reputation: 2287

As mentioned in my comment, I believe that the complexity of your polygons might cause your problem. The file you linked to contains several shapefile of different sizes and complexities. You can simplify those, i.e. reduce the number of points for each polygon. This can change how they look. It can range from almost no difference over a bit more "edginess" to an angular appearance. This depends on the level of simplification you chose. Depending on your needs you can chose different levels of simplicity.

I know of three easy options to get this done:

  1. GUI: Try QGis. It is a great opensource tool for geospatial data processing. Load your Shapefile as a new layer. Then use the "Simplify Geometries" tool under the Vector menu.
  2. Command-Line: GDAL is an open-source library. It comes with an useful command-line tool. You can use it like this: ogr2ogr outfile.shp infile.shp -simplify 0.000001
  3. Online: Visit mapshader. Import your file. Select simplify and chose your level. Then, export the result. What I really like here is that your file is rendered instantly. Hence, you can immediately see the result of your simplification.

Other than that, you should also update your bokeh version. It gets updated regularly and there have been some performance improvements since.

Using HoloViews or GeoViews will not positively affect your performance. Thus, it is not related to your issues. I guess @James A. Bednar was just giving some side advice there.

Upvotes: 1

James A. Bednar
James A. Bednar

Reputation: 3255

Without the data file involved, I can't answer your question directly, but can offer some observations:

  1. Datashader is unlikely to be of value for this purpose, because datashader does not currently have any support for rendering polygons. As a rule of thumb, Datashader is designed to aggregate your data, and if it's already aggregated, Datashader won't normally be of help. Here your data is aggregated by postcode, which datashader can't process, but if you had the original data per person it would be happy to render it.
  2. If you prefer working with Bokeh directly rather than via the higher-level HoloViews/GeoViews interface, I'd recommend folllwing Matt Rocklin's work on accelerating geopandas; his approach should be very fast for your purpose.
  3. All that said, HoloViews, and GeoViews should be a convenient way to work with Bokeh in general, whether or not you want to create a dashboard. E.g. the 2017 JupyterCon tutorial shows how to make a simple Bokeh dashboard using both libraries. It doesn't cover shape files, but those are covered in other GeoViews examples.

Upvotes: 6

Related Questions