Reputation: 61
I am new to geopandas and I am having trouble creating choropleth subplots with consistent bins. I need to create a consistent user defined color scheme across all subplots.
I have followed these two examples: the question "matplotlib geopandas plot chloropleth with set bins for colorscheme" and the GitHub issue https://github.com/geopandas/geopandas/issues/1019
While I am able to reproduce both examples, I get very strange behavior with my own data. Below is a toy example that replicates my problem.
import geopandas as gpd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mapclassify import Quantiles, UserDefined
import os
# Note you can read directly from the URL
gdf = gpd.read_file('https://opendata.arcgis.com/datasets/8d3a9e6e7bd445e2bdcc26cdf007eac7_4.geojson')
#gdf.plot()
gdf.shape
gdf.columns
gdf['rgn15nm'].head(9)
d = {
'rgn15nm': ['North East', 'North West', 'Yorkshire and The Humber', 'East Midlands', 'West Midlands', 'East of England', 'London', 'South East', 'South West'],
'1980' : pd.Series([0, 1, 0, 0, 0, 0, 0, 0, 0]),
'2000' : pd.Series([1, 1, 1, 0, 0, 0, 0, 0, 0]),
'2020' : pd.Series([1, 1, 10, 3, 1, 0, 0, 0, 1])
}
df = pd.DataFrame(d)
The data looks like this:
gdf = gdf.merge(df, on='rgn15nm')
# Define bins
gdf['2020'].describe()
bins= UserDefined(gdf['2020'], bins=[0,1,2,3,4,5,6,7,8,9,10]).bins
bins
# create a new column with the discretized values and plot that col
# repeat for each view
fig,(ax1,ax2,ax3) = plt.subplots(1,3,figsize=(15,6))
gdf.assign(cl=UserDefined(gdf['1980'].dropna(), bins).yb).plot(column='cl', ax=ax1, cmap='OrRd', legend = True )
gdf.assign(cl=UserDefined(gdf['2000'].dropna(), bins).yb).plot(column='cl', ax=ax2, cmap='OrRd', legend = True)
gdf.assign(cl=UserDefined(gdf['2020'].dropna(), list(bins)).yb).plot(column='cl', ax=ax3, cmap='OrRd', legend = True)
for ax in (ax1, ax2, ax3):
    ax.axis('off')
Clearly, the color scheme is not the same across subplots. 'North West' (the only region highlighted in the 1980 subplot) has the same value of 1 in 1980, 2000 and 2020, yet it is drawn in a different color in each subplot. I want 'North West' to have the same color (the one it has in the 2020 subplot) across all 3 subplots.
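As a sanity check that the binning itself is consistent, here is a minimal sketch that classifies the toy values against the same bin edges for each year. It uses numpy.digitize with right=True as a stand-in for mapclassify's UserDefined, on the assumption that the bins are right-closed upper bounds; the names `years`, `edges` and `classes` are just for illustration.

```python
import numpy as np

# Toy values from the example above; index 1 is 'North West'.
years = {
    "1980": np.array([0, 1, 0, 0, 0, 0, 0, 0, 0]),
    "2000": np.array([1, 1, 1, 0, 0, 0, 0, 0, 0]),
    "2020": np.array([1, 1, 10, 3, 1, 0, 0, 0, 1]),
}

# Same user-defined upper bounds for every year.
edges = np.arange(0, 11)

# right=True puts x in class i when edges[i-1] < x <= edges[i]
# (assumed to mirror how UserDefined treats its bins).
classes = {year: np.digitize(vals, edges, right=True) for year, vals in years.items()}

for year, cls in classes.items():
    print(year, "North West class:", cls[1])
```

If the class index is the same in every year, the inconsistency has to come from how the classes are mapped to colors, not from the bins.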
I also tried this:
fig,(ax1,ax2,ax3) = plt.subplots(1,3,figsize=(15,6))
ax1.set_title('1980')
ax2.set_title('2000')
ax3.set_title('2020')
gdf.plot(column='1980', ax=ax1, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
gdf.plot(column='2000', ax=ax2, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
gdf.plot(column='2020', ax=ax3, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
for ax in (ax1, ax2, ax3):
    ax.axis('off')
But I got exactly the same figure as with the first attempt.
Does anyone have any insight? I want a consistent color scheme across all 3 subplots.
Upvotes: 2
Views: 1464
Reputation: 61
Ultimately, the solution was to pass the "norm" option so that every subplot shares the same color normalization, following this example: "Geopandas userdefined color scheme drops colors". See below:
from matplotlib.colors import Normalize
bins= UserDefined(gdf['2020'], bins=[0,1,2,3,4,5,6,7,8,9,10]).bins
bins
fig,(ax1,ax2,ax3) = plt.subplots(1,3,figsize=(15,6))
ax1.set_title('1980')
ax2.set_title('2000')
ax3.set_title('2020')
gdf.plot(column='1980', ax=ax1, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, norm=Normalize(0, len(bins)))
gdf.plot(column='2000', ax=ax2, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, norm=Normalize(0, len(bins)))
gdf.plot(column='2020', ax=ax3, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, norm=Normalize(0, len(bins)))
for ax in (ax1, ax2, ax3):
    ax.axis('off')
The result is what I wanted:
Or, as suggested by Paul H, set vmin and vmax explicitly:
fig,(ax1,ax2,ax3) = plt.subplots(1,3,figsize=(15,6))
ax1.set_title('1980')
ax2.set_title('2000')
ax3.set_title('2020')
gdf.plot(column='1980', ax=ax1, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, vmin = 0, vmax = 10)
gdf.plot(column='2000', ax=ax2, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, vmin = 0, vmax = 10)
gdf.plot(column='2020', ax=ax3, cmap='OrRd', scheme='userdefined', classification_kwds={'bins':bins}, vmin = 0, vmax = 10)
for ax in (ax1, ax2, ax3):
    ax.axis('off')
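For intuition on why a shared norm (or fixed vmin/vmax) helps, here is a simplified sketch using only matplotlib's Normalize. It is not a reproduction of what geopandas does internally, which I have not verified; it only shows the general effect of letting each array autoscale versus forcing one scale. The helper `colormap_position` and the dictionaries are hypothetical names for illustration.

```python
import numpy as np
from matplotlib.colors import Normalize

# Toy values from the question; index 1 is 'North West' (value 1 in every year).
years = {
    "1980": np.array([0, 1, 0, 0, 0, 0, 0, 0, 0], dtype=float),
    "2000": np.array([1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float),
    "2020": np.array([1, 1, 10, 3, 1, 0, 0, 0, 1], dtype=float),
}

def colormap_position(values, norm, index=1):
    """Return where values[index] lands on the 0..1 colormap axis under norm."""
    return float(norm(values)[index])

# A fresh Normalize() with no limits autoscales to each array's own min/max.
auto = {y: colormap_position(v, Normalize()) for y, v in years.items()}

# A fixed norm uses the same 0..10 scale for every array.
shared = {y: colormap_position(v, Normalize(vmin=0, vmax=10)) for y, v in years.items()}

print("autoscaled:", auto)
print("shared:    ", shared)
```

With autoscaling, the same value lands at a different position on the colormap depending on what else is in the column; with the fixed norm it lands at the same position in every year, which is what keeps the color constant across subplots.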
Upvotes: 4