Reputation: 2656
This is partially two questions:
Some types of data, e.g. BMI scores, have a natural midpoint. Matplotlib has several diverging colormaps. I want the center of the colormap, i.e. the "middle" of the spectrum, to sit on the "ideal" BMI score, independent of the distribution of BMI scores being plotted.
The BMI class thresholds are: bmi_thresholds = [16, 17, 18.5, 25, 30, 35].
In the code below I make a scatter plot of 300 random BMI values, with height on the x-axis and weight on the y-axis, as shown in the image below it.
In the first image, I used np.digitize(bmi, bmi_thresholds) as the c parameter of the ax.scatter() call, but then the values in the colorbar also fall in range(7), whereas I want the colorbar ticks to be BMI scores (approx. 15-40). (bmi is the array of 300 random BMI scores corresponding to x and y.)
The BMI thresholds are not evenly spaced, so the distance between digitized class indexes, e.g. between 2 and 3, will not be represented correctly if I merely change the tick labels in the colorbar.
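To make the digitize problem concrete, here is a minimal sketch. The sample values in sample_bmi are made up for illustration and are not part of my data.

```python
import numpy as np

# BMI class thresholds from the question
bmi_thresholds = np.array([16, 17, 18.5, 25, 30, 35])

# Illustrative sample values (assumption, not the real data)
sample_bmi = np.array([15, 16.5, 22, 27, 40])

# np.digitize returns the index of the class each value falls into,
# so the colorbar would show evenly spaced class indexes 0..6
classes = np.digitize(sample_bmi, bmi_thresholds)
print(classes)  # [0 1 3 4 6]

# The classes themselves are not equally wide (1, 1.5, 6.5, 5 and 5 BMI units),
# which is the information lost when plotting class indexes
class_widths = np.diff(bmi_thresholds)
print(class_widths)
```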
The second image, produced by the code shown below, does not seem to be centered correctly at the "ideal" BMI score of 22. I tried the technique from "Make a scatter colorbar display only a subset of the vmin/vmax" to adjust the color range in the colorbar, but it doesn't work as I expected.
Further, I think I could emphasize the "center", i.e. the "ideal" scores, by "squeezing" the colors: setting low and high in cmap(np.linspace(low, high, 7)) to values outside [0, 1], e.g. [-0.5, 1.5], but then I have even more trouble centering the colorbar.
What am I doing wrong, and how can I achieve this?
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import matplotlib as mpl
np.random.seed(4242)
# Define BMI class thresholds
bmi_thresholds = np.array([16, 17, 18.5, 25, 30, 35])
# Range to sample BMIs from
max_bmi = max(bmi_thresholds)*0.9
min_bmi = min(bmi_thresholds)*0.3
# Convert meters into centimeters along x-axis
@mpl.ticker.FuncFormatter
def m_to_cm(m, pos):
    return f'{int(m*100)}'
# Number of samples
n = 300
# Heights in range 0.50 to 2.20 meters
x = np.linspace(0.5, 2.2, n)
# Random BMI values in range [min_bmi, max_bmi]
bmi = np.random.rand(n)*(max_bmi-min_bmi) + min_bmi
# Compute corresponding weights
y = bmi * x**2
# Prepare plot with labels, etc.
fig, ax = plt.subplots(figsize=(10,6))
ax.set_title(f'Random BMI values. $n={n}$')
ax.set_ylabel('Weight in kg')
ax.set_xlabel('Height in cm')
ax.xaxis.set_major_formatter(m_to_cm)
ax.set_ylim(min(y)*0.95, max(y)*1.05)
ax.set_xlim(min(x), max(x))
# plot bmi class regions (i.e. the "background")
for i in range(len(bmi_thresholds)+1):
    area_min = bmi_thresholds[i-1] if i > 0 else 0
    area_max = bmi_thresholds[i] if i < len(bmi_thresholds) else 10000#np.inf
    area_color = 'g' if i == 3 else 'y' if i in [2,4] else 'orange' if i in [1,5] else 'r'
    ax.fill_between(x, area_min * x**2, area_max * x**2, color=area_color, alpha=0.2, interpolate=True)
# Plot lines to emphasize regions, and additional bmi score lines (i.e. 10 and 40)
common_plot_kwargs = dict(alpha=0.8, linewidth=0.5)
for t in np.concatenate((bmi_thresholds, [10, 40])):
    style = 'g-' if t in [18.5, 25] else 'r-' if t in [10,40] else 'k-'
    ax.plot(x, t * x**2, style, **common_plot_kwargs)
# Compute offset from target_center to median of data range
target_center = 22
mid_bmi = np.median(bmi)
s = max(bmi) - min(bmi)
d = target_center - mid_bmi
# Normalize the offset to the range [0, 1]
high = 1 if d < 0 else (s-d)/s
low = 0 if d >= 0 else -d/s
# Use the normalized offset to create a custom cmap centered around the ideal BMI?
cmap = plt.get_cmap('PuOr')
colors = cmap(np.linspace(low, high, 7))
cmap = mpl.colors.LinearSegmentedColormap.from_list('my cmap', colors)
# plot random BMIs
c = np.digitize(bmi, bmi_thresholds)
sax = ax.scatter(x, y, s=15, marker='.', c=bmi, cmap=cmap)
cbar = fig.colorbar(sax, ticks=np.concatenate((bmi_thresholds, [22, 10, 40])))
plt.tight_layout()
Upvotes: 4
Views: 4850
Reputation: 5728
You can use matplotlib's built-in normalization class that does the same thing: matplotlib.colors.TwoSlopeNorm
See: https://matplotlib.org/3.2.2/gallery/userdemo/colormap_normalizations_diverging.html
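A minimal sketch of how this could be applied to the BMI scatter plot, assuming matplotlib 3.2 or newer. The variable names height, weight and center_position are my own, and the random data only mimics the range used in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import TwoSlopeNorm

# Random data in roughly the same range as the question (assumption)
rng = np.random.default_rng(4242)
n = 300
height = np.linspace(0.5, 2.2, n)   # meters
bmi = rng.uniform(4.8, 31.5, n)     # 16*0.3 .. 35*0.9
weight = bmi * height**2            # kg

# vcenter pins the middle of the diverging colormap to the "ideal" BMI;
# vmin and vmax must satisfy vmin < vcenter < vmax
norm = TwoSlopeNorm(vcenter=22, vmin=bmi.min(), vmax=bmi.max())

fig, ax = plt.subplots(figsize=(10, 6))
sax = ax.scatter(height, weight, s=15, marker='.', c=bmi, cmap='PuOr', norm=norm)
fig.colorbar(sax, ticks=[10, 16, 17, 18.5, 22, 25, 30])

# The ideal BMI maps to the middle of the colormap regardless of the data
center_position = float(norm(22))
print(center_position)  # 0.5
```

Because the norm handles the centering, the colormap itself can stay unmodified and the colorbar ticks remain in BMI units.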
Upvotes: 5
Reputation: 365
I found a decent solution here:
http://chris35wills.github.io/matplotlib_diverging_colorbar/
They created a normalization class using this code:
import numpy as np
from matplotlib import colors

class MidpointNormalize(colors.Normalize):
    def __init__(self, vmin=None, vmax=None, midpoint=None, clip=False):
        self.midpoint = midpoint
        colors.Normalize.__init__(self, vmin, vmax, clip)

    def __call__(self, value, clip=None):
        # I'm ignoring masked values and all kinds of edge cases to make a
        # simple example...
        # Map vmin -> 0, midpoint -> 0.5, vmax -> 1 with linear interpolation
        x, y = [self.vmin, self.midpoint, self.vmax], [0, 0.5, 1]
        return np.ma.masked_array(np.interp(value, x, y), np.isnan(value))
The class is used by doing something like this:
# ras (the raster array), cmap and elev_min are assumed to be defined earlier
elev_max = 3000
mid_val = 0
plt.imshow(ras, cmap=cmap, clim=(elev_min, elev_max),
           norm=MidpointNormalize(midpoint=mid_val, vmin=elev_min, vmax=elev_max))
plt.colorbar()
plt.show()
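The heart of that class is a piecewise-linear mapping done with np.interp. As a rough sketch of what it computes, here it is applied to BMI-like values. The helper name midpoint_scale and the sample numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def midpoint_scale(values, vmin, midpoint, vmax):
    """Hypothetical helper: map vmin -> 0, midpoint -> 0.5, vmax -> 1."""
    return np.interp(values, [vmin, midpoint, vmax], [0.0, 0.5, 1.0])

# Illustrative BMI values with the midpoint pinned at 22
scaled = midpoint_scale(np.array([10, 16, 22, 31, 40]), vmin=10, midpoint=22, vmax=40)
print(scaled)  # 0, 0.25, 0.5, 0.75 and 1: each side of 22 is stretched separately
```

The two halves use different slopes, which is why the midpoint stays at the center of the colormap even when the data range is not symmetric around it.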
Upvotes: 4