James Arten

Reputation: 666

log_scale = True in seaborn.histplot()

I'm just wondering about putting log_scale=True inside

sns.histplot(data, log_scale=True)

Does this mean that the data get transformed by a log function, or do they remain the same and it is just a matter of how they are plotted?

Many thanks,

James

Upvotes: 2

Views: 3903

Answers (1)

JohanC

Reputation: 80339

The data is left as-is, but the bin edges for the histogram are calculated so that they are evenly spaced when drawn on a log-scale x-axis. The x-axis is also automatically set to log scale.
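To make this concrete without any plotting, here is a minimal numpy-only sketch. The sample is synthetic (a hypothetical lognormal draw, not data from the question) and the choice of 21 edges is arbitrary. It only illustrates the idea: the values are never transformed, while the bin edges are evenly spaced in log10 and therefore grow wider in the original units.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical positive, right-skewed sample standing in for real data
data = rng.lognormal(mean=7.0, sigma=1.0, size=1000)
original = data.copy()

# 21 bin edges (20 bins), evenly spaced in log10 but expressed in the original units
edges = np.logspace(np.log10(data.min()), np.log10(data.max()), 21)
counts, _ = np.histogram(data, bins=edges)

# Bin widths as seen on a log axis versus on a linear axis
log_widths = np.diff(np.log10(edges))
linear_widths = np.diff(edges)

print("data unchanged:", np.array_equal(data, original))
print("equal widths on a log axis:", np.allclose(log_widths, log_widths[0]))
print("first and last width in data units:", linear_widths[0], linear_widths[-1])
```

The counts are computed on the raw values; only the placement of the edges (and later the axis scale) reflects the logarithm.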

Here is some code, comparing log_scale=True with how it could be simulated by a separate calculation of bin edges. The code for the left plot is a simplification of this example.

import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
import numpy as np
import seaborn as sns

sns.set_theme(style="ticks")
diamonds = sns.load_dataset("diamonds")

fig, axs = plt.subplots(ncols=2, figsize=(12, 5))
sns.despine(fig)

sns.histplot(diamonds, x="price",
             edgecolor=".3", linewidth=.5, log_scale=True, ax=axs[0])
axs[0].set_title("setting 'log_scale=True'")

# 46 edges (45 bins), evenly spaced in log10 between the smallest and largest price
sns.histplot(diamonds, x="price",
             bins=np.logspace(np.log10(diamonds["price"].min()), np.log10(diamonds["price"].max()), 46),
             edgecolor=".3", linewidth=.5, log_scale=False, ax=axs[1])
axs[1].set_xscale('log')
axs[1].set_title("mimicking 'log_scale=True'")

for ax in axs:
    ax.xaxis.set_major_formatter(ScalarFormatter())
    ax.set_xticks([500, 1000, 2000, 5000, 10000])

plt.show()

[Figure: sns.histplot with log_scale=True]

PS: internally, seaborn uses numpy's histogram_bin_edges to calculate the bins, something like:

bins=np.power(10, np.histogram_bin_edges(np.log10(diamonds["price"]), bins='auto'))
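As a rough check of that one-liner, the sketch below applies the same recipe to a synthetic sample (a hypothetical stand-in for diamonds["price"], so it runs without downloading the dataset). It does not claim to reproduce seaborn's exact internals; it only shows that edges computed in log10 space and mapped back with a power of 10 have a constant ratio between neighbours, which is what "evenly spaced on a log axis" means.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stand-in for diamonds["price"]: positive and right-skewed
prices = rng.lognormal(mean=8.0, sigma=0.8, size=5000)

# Let numpy choose the edges on the log10-transformed values ...
log_edges = np.histogram_bin_edges(np.log10(prices), bins="auto")
# ... then map the edges (not the data) back to the original units
edges = np.power(10, log_edges)

# Neighbouring edges differ by a constant factor rather than a constant offset
ratios = edges[1:] / edges[:-1]
print("number of bins:", len(edges) - 1)
print("constant ratio between edges:", np.allclose(ratios, ratios[0]))
```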

Upvotes: 4
