Reputation: 666
I'm just wondering about putting log_scale=True inside sns.histplot(data, log_scale=True).
Does this mean that the data get converted through a log function, or do they remain the same and is it just a matter of how they are plotted?
Many thanks,
James
Upvotes: 2
Views: 3903
Reputation: 80339
The data is left as-is, but the bin edges for the histogram are calculated so that they are evenly spaced when drawn on a log-scaled x-axis. Also, the x-axis is automatically set to a log scale.
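To make that distinction concrete, here is a minimal sketch in plain Python (no seaborn involved). The sample values and the helper log_spaced_edges are made up for illustration and are not part of any library. It builds edges that are evenly spaced in log10 and then bins the raw, untransformed values with them:

```python
import math

# Hypothetical sample values spanning several orders of magnitude.
values = [3, 12, 45, 150, 700, 2500, 9000]


def log_spaced_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that are evenly spaced in log10 (illustrative helper)."""
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / n_bins
    return [10 ** (log_lo + i * step) for i in range(n_bins + 1)]


edges = log_spaced_edges(1, 10_000, 4)

# In data units the bins get wider, but the ratio between neighbouring
# edges is constant, which is what "evenly spaced on a log axis" means.
ratios = [right / left for left, right in zip(edges, edges[1:])]

# The raw values are counted as they are; only the edges are log-spaced.
counts = [sum(left <= v < right for v in values)
          for left, right in zip(edges, edges[1:])]

print(edges)
print(ratios)
print(counts)
```

The values list is never transformed; only the positions of the bin edges change.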
Here is some code, comparing log_scale=True with how it could be simulated by a separate calculation of bin edges. The code for the left plot is a simplification of this example.
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
import numpy as np
import seaborn as sns

sns.set_theme(style="ticks")
diamonds = sns.load_dataset("diamonds")

fig, axs = plt.subplots(ncols=2, figsize=(12, 5))
sns.despine(fig)

# Left: let seaborn compute log-spaced bins and set the log x-axis
sns.histplot(diamonds, x="price",
             edgecolor=".3", linewidth=.5, log_scale=True, ax=axs[0])
axs[0].set_title("setting 'log_scale=True'")

# Right: pass explicit log-spaced bin edges and set the x-axis scale manually
sns.histplot(diamonds, x="price",
             bins=np.logspace(np.log10(diamonds["price"].min()),
                              np.log10(diamonds["price"].max()), 46),
             edgecolor=".3", linewidth=.5, log_scale=False, ax=axs[1])
axs[1].set_xscale('log')
axs[1].set_title("mimicking 'log_scale=True'")

# Show plain numbers instead of powers of ten on both x-axes
for ax in axs:
    ax.xaxis.set_major_formatter(ScalarFormatter())
    ax.set_xticks([500, 1000, 2000, 5000, 10000])
plt.show()
PS: internally, seaborn uses numpy's histogram_bin_edges to calculate the bins, something like:
bins=np.power(10, np.histogram_bin_edges(np.log10(diamonds["price"]), bins='auto'))
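As a rough, self-contained check of that one-liner, the sketch below applies the same idea to synthetic data. The lognormal sample is an assumption standing in for diamonds["price"], and this only illustrates the approach; seaborn's actual internals may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical positive, right-skewed data standing in for diamonds["price"].
prices = rng.lognormal(mean=8.0, sigma=1.0, size=1000)

# Choose the edges on the log10-transformed values ...
log_edges = np.histogram_bin_edges(np.log10(prices), bins="auto")
# ... then map them back to data units.
edges = np.power(10, log_edges)

# Bin the original, untransformed values with those edges.
counts, _ = np.histogram(prices, bins=edges)

# The edges are equally spaced in log10, so every bin has the same
# width on a log-scaled axis even though the widths differ in data units.
log_widths = np.diff(np.log10(edges))
print(np.allclose(log_widths, log_widths[0]))
print(len(edges), counts.sum())
```

Because the edges are converted back with a power of ten, a value sitting exactly at the minimum or maximum could in principle fall just outside the outer edges due to floating-point rounding, so the total count is printed rather than assumed.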
Upvotes: 4