Lars Kotthoff

Reputation: 109242

log-transformed density function not plotting correctly

I'm trying to log-transform the x axis of a density plot and get unexpected results. The code without the transformation works fine:

library(ggplot2)
data = data.frame(x=c(1,2,10,11,1000))

dens = density(data$x)
densy = sapply(data$x, function(x) { dens$y[findInterval(x, dens$x)] })

ggplot(data, aes(x = x)) +
    geom_density() +
    geom_point(y = densy)


If I add scale_x_log10(), I get the following result:


Apart from the y values having been rescaled, something seems to have happened to the x values as well -- the peaks of the density function are not quite where the points are.

Am I using the log transformation incorrectly here?

Upvotes: 1

Views: 1219

Answers (1)

Rorschach

Reputation: 32436

The shape of the density curve changes after the transformation because the density is now estimated from the log-transformed values: their distribution is different, and so is the bandwidth chosen for them. If you set a large bandwidth in each case (say bw=1000 before the transformation and bw=10 afterward), you will get two normal-looking densities, with different y-axis values because the support is much larger in the first case. Here is an example showing how varying the bandwidth changes the shape of the density.

data = data.frame(x=c(1,2,10,11,1000), y=0)

## Examine how changing bandwidth changes the shape of the curve
par(mfrow=c(2,1))
greys <- colorRampPalette(c("black", "red"))(10)
plot(density(data$x), main="No Transform")
points(data, pch=19)
plot(density(log10(data$x)), ylim=c(0,2), main="Log-transform w/ varying bw")
points(log10(data$x), data$y, pch=19)
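## Overlay one curve per bandwidth, from 0.02 to 0.2
for (i in 1:10)
    lines(density(log10(data$x), bw=0.02*i), col=greys[i])
## lty=1 so the legend matches the solid curves drawn above
legend("topright", paste(0.02*1:10), col=greys, lty=1, cex=0.8)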
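
To make the bandwidth difference concrete, here is a small sketch comparing the default bandwidth on each scale. It assumes the default bandwidth rule of density(), which is bw.nrd0; the variable names are only for illustration.

```r
x <- c(1, 2, 10, 11, 1000)

# Default bandwidth rule used by density() (bw = "nrd0"), on each scale
bw_raw <- bw.nrd0(x)
bw_log <- bw.nrd0(log10(x))

# density() stores the bandwidth it actually used in $bw
dens_raw <- density(x)
dens_log <- density(log10(x))

print(c(raw = bw_raw, log10 = bw_log))
print(c(raw = dens_raw$bw, log10 = dens_log$bw))
```

The two bandwidths are on different scales, so a curve computed on the raw values cannot simply be redrawn on a log axis and be expected to match a curve estimated from the logged values.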

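
For the ggplot2 version in the question, note that scale_x_log10() transforms the data before the density statistic is computed, so geom_density() estimates the density of log10(x). A minimal sketch, assuming the default bandwidth on both sides, that computes the point heights on the same log scale (the densy column and the use of approx() for interpolation are my own choices, not part of the original code):

```r
library(ggplot2)

data <- data.frame(x = c(1, 2, 10, 11, 1000))

# Estimate the density on the log10 scale, which is what geom_density()
# sees once scale_x_log10() has transformed x
dens_log <- density(log10(data$x))

# Interpolate the curve height at each log-transformed data point
data$densy <- approx(dens_log$x, dens_log$y, xout = log10(data$x))$y

p <- ggplot(data, aes(x = x)) +
    geom_density() +
    geom_point(aes(y = densy)) +
    scale_x_log10()
print(p)
```

With the heights taken from the log-scale estimate, the points should sit on or very close to the plotted curve instead of being offset from its peaks.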

Upvotes: 2
