Reputation: 8412
I have some lognormal data that I want to transform, then fit a normal distribution to. Here is a reproducible example with data that's not exactly lognormal, but it's close enough:
# create some lognormal data
dat <- data.frame(x = c(runif(1000, 0, 1), runif(300, 1, 100), runif(100,100,1000)))
I can transform these data by log10
and fit a normal distribution to them with MASS::fitdistr
.
# fit normal distribution to log-transformed values
library(MASS)
fit <- fitdistr(log10(dat$x), densfun = "normal")
fit$estimate # mean and SD of the normal distribution
Now I want to plot the data, and the distribution on top of it. I transform my data by log10, and plot the normal distribution with stat_function
, but it's not fitting the data.
# plot data and distribution
ggplot(data = dat) +
geom_histogram(mapping = aes(x = log10(x))) +
stat_function(fun = dnorm, args = list(mean = fit$estimate[1], sd = fit$estimate[2], log = TRUE))
Any pointers, as well as verification if I'm going about this correctly, would be very helpful.
Lastly to get my x axis in displaying units of 10, 100, 1000, etc... should I use scale_x_log10()
? What's a simple way to format the x axis?
Upvotes: 1
Views: 2952
Reputation: 24262
When you want to draw on the same graph a histogram and a density distribution you need to plot a density histogram using the aesthetic y=..density..
Here is an example. To make things clear, I generated data from a lognormal distribution.
set.seed(123)
# Generate data from a log-normal distribution
dat <- data.frame(x=rlnorm(10000))
# Fit normal distribution to log-transformed values
library(MASS)
fit <- fitdistr(log10(dat$x), densfun = "normal")
# Plot density histogram and fitted distribution
ggplot(data = dat) +
geom_histogram(mapping = aes(x = log10(x), y = ..density..), col="white") +
stat_function(fun = dnorm,
args = list(mean = fit$estimate[1], sd = fit$estimate[2], log = F),
color="red", lwd=1)
Upvotes: 3