user2272972
user2272972

Reputation: 165

Plotting normal distribution density plot for hazard ratio in R

I'm trying to plot 2 normal distribution density plots for null and alternative hazard ratios of 1 and 0.65, respectively, to replicate an example (plot attached). Here's my code so far but it doesn't makes sense to me to have negative values for hazard ratios, but when I don't have negative values, the distributions are cut off. So I know I'm doing something wrong here. Thanks!

x <- seq(-2, 2, length.out = 100000)
df <- do.call(rbind,
  list(data.frame(x=x, y=dnorm(x, mean = log(1), sd = sqrt(1/60 + 1/60)), id="H0, HR = 1"),
       data.frame(x=x, y=dnorm(x, mean = log(0.65), sd = sqrt(1/60 + 1/60)), id="H1, HR = 0.65")))

vline <- 0.65
p1 <- ggplot(df, aes(x, y, group = id, color = id)) +
  geom_line() +
  geom_area(aes(fill = id),
            data = ~ subset(., (id == "H1, HR = 0.65" & x > (vline)) | (id == "H0, HR = 1" & x < (vline))),
            alpha = 0.3) +
  geom_vline(xintercept = vline, linetype = "dashed") +
  labs(x = "log(Hazard Ratio)", y = 'Density') + xlim(-2, 2) +
  guides(fill = "none", color = guide_legend(override.aes = list(fill = "white"))) +
  theme_classic() + 
  theme(legend.title=element_text(size=10), legend.position = c(0.8, 0.4),
        legend.text = element_text(size = 10), 
    axis.line.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  ) + 
  scale_color_manual(name = '', values = c('red', 'blue')) +
  scale_fill_manual(values = c('red', 'blue'))

The plot I'm trying to replicate

The plot I'm trying to replicate

Upvotes: 0

Views: 300

Answers (1)

Dion Groothof
Dion Groothof

Reputation: 1456

This gets reasonably close to the image that you have posted.

You should not use the log() of the means, but rather the means as is. Moreover if you use the normal distribution, you assume that parameters can take any value between -Inf and Inf, albeit with very small densities far from the mean. Therefore, you cannot expect all values to be positive. If you would like your values to be bounded by 0, then you should use a gamma distribution instead.

x <- seq(-2, 2, length.out = 1000)
df <- do.call(rbind,
              list(data.frame(x=x, y=dnorm(x, mean = 1, sd = sqrt(1/50)), id="H0, HR = 1"),
                   data.frame(x=x, y=dnorm(x, mean = 0.65, sd = sqrt1/50)), id="H1, HR = 0.65")))

vline <- 0.65

ggplot(df, aes(x, y, group = id, color = id)) +
  geom_line() +
  geom_area(aes(fill = id),
            data = ~ subset(., (id == "H1, HR = 0.65" & x > (vline)) | (id == "H0, HR = 1" & x < (vline))),
            alpha = 0.3) +
  geom_vline(xintercept = vline, linetype = "dashed") +
  labs(x = "log(Hazard Ratio)", y = 'Density') + xlim(-2, 2) +
  guides(fill = "none", color = guide_legend(override.aes = list(fill = "white"))) +
  theme_classic() + 
  theme(legend.title=element_text(size=10), legend.position = c(0.8, 0.4),
        legend.text = element_text(size = 10), 
        axis.line.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()
  ) + 
  scale_color_manual(name = '', values = c('red', 'blue')) +
  scale_fill_manual(values = c('red', 'blue')) +
  scale_x_continuous(breaks = seq(-0.3, 2.1, 0.3),
                     limits = c(-0.3, 2.1))

enter image description here

Upvotes: 2

Related Questions