waterproofpatch

Reputation: 126

Label survival plot lines using ggplot

I'm trying to label the lines on the output of an autoplot generated from a survfit object. I've been experimenting with the directlabels package without success. The issue seems to be that, when autoplot is given a survfit object rather than the data itself inside a ggplot call, the geom_* functions don't have access to the underlying data and cannot find the dataset's variables.

The autoplot routine I'm using now is:

autoplot(survfit(Surv(time, status) ~ sex, data = lung), fun = 'event')

This generates a plot like:

[image: the resulting plot]

What I would like to do is move the "strata" legend from the right side onto the lines themselves (just above each line, at either the left or the right end; the exact position isn't important to me).

I do not wish to label each individual point, just to label each line locally.

Upvotes: 1

Views: 457

Answers (1)

leomfn

Reputation: 165

Using ggplot2 and ggrepel (plus dplyr to summarise the data), here's what you could do:

Adding Labels to each line

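library(survival)   # survfit(), Surv(), lung
library(ggfortify)  # autoplot() for survfit objects
library(dplyr)      # %>%, group_by(), summarise()
library(ggrepel)    # geom_label_repel()

autoplot(survfit(Surv(time, status) ~ sex, data = lung), fun = 'event', legendLabs = FALSE) +
  geom_label_repel(data = . %>% group_by(strata) %>% summarise(x = mean(time), y = mean(surv)),
                   aes(x = x, y = y, label = strata, color = strata)) +
  theme(legend.position = 'none')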

Because the result of autoplot can be handled the same way as a ggplot object, you can add a text label layer to it, which is where ggrepel comes in handy: it tries to optimize the positions of the added text/labels so that they don't overlap.

Because you don't want to add a label to every single data point, I changed the data used by geom_label_repel with dplyr's summarise, so that only two rows are left, one for each value of strata, with two additional columns (x and y) that specify the label's position based on the respective mean values.

Also, since I think it is not necessary anymore, I removed the legend.
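To see what the label layer actually receives, the summarised data can be inspected on its own. This is only a minimal sketch: it assumes that ggfortify's fortify(), called with the same fun = 'event', gives the same data frame that autoplot builds internally, and label_data is just an illustrative name.

```r
library(survival)
library(ggfortify)
library(dplyr)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per stratum: x and y are the mean time and the mean transformed
# survival value, i.e. the positions handed to geom_label_repel
# (assumption: fortify() mirrors the data autoplot uses for the plot)
label_data <- fortify(fit, fun = 'event') %>%
  group_by(strata) %>%
  summarise(x = mean(time), y = mean(surv))

print(label_data)
```

Keep in mind that a point built from two means will not necessarily lie exactly on the curve; it just gives ggrepel a reasonable starting position near each line.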

[image: result plot]

Custom label text

If you want to customize the label text, e.g. because the legend title is now gone and you want to add that information back, you can do so by adding another column to the data used by geom_label_repel. Here's an example:

autoplot(survfit(Surv(time, status) ~ sex, data = lung), fun = 'event', legendLabs = FALSE) +
  geom_label_repel(data = . %>%
                     group_by(strata) %>%
                     summarise(x = mean(time), y = mean(surv)) %>%
                     mutate(label = paste('strata =', strata)),
                   aes(x = x, y = y, label = label, color = strata)) +
  theme(legend.position = 'none')

[image: example plot with custom labels]

Upvotes: 1
