Felix
Felix

Reputation: 309

Which curve is which in Survival Function plot?

I am plotting survival functions with the survival package. Everything works fine, but how do I know which curve is which? And how can I add it to a legend?

  url <- "http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt"
  Rossi <- read.table(url, header=TRUE)[,c(1:10)]
  km <- survfit(Surv(week, arrest)~race, data=Rossi)
  plot(km, lty=c(1 ,2))

Upvotes: 4

Views: 542

Answers (3)

MartineJ
MartineJ

Reputation: 795

Thanks to the answer of Richard, I found a way to plot the right names to the right curves, with the base plot way that is used in plot.survfit:

legend_values <- names(km$strata)
plot(km)
legend(
  "topright",
  legend=legend_values,
  col=1:2,
  lty = c(1,1),
  horiz=FALSE,
  bty='n')

I prefer the ggplot way of plotting, but I like to retain the stepwise presentation in plot.survfit.

Upvotes: 1

Richard
Richard

Reputation: 1294

how do I know which curve is which?

Using str() you can see which elements are in km. km$strata shows there are 48 and 10 elements. This coincides with the declining pattern of the first 48 items and last 10 items in km$surv

km$surv[1:48]
km$surv[49:58]

So in addition to the hint on order in print(), with this particular dataset we can also be sure that the first 48 elements belong to race=black

And how can I add it to a legend?

Unlike other model output km is not easily transformed to a data.frame. However, we can extract the elements ourselves and create a data.frame and then plot it ourselves.

First we create a factor referring to the strata: 48 blacks and 10 others

race <- as.factor(c(rep("black", 48), rep("other", 10)))
df <- data.frame(surv = km$surv, race = race, time = km$time)

Next we can plot it as usual (in my case, using ggplot2).

library(ggplot2)
ggplot(data = df, aes(x = time, y = surv)) + 
    geom_point(aes(colour = race)) + 
    geom_line(aes(colour = race)) +
    theme_bw()

survival by race

Upvotes: 3

ulfelder
ulfelder

Reputation: 5335

Unfortunately, the plot.survival function doesn't seem to offer a nice option for labeling the curves. The documentation says that curves are plotted in the order they appear in print, so you can figure out which is which when you vary line type or color. But that's not great for sharing.

One alternative is to use the survplot function from rms, which does label the curves. Here's how that looks with your example and CI plotting turned off. (Note that survplot won't take a survfit object, so you've got to redo the estimation with a function whose results it can read -- here, npsurv.)

library(rms)
survplot(npsurv(Surv(week, arrest)~race, data=Rossi), conf = "none")

enter image description here

Check the documentation for ways to tweak other aspects of the chart, including replacing the labels in the plot with a legend.

Upvotes: 2

Related Questions