Reputation: 309
I am plotting survival functions with the survival package. Everything works fine, but how do I know which curve is which? And how can I add it to a legend?
url <- "http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt"
Rossi <- read.table(url, header=TRUE)[,c(1:10)]
km <- survfit(Surv(week, arrest)~race, data=Rossi)
plot(km, lty=c(1 ,2))
Upvotes: 4
Views: 542
Reputation: 795
Thanks to the answer of Richard, I found a way to plot the right names to the right curves, with the base plot way that is used in plot.survfit
:
legend_values <- names(km$strata)
plot(km)
legend(
"topright",
legend=legend_values,
col=1:2,
lty = c(1,1),
horiz=FALSE,
bty='n')
I prefer the ggplot way of plotting, but I like to retain the stepwise presentation in plot.survfit
.
Upvotes: 1
Reputation: 1294
how do I know which curve is which?
Using str()
you can see which elements are in km
.
km$strata
shows there are 48 and 10 elements. This coincides with the declining pattern of the first 48 items and last 10 items in km$surv
km$surv[1:48]
km$surv[49:58]
So in addition to the hint on order in print()
, with this particular dataset we can also be sure that the first 48 elements belong to race=black
And how can I add it to a legend?
Unlike other model output km
is not easily transformed to a data.frame. However, we can extract the elements ourselves and create a data.frame and then plot it ourselves.
First we create a factor referring to the strata: 48 blacks and 10 others
race <- as.factor(c(rep("black", 48), rep("other", 10)))
df <- data.frame(surv = km$surv, race = race, time = km$time)
Next we can plot it as usual (in my case, using ggplot2).
library(ggplot2)
ggplot(data = df, aes(x = time, y = surv)) +
geom_point(aes(colour = race)) +
geom_line(aes(colour = race)) +
theme_bw()
Upvotes: 3
Reputation: 5335
Unfortunately, the plot.survival
function doesn't seem to offer a nice option for labeling the curves. The documentation says that curves are plotted in the order they appear in print
, so you can figure out which is which when you vary line type or color. But that's not great for sharing.
One alternative is to use the survplot
function from rms
, which does label the curves. Here's how that looks with your example and CI plotting turned off. (Note that survplot
won't take a survfit
object, so you've got to redo the estimation with a function whose results it can read -- here, npsurv
.)
library(rms)
survplot(npsurv(Surv(week, arrest)~race, data=Rossi), conf = "none")
Check the documentation for ways to tweak other aspects of the chart, including replacing the labels in the plot with a legend.
Upvotes: 2