Reputation: 55
I am trying to plot a cumulative hazard function and see how well it fits the cumulative hazard function from a Weibull distribution.
I am doing it in the following way:
library(ggplot2)
df1 <- structure(list(time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), cumhaz = c(
0.0012987012987013,
0.00259909141573641, 0.00390287238053432, 0.00521006192301798,
0.00652412236979854, 0.00916613029582231, 0.0158150664660351,
0.0184996302244243, 0.019847339119303, 0.0225647304236509
)), row.names = c(
NA,
-10L
), class = "data.frame")
df2 <- structure(list(
time = c(
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
), variable = c(
"est",
"est", "est", "est", "est", "est", "est", "est", "est", "est",
"lcl", "lcl", "lcl", "lcl", "lcl", "lcl", "lcl", "lcl", "lcl",
"lcl", "ucl", "ucl", "ucl", "ucl", "ucl", "ucl", "ucl", "ucl",
"ucl", "ucl"
), value = c(
0.000427087666907353, 0.00125579463203928,
0.00236002165980674, 0.00369249753634486, 0.00522530742007584,
0.0069393306375308, 0.00882036474985765, 0.010857299201679, 0.0130411216860664,
0.0153643233399894, 0.000132730646554643, 0.00047934870180344,
0.00101968629026559, 0.00175966372305041, 0.00268278540068723,
0.00376105113380831, 0.00500451835889961, 0.00634199908365718,
0.0079694006901553, 0.00969173651303582, 0.00113691538883517,
0.00283227928015465, 0.00480474736249053, 0.00698677110701666,
0.00936461891781656, 0.0119398743413438, 0.0146624641508895,
0.0175805674751187, 0.0205366926372124, 0.0235955294708706
),
type = c(
"Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval",
"Confidence Interval", "Confidence Interval", "Confidence Interval"
)
), row.names = c(NA, -30L), class = "data.frame")
ggplot() +
geom_step(ggplot2::aes(x = time, y = cumhaz), df1, group = 1, colour = "#4C5D8A") +
geom_line(ggplot2::aes(x = time, y = value, group = variable, linetype = type), df2,
colour = "#F3C911", show.legend = FALSE) +
theme_minimal() +
labs(
x = "Time", y = "Cumulative Hazard",
title = "Weibull Distribution"
) +
scale_x_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 2)) +
theme(
legend.position = "bottom",
legend.direction = "horizontal",
plot.title = element_text(hjust = 0.5)
) +
scale_linetype_manual(values = c(2, 1))
This results in a plot similar to this:
I want to create a legend like this:
Essentially, I want to combine the color and linetype scales, but not every combination should be displayed in the legend (no purple legend key for the confidence limits).
Upvotes: 2
Views: 142
Reputation: 1268
Here's an imperfect solution:
df1$group <- "a"
ggplot() +
geom_step(aes(x = time, y = cumhaz, color = group),
df1,
group = 1) +
geom_line(aes(x = time, y = value, group = variable, linetype = type),
df2,
colour = "#F3C911") +
theme_minimal() +
labs(x = "Time",
y = "Cumulative Hazard",
title = "Weibull Distribution") +
scale_x_continuous(limits = c(0, 10),
breaks = seq(0, 10, by = 2)) +
theme(legend.position = "bottom",
legend.direction = "horizontal",
plot.title = element_text(hjust = 0.5)) +
scale_linetype_manual(values = c(2, 1),
name = "Line",
# levels of `type` sort alphabetically: "Confidence Interval", then "Estimate"
labels = c("95% Confidence Interval",
"Weibull Cumulative Hazard")) +
scale_color_manual(values = "#4C5D8A",
name = "",
labels = "Actual Cumulative Hazard")
First, the color for geom_step needs to be mapped within the aesthetics, so I created a constant column for that. Then I specify a manual color scale that supplies the color value you want for that line. Each manual scale is given one label per legend key it draws.
Where this answer is imperfect is that the legend names cannot be set identically (as in this answer), so there is extra space between the two parts of the legend. There may be a workaround, but every alternative I tried in which linetype and color are both defined for both lines ends up producing all possible combinations of the color and linetype values, which makes no sense here.
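One possible workaround, offered as a sketch rather than as part of either answer: put every line in a single data frame and map both color and linetype to the same variable, so ggplot2 can merge the two scales into one legend with one key per series. The toy values and the names plot_df, line_id, series, series_levels, series_colours and series_linetypes are assumptions made for illustration; they are not the data from the question.

```r
library(ggplot2)

time <- 1:5

# Toy values for illustration only (NOT the question's data).
# `line_id` keeps each drawn line separate; `series` is the legend entry.
plot_df <- rbind(
  data.frame(time = time, value = 0.0018 * time, line_id = "obs",
             series = "Actual Cumulative Hazard"),
  data.frame(time = time, value = 0.0015 * time, line_id = "est",
             series = "Weibull Cumulative Hazard"),
  data.frame(time = time, value = 0.0010 * time, line_id = "lcl",
             series = "95% Confidence Limits"),
  data.frame(time = time, value = 0.0020 * time, line_id = "ucl",
             series = "95% Confidence Limits")
)

# Fix the legend order through the factor levels.
series_levels <- c("Actual Cumulative Hazard",
                   "Weibull Cumulative Hazard",
                   "95% Confidence Limits")
plot_df$series <- factor(plot_df$series, levels = series_levels)

# One colour and one linetype per series, named by level.
series_colours <- setNames(c("#4C5D8A", "#F3C911", "#F3C911"), series_levels)
series_linetypes <- setNames(c("solid", "solid", "dashed"), series_levels)

p <- ggplot(plot_df, aes(x = time, y = value, group = line_id,
                         colour = series, linetype = series)) +
  # Observed curve as a step function, fitted curves as plain lines.
  geom_step(data = function(d) d[d$line_id == "obs", ]) +
  geom_line(data = function(d) d[d$line_id != "obs", ]) +
  # Both scales keep the same default title and labels, so they can merge.
  scale_colour_manual(values = series_colours) +
  scale_linetype_manual(values = series_linetypes) +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_blank())

built <- ggplot_build(p)
```

Because both aesthetics come from the same column, only the three combinations that actually occur get a legend key. With the question's data, df1 and df2 would first need to be stacked into this long format.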
Upvotes: 0
Reputation: 448
My solution is to map the cumulative hazard color as an aesthetic in geom_step. Then show the legend for the linetype, and adjust the titles and the legend order.
library(forcats) # provides fct_rev()
ggplot() +
geom_step(ggplot2::aes(x = time, y = cumhaz, colour = "Actual Cumulative Hazard"), df1, group = 1) +
geom_line(ggplot2::aes(x = time, y = value, group = variable, linetype = fct_rev(type)), df2,
colour = "#F3C911") +
theme_minimal() +
scale_colour_manual(values = "#4C5D8A",
name = '',
guide = guide_legend(order = 1)) +
scale_linetype_manual(values = c(1, 2),
labels = c('Weibull Cumulative Hazard','95% Confidence Limits'),
guide = guide_legend(title = NULL, order = 0)) +
labs(
x = "Time", y = "Cumulative Hazard",
title = "Weibull Distribution"
) +
scale_x_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 2)) +
theme(
legend.position = "bottom",
legend.direction = "horizontal",
plot.title = element_text(hjust = 0.5)
)
Upvotes: 1