Reputation: 361
I have data for different companies, each of which may have a different number of relevant "measures". If a measure falls below the benchmark, its bar should be colored pink; if it is above the benchmark, it should be colored blue. The problem is that the number of measures varies by company, and any measure could be lower or higher than the benchmark; there is no pattern.
I am using this condition in fill, and it only works some of the time:
ggplot(df, aes(measure)) + geom_col(aes(y=company, fill=overall > company)) + geom_point(aes(y=overall, color="overall"), size=8, shape=124) +
scale_color_manual("",values=c("company" = "yellow", "overall"="blue"),labels=c("company" = "Your Company", "overall"= "Overall Benchmark")) +
coord_flip()+ guides(size=FALSE) + theme(legend.box="horizontal",legend.key=element_blank(), legend.title=element_blank(),legend.position="top") +
scale_fill_manual(values=c("lightblue2", "lightpink2"),labels=c("Better","Worse"))
But for example if the data frame looks like this, it's completely off:
df = data.frame(
measure = c("Measure A","Measure B","Measure C","Measure D"),
overall = c(9, 5, 11, 19),
company = c(4,3,7, 16)
)
If the data frame looks like this, it's fine:
df2 = data.frame(
measure = c("Measure A","Measure B", "Measure C"),
overall = c(9, 5, 11),
company = c(11,7, 9)
)
I think this method doesn't accurately color the bars but I'm not sure why exactly.
Upvotes: 0
Views: 2653
Reputation: 29125
Try the following instead:
library(ggplot2)
library(dplyr)

# Label each bar explicitly instead of mapping the raw logical to fill
df %>%
  mutate(fill = ifelse(overall > company, "Worse", "Better")) %>%
  ggplot(aes(measure)) +
  geom_col(aes(y = company, fill = fill)) +
  geom_point(aes(y = overall, color = "overall"), size = 8, shape = 124) +
  coord_flip() +
  guides(size = "none") +
  theme(legend.box = "horizontal", legend.key = element_blank(),
        legend.title = element_blank(), legend.position = "top") +
  # Named values tie each label to its colour, whichever labels actually occur
  scale_fill_manual(values = c("Better" = "lightblue2", "Worse" = "lightpink2"))
Explanation: If you don't specify which fill colour goes with which value, the colours are assigned by position, so you'll run into this problem whenever the set of fill values present in the data changes from one data frame to the next.
In your second case, overall > company evaluates to c(FALSE, FALSE, TRUE) for the 3 measures. The first unique value (FALSE) gets mapped to light blue / "Better", while the second (TRUE) gets mapped to light pink / "Worse".
In your first case, overall > company evaluates to c(TRUE, TRUE, TRUE, TRUE) for the 4 measures, so it is TRUE that gets mapped to light blue / "Better", because light blue / "Better" comes first in the values vector. Nothing maps to light pink / "Worse" because there's only one fill value.
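To check this yourself, here is a small base-R sketch that evaluates the condition on both data frames from the question and then mimics, in plain R, how an unnamed colour vector gets handed out by position. The `assign_by_position` helper is hypothetical and only illustrates the idea; it is not ggplot2's actual internals.

```r
# Data frames copied from the question.
df <- data.frame(
  measure = c("Measure A", "Measure B", "Measure C", "Measure D"),
  overall = c(9, 5, 11, 19),
  company = c(4, 3, 7, 16)
)
df2 <- data.frame(
  measure = c("Measure A", "Measure B", "Measure C"),
  overall = c(9, 5, 11),
  company = c(11, 7, 9)
)

# The logical vector that the original fill aesthetic maps.
worse_df  <- df$overall > df$company
worse_df2 <- df2$overall > df2$company
print(worse_df)   # [1] TRUE TRUE TRUE TRUE
print(worse_df2)  # [1] FALSE FALSE  TRUE

# Hypothetical helper: give the i-th distinct value (sorted, so FALSE comes
# before TRUE) the i-th colour, which is what an unnamed vector amounts to.
assign_by_position <- function(x, colours) {
  present <- sort(unique(x))
  setNames(colours[seq_along(present)], as.character(present))
}

colours <- c("lightblue2", "lightpink2")
assigned_df  <- assign_by_position(worse_df, colours)
assigned_df2 <- assign_by_position(worse_df2, colours)
print(assigned_df)   # TRUE is the only value, so it takes the first colour
print(assigned_df2)  # FALSE takes the first colour, TRUE the second
```

In the first data frame TRUE ends up with "lightblue2", which is why every bar that should be pink comes out blue.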
This version creates a fill variable explicitly in the source data, with the labels "Better" / "Worse", and uses a named vector in scale_fill_manual to associate each label with the appropriate colour. It will work with both cases in your example.
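If you also want the set of labels to stay fixed no matter which ones occur, one possible variation is to store the label as a factor with both levels declared up front. The sketch below is base R only; `label_vs_benchmark` and `fill_colours` are names made up for this illustration, not part of ggplot2 or dplyr.

```r
# Hypothetical helper: label each row and keep both levels, even when only
# one of them occurs in the data.
label_vs_benchmark <- function(company, overall) {
  factor(ifelse(overall > company, "Worse", "Better"),
         levels = c("Better", "Worse"))
}

# Named palette: colours are looked up by label, not by position.
fill_colours <- c(Better = "lightblue2", Worse = "lightpink2")

# Values from the first data frame in the question (every measure is below
# the benchmark).
status <- label_vs_benchmark(company = c(4, 3, 7, 16),
                             overall = c(9, 5, 11, 19))

print(levels(status))                        # both levels are still declared
bar_colours <- unname(fill_colours[as.character(status)])
print(bar_colours)                           # every bar resolves to lightpink2
```

In the plot you would map fill to this factor and pass values = fill_colours to scale_fill_manual. Adding drop = FALSE to the scale is the argument meant for keeping unused levels, although whether the unused key actually shows up in the legend may depend on your ggplot2 version.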
Upvotes: 2