Reputation: 18595
Let's say that I have a long data set and I would like to colour a specific label on the x-axis. In the case of the example below I would like to colour the label for Valiant.
# Packs
require(ggplot2)
require(reshape2)
# Data and trans
data(mtcars)
mtcars$model <- rownames(mtcars)
mtcars <- melt(mtcars, id.vars = "model")
# Some chart
ggplot(data = subset(x = mtcars, subset = mtcars$variable == "cyl"),
       aes(x = model, y = value)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90,
                                   colour =
                                     ifelse(mtcars$model == "Valiant",
                                            "red", "black")))
The code produces the chart below, which is erroneous because the wrong label is coloured.
The reason is fairly simple: the vector created by ifelse does not match the order of the labels on the axis. I can fix the code by forcing ggplot to colour a specific row. The code below colours the right label because, in the particular data.frame used for the chart, the row with the Valiant value is 31.
# Fixed chart
ggplot(data = subset(x = mtcars, subset = mtcars$variable == "cyl"),
       aes(x = model, y = value)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90,
                                   colour =
                                     ifelse(as.numeric(rownames(mtcars)) == 31,
                                            "red", "black")))
Clearly this solution is extremely impractical. In the actual data I have a vast number of observations with multiple columns (geo, gender, indicator, value, etc.). That data is subsequently filtered via subset, and different options are passed to the aes settings. Trying to figure out which row should be coloured is a nightmare. I'm looking for a solution that would enable me to:
match id with some string as a way of indicating the text I want to highlight;
keep everything within the ggplot2 code. I don't want to create separate data subsets only to derive a colouring vector, as I will be doing this a number of times. This would unnecessarily multiply objects.
Upvotes: 4
Views: 2228
Reputation: 56935
The reason the first one mismatches is that mtcars$model is much longer than the subset you are plotting: the colour vector ifelse(mtcars$model == "Valiant","red","black") has length 352, but the subset you are plotting has only 32 rows. The same problem exists with your second example, though in this case the extra elements of colour (which are all "black" anyway) are dropped, so you don't notice.
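To make the length mismatch concrete, here is a minimal base-R sketch. It assumes that the melted model column is simply the 32 model names repeated once per original variable (so reshape2 is not needed here), and the names model_long and cols_all are made up for illustration.

```r
data(mtcars)
models <- rownames(mtcars)

# Assumption: melt(mtcars, id.vars = "model") stacks the 11 variables,
# so the melted 'model' column is the 32 names repeated 11 times.
model_long <- rep(models, times = ncol(mtcars))

# Colour vector built from the full long data: one entry per melted row
cols_all <- ifelse(model_long == "Valiant", "red", "black")
length(cols_all)   # 352 entries, but only 32 labels are drawn

# Position of the first red entry, which is the one the axis ends up using
first_red <- which(cols_all[seq_along(models)] == "red")

# Position of Valiant among the alphabetically sorted axis labels
axis_pos <- which(sort(unique(model_long)) == "Valiant")

c(first_red = first_red, axis_pos = axis_pos)
```

The two positions differ, which is why a label other than Valiant turns red in the first chart, and why hard-coding position 31 happens to work in the second.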
Unfortunately it looks like theme(...) doesn't get evaluated with the data column names available to it (i.e. you can't just do colour=ifelse(model == "Valiant", "red", "black") directly in the theme(...) call).
One alternative is to make model a factor and filter on levels(..) == "Valiant". If you have a long dataframe, your id variable is most likely a factor anyway (or it would make sense for it to be one).
mtcars$model = factor(mtcars$model)
ggplot(data=subset(mtcars, variable == 'cyl'), aes(x=model, y=value)) +
  geom_bar(stat="identity") +
  theme(axis.text.x=element_text(angle=90,
                                 colour=ifelse(levels(mtcars$model) == 'Valiant', 'red', 'black')))
(Your problem stems from feeding subset() into ggplot as your data, and then not being able to refer back to that particular subset in the theme call. I don't know if there is a tricksy way to do this.)
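One possible way to avoid working out positions by hand is a small helper that derives the colours from the labels themselves. The sketch below is only an illustration: axis_label_colours is a hypothetical function, not part of ggplot2, and it assumes that a discrete axis shows the used factor levels (or the sorted unique values of a character column) in order.

```r
# Hypothetical helper (not part of ggplot2): returns one colour per axis
# label, in the order a discrete axis is assumed to display them.
axis_label_colours <- function(x, target, hit = "red", other = "black") {
  labels <- if (is.factor(x)) levels(droplevels(x)) else sort(unique(x))
  ifelse(labels %in% target, hit, other)
}

data(mtcars)
cols <- axis_label_colours(rownames(mtcars), "Valiant")
length(cols)          # one colour per model
which(cols == "red")  # position of Valiant on the axis

# Usage sketch inside the plotting code from the answer (left as a comment):
# theme(axis.text.x = element_text(angle = 90,
#                                  colour = axis_label_colours(mtcars$model, "Valiant")))
```

Because the helper only looks at the distinct labels, it gives the same result whether it is passed the full long column or the plotted subset, as long as every label appears in the plot. Recent ggplot2 versions warn that vectorised input to element_text() is not officially supported, so this may change in future releases.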
Upvotes: 2