Niek de Klein

Reputation: 8824

How to keep colours correct when facetting and using variable column names?

I am trying to make a facetted plot like this:

library(ggplot2)

example_data <- data.frame(x = rnorm(100, mean = 0, sd = 1),
                           y = rnorm(100, mean = 0, sd = 1),
                           facet = sample(c(0, 1), replace = TRUE, size = 100))

ggplot(example_data, aes(x=x, y=y, colour=sign(x)!=sign(y)))+
  geom_point()+
  geom_hline(yintercept=0)+
  geom_vline(xintercept=0)+
  facet_wrap(~facet)


However, I need to do this for multiple plots where the column names vary. Using aes_string works for x and y, and without faceting it also works for the colour:

ggplot(example_data, aes_string(x='x', y='y', colour=sign(example_data[['x']])!=sign(example_data[['y']])))+
  geom_point()+
  geom_hline(yintercept=0)+
  geom_vline(xintercept=0)+
  guides(col=F)


But then when I facet, the colours are not correct anymore:

ggplot(example_data, aes_string(x='x', y='y', colour=sign(example_data[['x']])!=sign(example_data[['y']])))+
  geom_point()+
  geom_hline(yintercept=0)+
  geom_vline(xintercept=0)+
  guides(col=F)+
  facet_wrap(~facet)


I'm guessing this is because the order of the points depends on which facet they are in. I can work around it by computing the colour per facet:

col_facet_0 <- sign(example_data[example_data$facet==0,][['x']])!=sign(example_data[example_data$facet==0,][['y']])
col_facet_1 <- sign(example_data[example_data$facet==1,][['x']])!=sign(example_data[example_data$facet==1,][['y']])
col <- c(col_facet_0, col_facet_1)

ggplot(example_data, aes_string(x='x', y='y', colour=col))+
  geom_point()+
  geom_hline(yintercept=0)+
  geom_vline(xintercept=0)+
  guides(col=F)+
  facet_wrap(~facet)


The problem is that I need to know beforehand which facet's colours go first in the colour vector and which go last. E.g. in the code above, if I had used col <- c(col_facet_1, col_facet_0) instead, the colours would have been wrong.

My question: is there a way to do this within the ggplot call so that I don't need to know which facet has to come first?

Upvotes: 1

Views: 43

Answers (1)

Axeman

Reputation: 35177

You can make the expression a string, like so:

# The colour expression is passed as a string, so it is evaluated inside the data
ggplot(example_data, aes_string(x='x', y='y', colour='sign(x) != sign(y)'))+
  geom_point()+
  geom_hline(yintercept=0)+
  geom_vline(xintercept=0)+
  guides(colour="none")

If you need flexible column names, you could build the expression string with sprintf, e.g.:

x_col <- 'x'
y_col <- 'y'

ggplot(
  example_data, 
  aes_string(x_col, y_col, colour = sprintf('sign(%s) != sign(%s)', x_col, y_col))
) + ...
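
As an alternative sketch that avoids aes_string (which newer ggplot2 releases mark as deprecated), the .data pronoun can look up columns by name inside a regular aes(). This assumes ggplot2 3.0.0 or later; the seed and the name p are only there to make the example reproducible and are not part of the original code:

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility
example_data <- data.frame(
  x = rnorm(100),
  y = rnorm(100),
  facet = sample(c(0, 1), replace = TRUE, size = 100)
)

# Column names held in variables, as in the question
x_col <- "x"
y_col <- "y"

p <- ggplot(
  example_data,
  aes(
    x = .data[[x_col]],
    y = .data[[y_col]],
    # Evaluated row by row inside the data, so each point keeps
    # its own colour no matter which facet it ends up in
    colour = sign(.data[[x_col]]) != sign(.data[[y_col]])
  )
) +
  geom_point() +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  guides(colour = "none") +
  facet_wrap(~facet)
```

Printing p draws the plot. Because the colour is computed from the same rows as x and y, there is no need to know which facet comes first.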

Upvotes: 1
