Jesse

Reputation: 244

Plotting data by color variable with ggplot2 in R

This is something of an extension of a previous question here:

Assign colors to a data frame based on shared values with a character string in R

I now have a data frame with x, y, errs, and newcolors, and I want to plot the data and error bars by color using ggplot2. I've tried this, and the plot works well, but the colors are not even close to being correct. I've tried defining the color variable in different places within the ggplot() call, but no luck. What am I missing?

Here is the data:

names          <- c( "TC3", "102", "172", "136", "142", "143", "AC2G" )
colors         <- c( "darkorange", "forestgreen", "darkolivegreen", "darkgreen", "darksalmon", "firebrick3", "firebrick1" )
dataA          <- c( "JR13-101A", "TC3B", "JR12-136C", "AC2GA", "TC3A" )
newcolors      <- rep( NA, length( dataA ) )
dataA          <- as.data.frame( cbind( dataA, newcolors ) )
x              <- c( 1, 2, 3, 4, 5 )
y              <- c( 10, 6, 3, 18, 2 )
errs           <- c( 2, 1, 2, 1, 2 )

dataA          <- cbind( dataA, x, y, errs )

and a solution to my previous question by @Dave2e that assigns colors by sample name:

dataA$newcolors <- as.character( dataA$newcolors )
for( j in 1:length( names ) ) {
  dataA$newcolors[ grep( names[ j ], dataA$dataA ) ] <- colors[ j ]
}

and finally the plotting code I've tried:

library( ggplot2 )

ggplot( dataA, aes( x = x, y = y ) ) +
  geom_errorbar( aes( ymin = y - errs, ymax = y + errs, color = newcolors ),
                 width = 0.03 ) +
  geom_point( size = 5, aes( color = newcolors ) )

I've also tried putting the colors into the aes() call up front, i.e., aes( x = x, y = y, color = newcolors ). The plot looks good, apart from the fact that the colors are not correct: "darkorange" shows up as a light green, "darkgreen" as some pinkish color, and "firebrick1" as a light blue.

Upvotes: 1

Views: 1631

Answers (1)

Uwe

Reputation: 42582

In response to this comment of the OP, you may try

ggplot(dataA, aes(x = x, y = y, fill = I(newcolors))) +
  geom_errorbar(aes(ymin = y - errs, ymax = y + errs),
                width = 0.03) +
  geom_point(size = 5, shape = 21)


In geom_point(), size and shape have been set explicitly, i.e., outside of a call to aes(). Shapes numbered 21 to 24 are filled (see http://ggplot2.tidyverse.org/reference/scale_shape.html for available shapes). Consequently, only the fill aesthetic, not the color aesthetic, has been mapped in the call to aes(), so error bars and symbol borders are drawn in black by default.

The advantage (or drawback, perhaps) is that the data point that was assigned NA in newcolors remains visible.
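
As a hedged aside on why this works: when color strings are mapped inside aes() without I(), ggplot2 treats them as category labels and assigns colors from its default discrete palette, which is why "darkorange" came out in an unrelated color. I() marks the values to be used as-is. scale_fill_identity() is another way to ask for the same thing. The sketch below assumes the data frame as it stands after the color-assignment loop from the question (typed out directly here so the block is self-contained); p_fill is just an illustrative name.

```r
library(ggplot2)

# Data from the question after the color-assignment loop has run.
# The first sample matches none of the names, so its color stays NA.
dataA <- data.frame(
  dataA     = c("JR13-101A", "TC3B", "JR12-136C", "AC2GA", "TC3A"),
  newcolors = c(NA, "darkorange", "darkgreen", "firebrick1", "darkorange"),
  x         = c(1, 2, 3, 4, 5),
  y         = c(10, 6, 3, 18, 2),
  errs      = c(2, 1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# scale_fill_identity() asks ggplot2 to use the strings in newcolors
# directly as fill colors instead of treating them as category labels.
p_fill <- ggplot(dataA, aes(x = x, y = y, fill = newcolors)) +
  geom_errorbar(aes(ymin = y - errs, ymax = y + errs), width = 0.03) +
  geom_point(size = 5, shape = 21) +
  scale_fill_identity()

print(p_fill)
```

Which of the two to use is mostly a matter of taste: I() is shorter, while the identity scale keeps the aes() call free of wrappers.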

For comparison, below is how the plot looked after picking up the suggestion from Richard Telford's comment:

ggplot(dataA, aes(x = x, y = y, color = I(newcolors))) +
  geom_errorbar(aes(ymin = y - errs, ymax = y + errs), 
                width = 0.03) + 
  geom_point(size = 5) 


Note that the leftmost data point was removed by ggplot2 as there was no color, i.e., NA, assigned to this data point.

Warning message:
Removed 1 rows containing missing values (geom_point).
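
If the dropped point is not wanted, one option is to give unmatched samples a placeholder color before plotting. This is only a sketch: "grey50" and the name fallback_color are assumptions chosen for illustration, not part of the original question, and the data frame is again typed out as it stands after the assignment loop.

```r
library(ggplot2)

# Data from the question after the color-assignment loop has run.
dataA <- data.frame(
  dataA     = c("JR13-101A", "TC3B", "JR12-136C", "AC2GA", "TC3A"),
  newcolors = c(NA, "darkorange", "darkgreen", "firebrick1", "darkorange"),
  x         = c(1, 2, 3, 4, 5),
  y         = c(10, 6, 3, 18, 2),
  errs      = c(2, 1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Assumption: "grey50" is an arbitrary placeholder for samples
# that matched none of the names.
fallback_color <- "grey50"
dataA$newcolors[is.na(dataA$newcolors)] <- fallback_color

# With no NA left in newcolors, every row has a color to draw with.
p_color <- ggplot(dataA, aes(x = x, y = y, color = I(newcolors))) +
  geom_errorbar(aes(ymin = y - errs, ymax = y + errs), width = 0.03) +
  geom_point(size = 5)

print(p_color)
```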

Upvotes: 2
