Dr.Fykos

Reputation: 90

ggplot2 scale colours for heatmap

I have a dataset with positive and negative values and I am trying to generate a heatmap in ggplot which will have different color gradients for all the values less than zero and all the values greater than zero.

I managed to work that out with the code below but the scale on the legend doesn't show the whole colour range and it doesn't represent the data well. I tried to normalize and scale the data between 0 and 1 but this produces a continuous colour scale with just one colour.

You can find the data here http://pastebin.com/gVHBcVc6

I would appreciate any other ideas.

mylimits <- c(round(min(dat$ratio, na.rm = TRUE)),
          round(min(dat$ratio, na.rm = TRUE)) / 2,
          -0.2,
          0,
          0.2,
          round(max(dat$ratio, na.rm = TRUE)) / 2,
          round(max(dat$ratio, na.rm = TRUE)))


ggplot(data = dat, aes(x = ACC, y = variable)) +
  geom_tile(aes(fill = as.numeric(sprintf("%1.2f", 100 * ratio))), colour = 'white') +
  geom_text(aes(label = text), size = 2) +
  scale_fill_gradientn(colours=c('red', 'yellow', 'cyan', 'blue'),
                       values = rescale(mylimits)) +
  theme(axis.text.x = element_text(angle = 60, hjust = 1, color="black"), legend.title = element_blank(), legend.position="top", legend.key.size = unit(2.5, "cm"))

Upvotes: 2

Views: 6569

Answers (1)

cuttlefish44

Reputation: 6796

[REVISION]

scale_fill_gradientn() needs colours and values to have the same length. The values can be computed easily with scales::rescale(). I treated the values below -100 as outliers and gave them a separate colour gradient. It is better to give nbin a large value, because the legend has to show the abrupt colour change at zero.
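
As a small illustration of what rescale() does here, the sketch below maps a set of break points onto the 0 to 1 range that the values argument expects. The break points and colours are made up for the example and are not taken from the question's data.

```r
library(scales)

# Hypothetical break points and colours (illustration only).
break_vals <- c(-120, -100, -50, 0, 50, 80)
cols <- c("navy", "blue", "cyan", "lightcyan", "yellow", "red")

# rescale() maps the vector linearly so that min -> 0 and max -> 1.
pos <- rescale(break_vals)

# One position per colour is what scale_fill_gradientn() needs.
stopifnot(length(cols) == length(pos))
print(round(pos, 3))
```

Each colour is then anchored at its own position along the scale, so unevenly spaced positions give unevenly stretched gradients.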

library(ggplot2); library(scales)

  ## add the fill values as a column (for convenience, not necessary)
dat <- cbind(dat, ratio2 = as.numeric(sprintf("%1.2f", 100 * dat$ratio)))

  ## get min and max values using range()
r_range <- range(dat$ratio2, na.rm = TRUE)

  ## define values and colours
   # values for heatmap
main_val <- c(r_range[1], seq(-100, -1.0E-6, length = 3),  # minus
              seq(1.0E-6, r_range[2], length = 3))         # plus

   # values for legend (I made two patterns; ignore outliers or not)
legend_val_100_max <- main_val[-1]
legend_val_120_max <- c(-120, seq(-100, -1.0E-6, length = 3),
                        seq(1.0E-6, r_range[2], length = 3))

   # colours    
mycol <- c("navy", "blue", "cyan", "lightcyan",              # minus
           "yellow", "red", "red4")                          # plus

make heatmap without a legend

g <- ggplot(data = dat, aes(x = ACC, y = variable)) +
  geom_tile(aes(fill = ratio2), colour = 'white') +
  theme(axis.text.x = element_text(angle = 60, hjust = 1, color="black"), 
        legend.title = element_blank(), legend.position="top", legend.key.size = unit(2.5, "cm")) + 
  scale_fill_gradientn(colours = mycol, values = rescale(main_val), guide = "none")

  # to check
ggplot(data = dat, aes(x = ACC, y = variable)) +
  geom_tile(aes(fill = cut(ratio2, breaks = c(-Inf, 0, Inf))), colour = 'white')
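
If the original data is not at hand, the same idea can be tried on synthetic data. The following is only a sketch: the toy data frame, its column names and the single outlier are invented for the example, and only the colour vector is copied from above.

```r
library(ggplot2)
library(scales)

# Synthetic data (hypothetical; NOT the question's dataset).
set.seed(1)
toy <- expand.grid(x = letters[1:5], y = LETTERS[1:4])
toy$value <- c(-494, runif(nrow(toy) - 1, min = -100, max = 80))

# Break points: the minimum, three points below zero, three above zero.
toy_range <- range(toy$value)
toy_val <- c(toy_range[1], seq(-100, -1e-6, length.out = 3),
             seq(1e-6, toy_range[2], length.out = 3))
toy_col <- c("navy", "blue", "cyan", "lightcyan", "yellow", "red", "red4")
stopifnot(length(toy_col) == length(toy_val))

# Heatmap with separate gradients below and above zero.
p <- ggplot(toy, aes(x = x, y = y, fill = value)) +
  geom_tile(colour = "white") +
  scale_fill_gradientn(colours = toy_col, values = rescale(toy_val))
# print(p) draws the plot
```

The two break points at -1e-6 and 1e-6 sit almost on top of each other, which is what makes the colour jump from lightcyan to yellow at zero.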

add legend

  # -100 to max version (ignore outliers)
g + scale_colour_gradientn(colours = mycol[-1], values = rescale(legend_val_100_max),
                           limits = c(-100, max(dat$ratio2, na.rm=T)), breaks= c(-80, -40, 0, 40, 80),
                           guide = guide_colorbar(nbin=100))

  # -120 to max version (but changed labels to look like min to max)
g + scale_colour_gradientn(colours = mycol, values = rescale(legend_val_120_max),
                         limits = c(-120, max(dat$ratio2, na.rm = TRUE)), 
                         breaks = c(-120, -111, -109,  -100, -50, 0, 50),
                         labels = c(-500, "/", "/", -100, -50, 0, 50),  # to be exact, not -500 but -494.42
                         guide = guide_colorbar(nbin = 100))


Another approach
Alternatively, you can put a lower limit on the main heatmap's fill scale and pass oob = squish (from the scales package). With this approach you don't need to add a separate legend, but every outlier takes the colour of the lower limit (i.e., if you set -200 as the lower limit, -1000 gets the same colour as -200).

  # I used -200 as the limit to distinguish -151.30 from -494.42 (see `sort(dat$ratio2, na.last = TRUE)[1:2]`)
limited_val <- c(-200, seq(-100, -1.0E-6, length = 3),
                 seq(1.0E-6, r_range[2], length = 3))

ggplot(data = dat, aes(x = ACC, y = variable)) +
  geom_tile(aes(fill = ratio2), colour = 'white') +
  theme(axis.text.x = element_text(angle = 60, hjust = 1, color="black"), 
        legend.title = element_blank(), legend.position="top", legend.key.size = unit(2.5, "cm")) + 
  scale_fill_gradientn(colours = mycol, values = rescale(limited_val), 
                       limits = c(-200, max(dat$ratio2, na.rm=T)), breaks = c(-100, -50, 0, 50),
                       oob=squish, guide = guide_colorbar(nbin = 100))
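
To see what oob = squish does to the out-of-range values, squish() can be called directly on a vector. This is just an illustrative sketch; the vector mixes the two outliers mentioned above with a few made-up values, and the upper limit of 40 is an assumption.

```r
library(scales)

# Two outliers from the data (-494.42, -151.30) plus hypothetical values.
x <- c(-1000, -494.42, -151.30, -50, 0, 40)

# Values outside the range are moved to the nearest limit;
# values inside the range are left unchanged.
squished <- squish(x, range = c(-200, 40))
print(squished)
```

Here -1000 and -494.42 both become -200, so they share the colour of the lower limit, while -151.30 stays distinguishable.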


Upvotes: 3
