Gahoo

Reputation: 225

How to plot heatmap with multiple categories in a single cell with ggplot2?

A heatmap of categorical variables can be plotted with this code:

# data: 20 individuals (rows) x 20 variables, each cell holding one of 7 categories
datf <- data.frame(
  indv = factor(paste("ID", 1:20), levels = rev(paste("ID", 1:20))),
  matrix(sample(LETTERS[1:7], 400, replace = TRUE), ncol = 20)
)

library(ggplot2)
library(reshape2)

# convert the data to long form for ggplot2
datf1 <- melt(datf, id.vars = "indv")

ggplot(datf1, aes(variable, indv)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_manual(values = rainbow(7))

The code comes from here: http://rgraphgallery.blogspot.com/2013/04/rg54-heatmap-plot-of-categorical.html

But what about multiple categories in a single cell, like this? Is it possible to use a triangle or another shape as a cell?

http://postimg.org/image/4dudrv0nz/


Copied from Biostar, as Alex Reynolds suggested.

Upvotes: 2

Views: 2587

Answers (1)

Shadow

Reputation: 1042

For those interested, this appears to be Figure 2 from "Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia".

I wanted to create a similar plot with ggplot and geom_tile for a larger collection of genes (a few hundred), but finally decided to use geom_point instead to provide additional information per cell (tile). It also looks to me a lot like this plot was generated in Excel or some other spreadsheet software (maybe along the lines of https://www.youtube.com/watch?v=0s5OiRMMzuY). The colors in the cells (tiles) do not match those in the legend, which suggests that they were added separately rather than automatically, and there appears to be one erroneous cell: the diagonal separating its colors runs from upper left to lower right, whereas the diagonal drawn in black runs from lower left to upper right.

Hence, my concluding two cents: doing this automatically is probably very time-consuming and, in my opinion, only makes sense if you want to do it repeatedly, e.g., on data that is subject to change or on multiple datasets, and/or if you have a larger collection of genes.

Otherwise, following the instructions in the YouTube video for a rather small number of cells is likely to be more efficient. Or use geom_point (similar to "Adding points to a geom_tile layer in ggplot2" or "Marking specific tiles in geom_tile() / geom_raster()") to represent information about an additional category (variable).
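
A minimal sketch of the geom_point route, in case it helps. The data frame `cells` and its column `second` (a hypothetical extra category, NA where a cell has none) are assumptions made up for illustration, not part of the question's data. The second category is mapped to point shape rather than colour so that it stays visible on top of any fill.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the random example is reproducible
ids  <- paste("ID", 1:20)
cols <- paste0("X", 1:20)

# Long-format data: one row per cell.
# 'value' is the primary category; 'second' is a hypothetical second
# category that is NA for cells without one.
cells <- expand.grid(variable = cols, indv = factor(ids, levels = rev(ids)))
cells$value  <- sample(LETTERS[1:7], nrow(cells), replace = TRUE)
cells$second <- sample(c(LETTERS[1:7], NA), nrow(cells), replace = TRUE)

p <- ggplot(cells, aes(variable, indv)) +
  # tiles carry the primary category as fill
  geom_tile(aes(fill = value), colour = "white") +
  # points are drawn only where a second category exists
  geom_point(data = subset(cells, !is.na(second)),
             aes(shape = second), size = 2) +
  scale_fill_manual(values = rainbow(7)) +
  scale_shape_manual(values = 0:6)  # seven distinct open/line shapes
print(p)
```

This keeps one row per cell, so it scales to a few hundred genes without any extra geometry.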

In any case, should anyone have other suggestions on how to automatically create such a figure, I am more than happy to hear about that.
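
One possible route to the automatic version, offered as a rough sketch rather than a tested reproduction of the figure: split every cell along a diagonal into two triangles and draw them with geom_polygon. The helper `make_triangles()`, the columns `cat1` and `cat2`, the small 4 x 5 grid, and the direction of the diagonal are all assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the random example is reproducible
n_col <- 4
n_row <- 5

# One row per cell, with two hypothetical categories per cell.
cells <- expand.grid(col = seq_len(n_col), row = seq_len(n_row))
cells$cat1 <- sample(LETTERS[1:7], nrow(cells), replace = TRUE)
cells$cat2 <- sample(LETTERS[1:7], nrow(cells), replace = TRUE)

# Hypothetical helper: returns the three corners of one triangle per cell.
# A cell centred on (col, row) spans +/- 0.5 in both directions;
# "lower" is the lower-left half, anything else the upper-right half.
make_triangles <- function(d, value, half) {
  dx <- if (half == "lower") c(-0.5, 0.5, -0.5) else c(0.5, 0.5, -0.5)
  dy <- if (half == "lower") c(-0.5, -0.5, 0.5) else c(-0.5, 0.5, 0.5)
  idx <- rep(seq_len(nrow(d)), each = 3)  # three vertices per cell
  data.frame(
    id    = paste(half, idx),             # one polygon id per triangle
    x     = d$col[idx] + rep(dx, times = nrow(d)),
    y     = d$row[idx] + rep(dy, times = nrow(d)),
    value = d[[value]][idx]
  )
}

tri <- rbind(make_triangles(cells, "cat1", "lower"),
             make_triangles(cells, "cat2", "upper"))

p <- ggplot(tri, aes(x, y, group = id, fill = value)) +
  geom_polygon(colour = "white") +
  scale_fill_manual(values = rainbow(7)) +
  coord_equal()
print(p)
```

Cells with a single category can simply get the same value in both halves. Because x and y are numeric here, axis labels for individuals and genes would have to be added separately, for example via the `breaks` and `labels` arguments of scale_x_continuous() and scale_y_continuous().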

Upvotes: 2
