crazysantaclaus

Reputation: 623

R ggplot2: unify fill for grouped samples with binary data in geom_tile

I'm trying to display a presence/absence heatmap with geom_tile in R. A tile should be filled as "1" ("present") if a feature (here: an OTU) is found in at least one of the samples within a group. Below is example code in which I grouped the samples by site:

library(reshape2)
library(ggplot2)

df <- data.frame(
  OTU = c("OTU001", "OTU002", "OTU003", "OTU004", "OTU005"),
  Sample1 = c(0,0,1,1,0),
  Sample2 = c(1,0,0,1,0),
  Sample3 = c(1,1,0,1,0),
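  Sample4 = c(1,1,1,1,0))
# long format: one row per OTU x sample, stacked sample by sample
molten_df <- melt(df, id.vars = "OTU")

# add group data: rows 1-10 are Sample1 and Sample2, rows 11-20 are Sample3 and Sample4
sites <- data.frame(
  site = c(rep("site_A", 10), rep("site_B", 10)))
molten_df2 <- cbind(molten_df, sites)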

# plot heatmap based on group variable sites
ggplot(molten_df2, aes(x = site, y = OTU, fill = value)) +
  geom_tile()

The tile (site_A, OTU003) combines the values Sample1 = 1 and Sample2 = 0, and it is drawn as 0. The tile (site_B, OTU003) combines Sample3 = 0 and Sample4 = 1, but it is drawn as 1. Maybe the last value is used for the fill? I would like to display 1 whenever an OTU appears in any of the grouped samples, regardless of their order. Does anyone know how to do this within ggplot2?

The other way I thought of (but failed to code) is to write a function that sets all values of a given tile to 1 if at least one 1 appears.

Upvotes: 0

Views: 953

Answers (1)

ophdlv

Reputation: 254

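In your plot, geom_tile draws one tile per row, so the two samples of a site are drawn on top of each other and only the last one stays visible. With the dplyr package, you can create a new variable indicating whether an OTU at a given site is present in at least one sample: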

library(dplyr)
# PA is 1 if the OTU occurs in at least one sample of the site, else 0
tmp = group_by(molten_df2, OTU, site) %>%
  summarise(PA = as.factor(ifelse(sum(value) > 0, 1, 0)))

Then plot:

ggplot(tmp, aes(x = site, y = OTU, fill = PA)) +
  geom_tile()

Or do it directly inside the ggplot call:

ggplot(group_by(molten_df2, OTU, site) %>%
         summarise(PA = factor(ifelse(sum(value) > 0, 1, 0))),
       aes(x = site, y = OTU, fill = PA)) +
  geom_tile()
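
If you would rather not depend on dplyr, the idea from the question (set every value of a tile to 1 as soon as one 1 appears) can probably be done with base R's ave(). This is only a minimal sketch: the data frame long rebuilds the values of molten_df2 by hand, and the names long, tiles and status as well as the fill colours are my own assumptions, not something from the question.

```r
library(ggplot2)

# Long-format data with the same values as molten_df2 in the question
# (rebuilt by hand here so the example runs on its own; the names are hypothetical).
long <- data.frame(
  OTU    = rep(c("OTU001", "OTU002", "OTU003", "OTU004", "OTU005"), times = 4),
  sample = rep(c("Sample1", "Sample2", "Sample3", "Sample4"), each = 5),
  value  = c(0,0,1,1,0,  1,0,0,1,0,  1,1,0,1,0,  1,1,1,1,0),
  site   = rep(c("site_A", "site_B"), each = 10)
)

# Within each OTU x site group, replace every value by the group maximum:
# 1 if the OTU occurs in at least one sample of that site, otherwise 0.
long$PA <- ave(long$value, long$OTU, long$site, FUN = max)

# Keep one row per tile so that tiles are no longer drawn on top of each other.
tiles <- unique(long[, c("OTU", "site", "PA")])

# A labelled factor gives a discrete legend instead of a colour gradient.
tiles$status <- factor(tiles$PA, levels = c(0, 1), labels = c("absent", "present"))

p <- ggplot(tiles, aes(x = site, y = OTU, fill = status)) +
  geom_tile() +
  scale_fill_manual(values = c(absent = "grey90", present = "steelblue"))
print(p)
```

Using max works here because the values are strictly 0/1; with counts instead of binary data you would need something like as.integer(any(x > 0)) as the function passed to ave().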

Upvotes: 1