zengc

Reputation: 97

How to create such a figure using ggplot2 in R?

I have a matrix with many zero elements. The column names are shown as labels on the horizontal axis. I'd like to show the nonzero elements explicitly as deviations (the bias) from the vertical line for each column.

How should I construct a figure like the example using ggplot2?

Example data can be generated as follows:

set.seed(2018)
N <- 5
p <- 40
dat <- matrix(0.0, nrow=p, ncol=N)
dat[2:7,   1] <- 4*rnorm(6)
dat[4:12,  2] <- 2.6*rnorm(9)
dat[25:33, 3] <- 2.1*rnorm(9)
dat[19:26, 4] <- 3.3*rnorm(8)
dat[33:38, 5] <- 2.9*rnorm(6)
colnames(dat) <- letters[1:5]

print(dat)

Upvotes: 1

Views: 120

Answers (2)

Maurits Evers

Reputation: 50728

Here is another option using facet_wrap and geom_col with theme_minimal.

library(tidyverse)
dat %>%
    as.data.frame() %>%
    rowid_to_column("row") %>%
    gather(key, value, -row) %>%
    ggplot(aes(x = row, y = value, fill = key)) +
    geom_col() +
    facet_wrap(~ key, ncol = ncol(dat)) +
    coord_flip() +
    theme_minimal()
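
As a side note, here is a minimal base-R sketch of what the reshaping step in the pipeline above produces, in case the long format is unfamiliar: one row per matrix cell, with row, key and value columns. The name long is only illustrative, and the example matrix from the question is rebuilt so the block runs on its own. (In current tidyr, gather() is superseded by pivot_longer(), which produces the same shape.)

```r
# Example matrix from the question, rebuilt so this block runs on its own
set.seed(2018)
N <- 5
p <- 40
dat <- matrix(0.0, nrow = p, ncol = N)
dat[2:7,   1] <- 4 * rnorm(6)
dat[4:12,  2] <- 2.6 * rnorm(9)
dat[25:33, 3] <- 2.1 * rnorm(9)
dat[19:26, 4] <- 3.3 * rnorm(8)
dat[33:38, 5] <- 2.9 * rnorm(6)
colnames(dat) <- letters[1:5]

# as.vector() unrolls the matrix column by column, so the row index
# cycles fastest and each column name is repeated nrow(dat) times
long <- data.frame(
    row   = rep(seq_len(nrow(dat)), times = ncol(dat)),
    key   = rep(colnames(dat), each = nrow(dat)),
    value = as.vector(dat)
)

# Show a few of the nonzero cells
head(long[long$value != 0, ])
```

Each facet in the plot then simply draws the rows of long that share one key.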


To further increase the aesthetic similarity to the plot in your original post we can

  1. move the facet strips to the bottom,
  2. rotate strip labels,
  3. add "zero lines" in matching colours,
  4. remove the fill legend, and
  5. get rid of the x & y axis ticks/labels/title.

library(tidyverse)
dat %>%
    as.data.frame() %>%
    rowid_to_column("row") %>%
    gather(key, value, -row) %>%
    ggplot(aes(x = row, y = value, fill = key)) +
    geom_col() +
    geom_hline(data = dat %>%
        as.data.frame() %>%
        gather(key, value) %>%
        count(key) %>%
        mutate(y = 0),
        aes(yintercept = y, colour = key), show.legend = FALSE) +
    facet_wrap(~ key, ncol = ncol(dat), strip.position = "bottom") +
    coord_flip() +
    guides(fill = "none") +
    theme_minimal() +
    theme(
        strip.text.x = element_text(angle = 45),
        axis.title = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank())
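
The data passed to geom_hline() above boils down to one row per column name with y fixed at 0. As a hedged alternative, that helper table could be built directly; zero_lines is a name made up for this sketch, and the matrix below is only a stand-in for the question's dat so the block runs on its own.

```r
# Stand-in for the question's matrix; only the column names matter here
dat <- matrix(0, nrow = 40, ncol = 5, dimnames = list(NULL, letters[1:5]))

# One zero line per facet: a row per column name with y fixed at 0
zero_lines <- data.frame(key = colnames(dat), y = 0)

# It could then be handed to the layer as
#   geom_hline(data = zero_lines,
#              aes(yintercept = y, colour = key), show.legend = FALSE)
zero_lines
```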

Upvotes: 4

Adela

Reputation: 1797

It would be much easier if you provided some sample data. I had to create my own, so there is no guarantee that this will work for your purpose.

set.seed(123)

# creating some random sample data
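# group is made an explicit factor: since R 4.0 data.frame() no longer
# converts strings, and as.numeric(df$group) below needs the factor codes
df <- data.frame(id = rep(1:100, each = 3),
                 x = rnorm(300),
                 group = factor(rep(letters[1:3], each = 100)),
                 bias = sample(0:1, 300, replace = TRUE, prob = c(0.7, 0.3)))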

# introducing bias
df$bias <- df$bias*rnorm(nrow(df))              

# calculate lower/upper bias for errorbar
df$biaslow <- pmin(0, df$bias)
df$biasupp <- pmax(0, df$bias)

I then used a bit of a hack to draw the groups far enough apart that they do not overlap: based on the group, I shifted the bias variable along with its lower and upper bounds.

# shift each group by a multiple of 5 so the groups are drawn far enough apart
df$bias <- as.numeric(df$group)*5 + df$bias
df$biaslow <- as.numeric(df$group)*5 + df$biaslow
df$biasupp <- as.numeric(df$group)*5 + df$biasupp

And now it is possible to plot it:

library(ggplot2)

ggplot(df, aes(x = x, col = group)) + 
  geom_errorbar(aes(ymin = biaslow, ymax = biasupp), width = 0) + 
  coord_flip() + 
  geom_hline(aes(yintercept = 5, col = "a")) +
  geom_hline(aes(yintercept = 10, col = "b")) +
  geom_hline(aes(yintercept = 15, col = "c")) +
  theme(legend.position = "none") + 
  scale_y_continuous(breaks = c(5, 10, 15), labels = letters[1:3])

EDIT:

To get a cleaner design, you can add

theme_bw() + 
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 1),
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank()) 

to your plot.

EDIT2:

To draw several horizontal lines, you can create a separate data set:

df2 <- data.frame(int = unique(as.numeric(df$group)*5), 
                  gr = levels(df$group))

And use

geom_hline(data = df2, aes(yintercept = int, col = gr))

instead of copy/pasting geom_hline for each group level.
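
For completeness, here is a rough sketch of how the same offset idea might be applied to the matrix from the question rather than to my random data. The names gap, segs, baselines and fig are made up for this sketch, the spacing rule is an assumption, and geom_linerange() is swapped in for the zero-width geom_errorbar(). The plotting step runs only if ggplot2 is installed.

```r
# Example matrix from the question, rebuilt so this block runs on its own
set.seed(2018)
N <- 5
p <- 40
dat <- matrix(0.0, nrow = p, ncol = N)
dat[2:7,   1] <- 4 * rnorm(6)
dat[4:12,  2] <- 2.6 * rnorm(9)
dat[25:33, 3] <- 2.1 * rnorm(9)
dat[19:26, 4] <- 3.3 * rnorm(8)
dat[33:38, 5] <- 2.9 * rnorm(6)
colnames(dat) <- letters[1:5]

# Assumed spacing between baselines: at least twice the largest absolute
# entry, so deviations of neighbouring columns cannot cross each other
gap <- 2 * ceiling(max(abs(dat)))

# Long format: one row per matrix cell, unrolled column by column
segs <- data.frame(
  row   = rep(seq_len(nrow(dat)), times = ncol(dat)),
  group = factor(rep(colnames(dat), each = nrow(dat)), levels = colnames(dat)),
  value = as.vector(dat)
)
segs$baseline <- as.numeric(segs$group) * gap  # position of the group's line
segs$tip      <- segs$baseline + segs$value    # shifted end of each deviation

# One baseline per column, as in the df2 idea above
baselines <- data.frame(
  int = seq_len(ncol(dat)) * gap,
  gr  = factor(colnames(dat), levels = colnames(dat))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  fig <- ggplot(segs, aes(x = row, colour = group)) +
    geom_linerange(aes(ymin = pmin(baseline, tip), ymax = pmax(baseline, tip))) +
    geom_hline(data = baselines, aes(yintercept = int, colour = gr)) +
    coord_flip() +
    scale_y_continuous(breaks = baselines$int,
                       labels = as.character(baselines$gr)) +
    theme(legend.position = "none")
  print(fig)
}
```

Zero entries collapse to a zero-length range on the baseline, so only the nonzero elements stick out from each line.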

Upvotes: 3
