Zuhaib Ahmed

Reputation: 497

ggplot2: Creating a boxplot from raw count data

I'm trying to create a boxplot in ggplot2 from a data frame that holds count data from multiple samples. In my case, a count is recorded for each gene in each of 6 samples.

So it would look like this:

df <- data.frame(sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE),
                 sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE),
                 sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE))
names(df) <- paste0("Sample-", 1:6)
rownames(df) <- paste0("Gene-", 1:20)

Here's what I've tried:

bp <- ggplot(df, aes(x = names(df), y = )) + geom_boxplot()

but I have no idea what to put for the y value, and I'm pretty sure I'm not even specifying the x-axis properly. I'd appreciate some help with this very basic problem. I'm sorry for such a simple question.

Upvotes: 0

Views: 3208

Answers (2)

neilfws

Reputation: 33772

ggplot works best when data are in tidy, "long" format as opposed to "wide" format. You can get samples into one column and their values into another using tidyr::gather:

library(tidyverse)

set.seed(1001) # for reproducible example data
# generate df here as in your question

df %>% 
  gather(Sample, Count) %>% 
  ggplot(aes(Sample, Count)) + 
  geom_boxplot()

Result: a boxplot with one box per sample, Sample on the x-axis and Count on the y-axis.
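A side note: in current tidyr releases, gather() is superseded and pivot_longer() is the recommended replacement. Below is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later. The names df_long and Gene are illustrative choices, not from the question. Keeping the gene ID as a column is optional, but it stops the row names from being lost during the reshape.

```r
library(tidyr)
library(ggplot2)

set.seed(1001)  # for reproducible example data

# Same shape as the data in the question: 20 genes (rows) x 6 samples (columns)
df <- as.data.frame(replicate(6, sample(1:100, 20, replace = TRUE)))
names(df) <- paste0("Sample-", 1:6)
rownames(df) <- paste0("Gene-", 1:20)

# Move the row names into a regular column (illustrative name: Gene),
# because pivot_longer() returns a tibble and drops row names
df$Gene <- rownames(df)

# Wide -> long: one row per gene/sample pair
df_long <- pivot_longer(
  df,
  cols = -Gene,          # reshape every column except the gene ID
  names_to = "Sample",   # former column names go here
  values_to = "Count"    # former cell values go here
)

# One box per sample
bp <- ggplot(df_long, aes(x = Sample, y = Count)) +
  geom_boxplot()
bp
```

With the data in long format, y is simply the column holding the counts, and x is the column holding the sample labels.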

Upvotes: 2

william3031

Reputation: 1708

Like this?

library(tidyverse)
df2 <- df %>% 
  gather(sample, value)

ggplot(df2, aes(sample, value)) +
  geom_boxplot() + coord_flip()


Upvotes: 1
