Reputation: 497
I'm trying to create a boxplot in ggplot2 from a data frame that contains count data from multiple samples. In my case, a count is recorded for each gene in each of 6 samples.
So it would look like this:
df <- data.frame(sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE),
                 sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE),
                 sample(1:100, 20, replace = TRUE), sample(1:100, 20, replace = TRUE))
names(df) <- paste0("Sample-", 1:6)
rownames(df) <- paste0("Gene-", 1:20)
Here's what I've tried:
bp <- ggplot(df, aes(x = names(df), y = )) + geom_boxplot()
but I have no idea what to put for the y value, and I'm fairly sure I'm not specifying the x-axis properly either. I'd appreciate some help with this very basic problem. I'm sorry for such a simple question.
Upvotes: 0
Views: 3208
Reputation: 33772
ggplot works best when data are in tidy, "long" format as opposed to "wide" format. You can get samples into one column and their values into another using tidyr::gather:
library(tidyverse)
set.seed(1001) # for reproducible example data
# generate df here as in your question
df %>%
  gather(Sample, Count) %>%
  ggplot(aes(Sample, Count)) +
  geom_boxplot()
Result: [figure: one boxplot of Count per Sample]
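Note that tidyr::gather still works but is marked as superseded in tidyr 1.0.0 and later, where tidyr::pivot_longer is the suggested replacement. Here is a minimal sketch of the same reshape with pivot_longer, assuming tidyr >= 1.0.0 is installed. The names df_long and bp are only illustrative, and the data are rebuilt with replicate() so the example runs on its own:

```r
library(tidyr)
library(ggplot2)

set.seed(1001)  # for reproducible example data

# Rebuild example data shaped like the question's: 20 genes x 6 samples
df <- as.data.frame(replicate(6, sample(1:100, 20, replace = TRUE)))
names(df) <- paste0("Sample-", 1:6)
rownames(df) <- paste0("Gene-", 1:20)

# Reshape wide -> long: one row per (sample, gene) count.
# Row names (gene IDs) are not carried over by pivot_longer.
df_long <- pivot_longer(df, cols = everything(),
                        names_to = "Sample", values_to = "Count")

# One box per sample, counts on the y-axis
bp <- ggplot(df_long, aes(x = Sample, y = Count)) +
  geom_boxplot()
bp
```

The long data frame has one column naming the sample and one holding the count, which is what aes(x = ..., y = ...) needs.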
Upvotes: 2
Reputation: 1708
Like this?
library(tidyverse)
df2 <- df %>%
  gather(sample, value)
ggplot(df2, aes(sample, value)) +
  geom_boxplot() + coord_flip()
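If you have ggplot2 3.3.0 or later, horizontal boxes should also be possible without coord_flip() by putting the grouping variable on y and the numeric variable on x. This is a rough sketch under that version assumption; p is just an illustrative name, and the example data are regenerated so the snippet is self-contained:

```r
library(tidyr)
library(ggplot2)

set.seed(1001)  # for reproducible example data

# Example data shaped like the question's: 20 genes x 6 samples
df <- as.data.frame(replicate(6, sample(1:100, 20, replace = TRUE)))
names(df) <- paste0("Sample-", 1:6)

# Long format: one row per (sample, value) pair
df2 <- gather(df, sample, value)

# Numeric variable on x, grouping variable on y: boxes are drawn horizontally
p <- ggplot(df2, aes(x = value, y = sample)) +
  geom_boxplot()
p
```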
Upvotes: 1