caddymob

Reputation: 317

Selecting data frame columns to plot in ggplot2

I have a big table of data with ~150 columns. I need to make a series of histograms from about a third of them. Rather than putting 50 lines of the same plot command in my script, I want to loop over a list that tells me which columns to use. Here is a test dataset to illustrate:

d <- data.frame(c(rep("A",5), rep("B",5)),
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE))

colnames(d) <- c("col1","col2","col3","col4","col5","col6" )


library(ggplot2)

ggplot(data=d, aes(col2, fill= col1)) + geom_density(alpha = 0.5)

So, rather than writing this 50 times and replacing the aes() values each time, I really want to do something more like this...

cols_to_plot <- c("col2","col4","col6")

for (i in length(cols_to_plot)) {
  ggplot(data=d, aes(cols_to_plot[i], fill= col1)) + geom_density(alpha = 0.5)

} 

But of course, this doesn't work... Is there a way to do this kind of thing?

Thanks!

Upvotes: 7

Views: 16830

Answers (4)

Gavin Simpson

Reputation: 174853

Since version 3.0.0 of ggplot2, aes_string() has been soft-deprecated, and the recommended approach is to use aes().

The trick now is to use the .data pronoun, which refers to the data frame supplied to ggplot() (here, d), and to index it with the column name as a string.

Using this, we recover the behaviour from before version 3.0.0 without the soft-deprecation warning:

d <- data.frame(c(rep("A",5), rep("B",5)),
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE))

colnames(d) <- c("col1","col2","col3","col4","col5","col6" )

cols_to_plot <- c("col2","col4","col6")

for (i in seq_along(cols_to_plot)) {
  print(ggplot(data = d,
               aes(x = .data[[cols_to_plot[i]]], fill= .data[["col1"]])) +
    geom_density(alpha = 0.5))
}
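
If the plots are needed later (for example, to save or arrange them) rather than drawn immediately, the same .data approach can be wrapped in a small function. This is only a sketch: plot_density() is a hypothetical helper name, not part of ggplot2, and the data frame below is a cut-down version of the example data.

```r
library(ggplot2)

set.seed(1)  # only so the example data is reproducible
d <- data.frame(col1 = rep(c("A", "B"), each = 5),
                col2 = sample(1:10, 10, replace = TRUE),
                col4 = sample(1:10, 10, replace = TRUE),
                col6 = sample(1:10, 10, replace = TRUE))

# Hypothetical helper: build one density plot for a column given by name
plot_density <- function(data, column, group = "col1") {
  ggplot(data, aes(x = .data[[column]], fill = .data[[group]])) +
    geom_density(alpha = 0.5) +
    labs(x = column, fill = group)  # readable labels instead of the .data expression
}

cols_to_plot <- c("col2", "col4", "col6")

# One plot object per column, stored in a named list
plots <- lapply(cols_to_plot, plot_density, data = d)
names(plots) <- cols_to_plot

# Draw one of them explicitly
print(plots[["col4"]])
```

Wrapping the mapping in a function also means each plot captures its own column value. Plots that are built in a loop and stored for later, rather than printed immediately, can otherwise all end up referring to the last value of the loop index, because aes() expressions are evaluated when the plot is drawn.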

Original answer using aes_string()

There is an alternative to aes(): aes_string(). With this you can pass in strings for the aesthetic mappings. Note that you have to quote col1 here, as in fill = "col1". Also note that in a for() loop you need to explicitly print() a ggplot object in order for the plot to be drawn on the current device.

d <- data.frame(c(rep("A",5), rep("B",5)),
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE),     
                sample(c(1:10), 10, replace=TRUE))

colnames(d) <- c("col1","col2","col3","col4","col5","col6" )

cols_to_plot <- c("col2","col4","col6")

for (i in seq_along(cols_to_plot)) {
  print(ggplot(data=d, aes_string(x = cols_to_plot[i], fill= "col1")) +
    geom_density(alpha = 0.5))
}

Upvotes: 8

mnel

Reputation: 115425

Yes, aes_string

cols_to_plot <- c("col2","col4","col6")

for (i in cols_to_plot) {
  # print() is needed for the plot to be drawn inside a for loop
  print(ggplot(data=d, aes_string(i, fill= 'col1')) + geom_density(alpha = 0.5))
}

Upvotes: 4

metasequoia

Reputation: 7274

lapply() could be used here in place of a for loop with aes_string().

cols_to_plot <- c("col2","col4","col6")
lapply(cols_to_plot,function(i){
  ggplot(data=d, aes_string(x=i, fill= 'col1')) + 
    geom_density(alpha = 0.5)
} )

Upvotes: 3

Harlan

Reputation: 19401

I think you'd be better off if you melted your data. Try this:

library(reshape2)
d2 <- melt(d, id.vars='col1')
ggplot(d2, aes(value, fill=col1)) + geom_density(alpha=.5) + facet_wrap(~variable)
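
Note that melting with only an id column reshapes every other column, so the facets would cover all of them. A possible variation, assuming the cols_to_plot vector from the question, is to pass the wanted columns through melt()'s measure.vars argument so that only those are faceted:

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # only so the example data is reproducible
d <- data.frame(col1 = rep(c("A", "B"), each = 5),
                col2 = sample(1:10, 10, replace = TRUE),
                col3 = sample(1:10, 10, replace = TRUE),
                col4 = sample(1:10, 10, replace = TRUE),
                col5 = sample(1:10, 10, replace = TRUE),
                col6 = sample(1:10, 10, replace = TRUE))

cols_to_plot <- c("col2", "col4", "col6")

# Keep col1 as the id and melt only the selected columns into long format
d2 <- melt(d, id.vars = "col1", measure.vars = cols_to_plot)

# One facet per selected column
p <- ggplot(d2, aes(value, fill = col1)) +
  geom_density(alpha = 0.5) +
  facet_wrap(~variable)
print(p)
```

This gives a single figure with one panel per selected column instead of a separate plot for each.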

Or, to do what you originally asked, use aes_string inside your loop, like this:

ggplot(data=d, aes_string(cols_to_plot[i], fill='col1')) + geom_density(alpha = 0.5)

Upvotes: 10
