Reputation: 1178
I have been using lapply and sapply as my go-to functions recently. So far so good, but why the following code does not work baffles me.
library(ggplot2)    # provides qplot
library(gridExtra)  # provides grid.arrange
df <- as.data.frame(matrix(rnorm(50), ncol = 5))
names(df) <- c("x1", "x2", "x3", "x4", "x5")
df1 <- seq_len(10)
ll <- lapply(seq(1, 5), function(i) qplot(df1, df[, i]))
I get the error:
Error in `[.data.frame`(df, , i) : undefined columns selected
Ok, apparently I made quite an unfortunate mistake in my reproducible code. It works now, but all the plots in the ll list are the same plot. When I run this:
do.call(grid.arrange,ll)
I get the following image:
All the plots are the same! This is also the output I get when I run this through my data.
Upvotes: 5
Views: 14739
Reputation: 60964
The problem you get is related to lazy evaluation. This means that the plots in ll are only really evaluated when they are drawn, which happens in grid.arrange. At that time, each plot will try to locate i, which will have a value of 5 by then because that is the last value of i at the end of the lapply loop. Therefore, the data extracted from df is always the fifth column, so your plots are all identical.
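The effect can be reproduced without any plotting. The following is a minimal sketch in base R. make_lazy is a hypothetical helper, not something from the question, and a plain for loop is used so that the result does not depend on how lapply treats its arguments.

```r
# Hypothetical helper: returns a function that reports `n`.
# `n` is a promise and is not evaluated until the inner function first uses it.
make_lazy <- function(n) {
  function() n
}

fs <- list()
for (i in 1:3) fs[[i]] <- make_lazy(i)

# The loop has finished, so `i` is 3 when each promise is finally evaluated.
lazy_values <- sapply(fs, function(f) f())
print(lazy_values)  # 3 3 3
```

Every function returns the last value of the loop variable, which mirrors the way every plot ends up showing the fifth column.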
To prevent this, you need to force the data extraction to take place when the function is created, for example using @BrodieG's method. There, a new data.frame is created, forcing the data from df to be picked up. Alternatively, you can use force to force the evaluation of i.
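As a rough sketch of the force variant, assuming the same df and df1 as in the question (the data is regenerated here only so the snippet runs on its own), the call goes at the top of the anonymous function:

```r
library(ggplot2)

# Same shape of data as in the question: 5 columns of 10 random values.
df <- as.data.frame(matrix(rnorm(50), ncol = 5))
df1 <- seq_len(10)

ll <- lapply(seq_len(5), function(i) {
  force(i)  # evaluate `i` now, while it still holds this iteration's value
  qplot(df1, df[, i])
})
```

Whether the force call is strictly required may depend on the R and ggplot2 versions in use, but it is harmless where it is not needed.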
For creating plots of multiple columns in the same data.frame I would use facet_wrap. To use facet_wrap, you need to reshape your data into long format using melt from the reshape2 package:
library(ggplot2)
library(reshape2)
df$xvalues = 1:10
df_melt = melt(df, id.vars = 'xvalues')
ggplot(df_melt, aes(x = xvalues, y = value)) +
  geom_point() + facet_wrap(~ variable)
Upvotes: 5
Reputation: 52677
There are problems with lazy evaluation, or something like it anyway. You need to do the following:
ll <- lapply(
  seq(1, 5),
  function(i) qplot(data = data.frame(y = df[, i]), df1, y)
)
This will force the y values to be updated for each plot.
More discussion in this other SO Post.
Upvotes: 7
Reputation: 2234
You are telling it to execute for 10 columns where you only have 5. This works:
ll<-lapply(seq(1,5), function(i) qplot(df1,df[,i]))
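One way to avoid this kind of mismatch is to derive the index from the data frame instead of hard-coding it. A small sketch, assuming the same df as in the question; the names idx and cols are only illustrative:

```r
df <- as.data.frame(matrix(rnorm(50), ncol = 5))

# seq_along() on a data frame yields one index per column.
idx <- seq_along(df)

# Any per-column work can then iterate over idx without going out of range.
cols <- lapply(idx, function(i) df[, i])
```

Replacing seq(1,5) with seq_along(df) in the lapply call keeps the loop in step with the number of columns.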
Upvotes: 3