Reputation: 763
With this question I would like to extend and generalize the discussion started here. This is for the benefit of those who, like me, still struggle when they have to use lapply.
Suppose I have the data frames d1 and d2, which I store in the list my.ls:
d1<-data.frame(a=rnorm(5), b=c(rep(2006, times=4),NA), c=letters[1:5])
d2<-data.frame(a=1:5, b=c(2007, 2007, NA, NA, 2007), c=letters[6:10])
my.ls<-list(d1=d1, d2=d2)
How can I obtain another list containing the same data frames, but keeping only the first and third columns of each? I tried the following, but it didn't work:
my.ls.sub<-lapply(my.ls, my.ls[,c(1,3)])
What if I then not only want to subset the data frames, but also want to know the unique values in the columns I am extracting? (In other words, I would create two vectors for every data frame, which could be free-standing or stored in a list of lists.) For the latter point I am not able to suggest anything...
Upvotes: 2
Views: 325
Reputation: 61154
Try this
lapply(my.ls, "[", ,c(1,3))
Or, editing your code a little:
lapply(my.ls, function(x) x[, c(1,3)])
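As a quick check that the two forms above do the same thing, here is a minimal sketch using the data from the question. The set.seed() call and the names sub.bracket and sub.fun are my own additions for illustration, not part of the original code.

```r
set.seed(1)  # assumption: fixed seed so the rnorm() column is reproducible
d1 <- data.frame(a = rnorm(5), b = c(rep(2006, times = 4), NA), c = letters[1:5])
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007), c = letters[6:10])
my.ls <- list(d1 = d1, d2 = d2)

# "[" is passed as the function; the empty argument stands for "all rows"
sub.bracket <- lapply(my.ls, "[", , c(1, 3))

# The same subsetting written out with an anonymous function
sub.fun <- lapply(my.ls, function(x) x[, c(1, 3)])

# Compare the two results and look at the remaining column names
print(identical(sub.bracket, sub.fun))
print(lapply(sub.fun, names))
```

The original attempt failed because the second argument of lapply must be a function, whereas my.ls[, c(1,3)] is an expression that R tries to evaluate on the list itself.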
Since @Matthew Plourde already answered the second part of your question using lapply, here is an alternative way to do it using rapply, which is the recursive version of lapply.
rapply(lapply(my.ls, "[", ,c(1,3)), unique, how="list")
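To see what the rapply call returns, the sketch below runs it on the question's data and inspects the structure. The seed and the name uniq are assumptions added for illustration.

```r
set.seed(1)  # assumption: fixed seed so the rnorm() column is reproducible
d1 <- data.frame(a = rnorm(5), b = c(rep(2006, times = 4), NA), c = letters[1:5])
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007), c = letters[6:10])
my.ls <- list(d1 = d1, d2 = d2)

# Subset each data frame first, then apply unique() to every column;
# how = "list" keeps the nested list shape instead of flattening it
uniq <- rapply(lapply(my.ls, "[", , c(1, 3)), unique, how = "list")

# One element per data frame, each holding one vector per kept column
str(uniq)
```

With how = "list" the result is a list of lists, which matches the "list of lists" storage the question mentions.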
Upvotes: 2
Reputation: 44614
You were close: lapply(my.ls, '[', c(1,3)). This calls the indexing function [ on each data.frame with the additional argument c(1,3), specifying the first and third columns.
Equivalently, you could call lapply(my.ls, '[', -2) to remove the second column.
But I would recommend the more intelligible lapply(my.ls, subset, select=c(1,3)).
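The three calls above can be compared side by side. This is only a sketch: it uses d2 alone to keep it short, the result names are hypothetical, and it compares with all.equal rather than identical so that it does not depend on how row names are stored internally.

```r
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007), c = letters[6:10])
my.ls <- list(d2 = d2)  # assumption: a one-element list, smaller than in the question

by.position  <- lapply(my.ls, "[", c(1, 3))              # keep columns 1 and 3
by.exclusion <- lapply(my.ls, "[", -2)                   # drop column 2
by.subset    <- lapply(my.ls, subset, select = c(1, 3))  # same, via subset()

# Check that all three approaches produce the same data frames
print(isTRUE(all.equal(by.position, by.exclusion)))
print(isTRUE(all.equal(by.position, by.subset)))
```

Note that the single-index form treats the data frame as a list of columns, so no empty row argument is needed.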
To go directly from your original list to a list of the unique values in each column of each data.frame, you could use nested lapply statements like so:
lapply(my.ls, function(d) lapply(d[c(1,3)], unique))
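As a rough illustration of how the nested result can be used, the sketch below stores it and then pulls individual vectors out, which covers the "free" vectors mentioned in the question. The seed and the names uniq, d2.c and d2.b.unique are assumptions for illustration; the last line looks at column b, which the question does not extract, only because it contains duplicates.

```r
set.seed(1)  # assumption: fixed seed so the rnorm() column is reproducible
d1 <- data.frame(a = rnorm(5), b = c(rep(2006, times = 4), NA), c = letters[1:5])
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007), c = letters[6:10])
my.ls <- list(d1 = d1, d2 = d2)

# Outer lapply walks the data frames, inner lapply walks the kept columns
uniq <- lapply(my.ls, function(d) lapply(d[c(1, 3)], unique))

# Pull a single vector out of the list of lists
d2.c <- uniq$d2$c
print(d2.c)

# unique() treats NA as a value of its own, shown here on a column with repeats
d2.b.unique <- unique(my.ls$d2$b)
print(d2.b.unique)
```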
Upvotes: 2