Reputation: 77
I have just started working with lists and the lapply function, and I'm having some difficulty. I have a list of several data frames and would like to keep only the data frames that satisfy a specific condition, saving them as a separate list. For instance,
l <- list(data.frame(PPID=1:5, gender=c(rep("male", times=5))),
          data.frame(PPID=1:5, gender=c("male", "female", "male", "male", "female")),
          data.frame(PPID=1:3, gender=c("male", "female", "male")))
print(l)
What I want to do is keep only the data frames that contain both genders (male and female) and save them as another list. So my outcome should be another list which contains only the second and third data frames in l.
Things that I tried include:
ll <- subset(l, lapply(1:length(l), function(i) {
length(levels(l[[i]]$gender)) == 2
}))
ll <- subset(l, lapply(1:length(l), function(i) {
l[[i]]$gender == "male" | l[[i]]$gender == "female"
}))
But both attempts returned a list of length 0. Any help would be greatly appreciated!
Upvotes: 1
Views: 82
Reputation: 50668
This works in base R:
lapply(l, function(x) if (length(unique(x$gender)) == 2) x)
#[[1]]
#NULL
#
#[[2]]
# PPID gender
#1 1 male
#2 2 female
#3 3 male
#4 4 male
#5 5 female
#
#[[3]]
# PPID gender
#1 1 male
#2 2 female
#3 3 male
If you don't want to keep the NULL entries, you can do
l2 <- lapply(l, function(x) if (length(unique(x$gender)) == 2) x)
Filter(Negate(is.null), l2);
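The two steps above can probably be collapsed into one. As a minimal sketch (not part of the original answer), the same condition can be passed to Filter() directly, or evaluated with vapply() and used for logical indexing. The helper name has_both is hypothetical, introduced only for illustration.

```r
# Example data from the question
l <- list(
  data.frame(PPID = 1:5, gender = rep("male", times = 5)),
  data.frame(PPID = 1:5, gender = c("male", "female", "male", "male", "female")),
  data.frame(PPID = 1:3, gender = c("male", "female", "male"))
)

# Hypothetical helper: TRUE when a data frame has exactly two distinct genders
has_both <- function(x) length(unique(x$gender)) == 2

# Option 1: Filter() keeps the elements for which the predicate is TRUE
ll <- Filter(has_both, l)

# Option 2: build a logical vector and index the list with it
keep_idx <- vapply(l, has_both, logical(1))
ll2 <- l[keep_idx]

length(ll)  # 2: the second and third data frames
```

Both options return a plain list with no NULL placeholders, so no clean-up pass is needed afterwards.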
One of the issues with your code is that, while gender is a factor, it doesn't have the same levels in all list elements. You can check:
str(l);
#List of 3
# $ :'data.frame': 5 obs. of 2 variables:
# ..$ PPID : int [1:5] 1 2 3 4 5
# ..$ gender: Factor w/ 1 level "male": 1 1 1 1 1
# $ :'data.frame': 5 obs. of 2 variables:
# ..$ PPID : int [1:5] 1 2 3 4 5
# ..$ gender: Factor w/ 2 levels "female","male": 2 1 2 2 1
# $ :'data.frame': 3 obs. of 2 variables:
# ..$ PPID : int [1:3] 1 2 3
# ..$ gender: Factor w/ 2 levels "female","male": 2 1 2
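A hedged note on why counting levels() is fragile in general: since R 4.0.0, data.frame() no longer converts strings to factors by default, so on current versions gender would be a character column and levels() would return NULL. Even with factors, levels() reports the declared levels, including ones that never occur. The sketch below checks the observed values instead; both_present is a hypothetical helper name, and stringsAsFactors is set explicitly so the behaviour does not depend on the R version.

```r
# Character column: levels() returns NULL for character vectors
chr <- data.frame(PPID = 1:3, gender = c("male", "female", "male"),
                  stringsAsFactors = FALSE)
# Factor column, as in the str() output above
fct <- data.frame(PPID = 1:3, gender = c("male", "female", "male"),
                  stringsAsFactors = TRUE)

length(levels(chr$gender))  # 0
length(levels(fct$gender))  # 2

# A factor can declare levels that never occur in the data
only_male <- factor(rep("male", 5), levels = c("female", "male"))
length(levels(only_male))   # 2, although only "male" is present

# Hypothetical helper: checks observed values, so it works for
# character and factor columns alike
both_present <- function(g) all(c("male", "female") %in% g)

both_present(chr$gender)    # TRUE
both_present(fct$gender)    # TRUE
both_present(only_male)     # FALSE
```

This is also why length(unique(x$gender)) is a safer test than length(levels(x$gender)): it counts the values that are actually present.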
Upvotes: 2
Reputation: 5109
If you're willing to switch to purrr, you can simply use keep():
> library(purrr)
> keep(l, ~ length(unique(.x$gender)) > 1)
[[1]]
PPID gender
1 1 male
2 2 female
3 3 male
4 4 male
5 5 female
[[2]]
PPID gender
1 1 male
2 2 female
3 3 male
Upvotes: 2