The Governor

Reputation: 397

How to chain 2 lapply functions to subset dataframes in R?

I have a list of 3 data frames and another list of 3 vectors of IDs. I'd like to subset each data frame by its matching vector: keep the rows of the 1st data frame whose id is in the 1st vector, the rows of the 2nd data frame whose id is in the 2nd vector, and likewise for the 3rd. I can get close with nested lapply calls, but the result is a list of 3 lists, each containing one data frame subsetted by each of the 3 ID vectors (9 subsets in total).

What I want is a list of 3 data frames: the 1st made of the rows of the 1st data frame whose id is in the 1st vector of IDs, the 2nd made of the rows of the 2nd data frame whose id is in the 2nd vector of IDs, and so on.

# Pool of 20 possible IDs: "ID_1" ... "ID_20"
n <- 1:20
id <- paste0("ID_", n)

# Three data frames of random values, each with a randomly drawn id column
df1 <- data.frame(replicate(3, sample(0:10, 10, replace = TRUE)))
df1$id <- sample(id, 10, replace = TRUE)

df2 <- data.frame(replicate(3, sample(0:10, 7, replace = TRUE)))
df2$id <- sample(id, 7, replace = TRUE)

df3 <- data.frame(replicate(3, sample(0:10, 8, replace = TRUE)))
df3$id <- sample(id, 8, replace = TRUE)

list_df <- list(df1, df2, df3)
list_id <- list(c("ID_13", "ID_1", "ID_5"),
                c("ID_1", "ID_17", "ID_4", "ID_9"),
                c("ID_12", "ID_18"))

# Current attempt: crosses every data frame with every ID vector,
# so it returns a list of 3 lists instead of a list of 3 data frames
subset_df <- lapply(list_df, function(x) {
  lapply(list_id, function(y) x[x$id %in% y, ])
})

Thanks for your help!

Upvotes: 1

Views: 359

Answers (1)

tushaR

Reputation: 3116

As Nicola suggested, you can use Map or mapply. mapply takes several vectors or lists (normally of the same length) and calls the function once per index, passing it the elements found at that position in each of them.
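To make the index-wise pairing concrete, here is a minimal sketch on toy vectors (the values are made up purely for illustration and are not from the question):

```r
# mapply calls the function once per index: f(1, 10), f(2, 20), f(3, 30)
sums <- mapply(function(x, y) x + y, 1:3, c(10, 20, 30))
sums
# [1] 11 22 33

# With SIMPLIFY = FALSE the results stay in a list instead of being collapsed
pairs <- mapply(function(x, y) c(x, y), 1:3, c(10, 20, 30), SIMPLIFY = FALSE)
length(pairs)
# [1] 3
```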

In your example, mapply passes the first data frame in list_df and the first vector in list_id to the function as df and id, subsets the data frame, and then does the same for the second and third pairs. SIMPLIFY = FALSE keeps the result as a list of data frames.

mapply(function(df, id) df[df$id %in% id, ], list_df, list_id, SIMPLIFY = FALSE)
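Map is a thin wrapper around mapply with SIMPLIFY = FALSE, so the same idea can be written a little more compactly. A sketch on small hand-made data frames (toy_df and toy_id are hypothetical names, not the data from the question), chosen so the result is easy to check by eye:

```r
# Hypothetical toy inputs: two data frames and two matching ID vectors
toy_df <- list(
  data.frame(x = 1:3, id = c("ID_1", "ID_2", "ID_3"), stringsAsFactors = FALSE),
  data.frame(x = 4:6, id = c("ID_1", "ID_4", "ID_5"), stringsAsFactors = FALSE)
)
toy_id <- list(c("ID_1", "ID_3"), c("ID_5"))

# Pair the i-th data frame with the i-th ID vector and keep the matching rows
res <- Map(function(df, ids) df[df$id %in% ids, , drop = FALSE], toy_df, toy_id)

length(res)   # 2: one data frame per input pair
res[[1]]$id   # "ID_1" "ID_3"
res[[2]]$x    # 6
```

The argument is called ids here only to avoid confusion with the id column; the call is otherwise the same as the mapply version.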

Upvotes: 1
