Reputation: 867
I am trying to write a function that builds a list of data frame subsets from a user-provided vector of columns and a list of values to subset by within each column.
Example data frame:
df <- data.frame(var1 = rep(1:3, each = 5),
                 var2 = rep(4:6, each = 5),
                 var3 = rep(7:9, each = 5))
Vector of columns to subset:
cols.df <- c(1,2,3)
List of values within each column to subset by:
rows.df <- list(c(1:3), c(4:6), c(7:9))
Function to iteratively create a list of subsets:
subsetfcn <- function(data, cols, rowslist){
  df <- data
  listofdfs <- list() # create list to contain subsets
  for(a in cols){
    for(rows in rowslist) {
      for(row in rows) {
        df <- df[df[ , a]==row, ]
        listofdfs[[row]] <- df
      }
    }
  }
  return(listofdfs)
}
results <- subsetfcn(df, cols.df, rows.df)
The expected output is a list of:
> df[df[ , 1]==1, ]
var1 var2 var3
1 1 4 7
2 1 4 7
3 1 4 7
4 1 4 7
5 1 4 7
> df[df[ , 1]==2, ]
var1 var2 var3
6 2 5 8
7 2 5 8
8 2 5 8
9 2 5 8
10 2 5 8
> df[df[ , 1]==3, ]
var1 var2 var3
11 3 6 9
12 3 6 9
13 3 6 9
14 3 6 9
15 3 6 9
>
> df[df[ , 2]==4, ]
var1 var2 var3
1 1 4 7
2 1 4 7
3 1 4 7
4 1 4 7
5 1 4 7
> df[df[ , 2]==5, ]
var1 var2 var3
6 2 5 8
7 2 5 8
8 2 5 8
9 2 5 8
10 2 5 8
> df[df[ , 2]==6, ]
var1 var2 var3
11 3 6 9
12 3 6 9
13 3 6 9
14 3 6 9
15 3 6 9
etc....
As of now, the function returns a list of 9 data frames, but each has no rows. I'm not sure why the correct values are not being passed to the loop variables a and row.
Upvotes: 2
Views: 209
Reputation: 56159
Using mapply to pair each column with its own set of values:
res <- unlist(
  mapply(function(cols.df, rows.df){
    lapply(rows.df, function(x){ df[ df[ , cols.df ] == x, ] })
  }, cols.df, rows.df, SIMPLIFY = FALSE),
  recursive = FALSE)
# check output
length(res)
# [1] 9
res[1:2]
# [[1]]
# var1 var2 var3
# 1 1 4 7
# 2 1 4 7
# 3 1 4 7
# 4 1 4 7
# 5 1 4 7
#
# [[2]]
# var1 var2 var3
# 6 2 5 8
# 7 2 5 8
# 8 2 5 8
# 9 2 5 8
# 10 2 5 8
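As for why the loop in the question ends up with empty data frames, two things seem to be going on. First, df is overwritten inside the loop, so each filter is applied to the already-filtered result (after keeping var1 == 1, filtering for var1 == 2 leaves nothing). Second, every column is tested against every set of values instead of only its own, and later iterations overwrite earlier list entries. Below is a minimal sketch of a corrected loop, assuming the i-th element of rowslist belongs to the i-th entry of cols; subset_by_cols is a hypothetical name, not something from the question.

```r
df <- data.frame(var1 = rep(1:3, each = 5),
                 var2 = rep(4:6, each = 5),
                 var3 = rep(7:9, each = 5))
cols.df <- c(1, 2, 3)
rows.df <- list(1:3, 4:6, 7:9)

# Hypothetical helper: one subset per (column, value) pair.
# Assumes rowslist[[i]] holds the values for column cols[i].
subset_by_cols <- function(data, cols, rowslist) {
  stopifnot(length(cols) == length(rowslist))
  out <- list()
  for (i in seq_along(cols)) {
    for (value in rowslist[[i]]) {
      # Always filter the original data, never a previously filtered copy,
      # and append to the list instead of indexing it by the value itself.
      out[[length(out) + 1]] <- data[data[, cols[i]] == value, ]
    }
  }
  out
}

results <- subset_by_cols(df, cols.df, rows.df)
length(results)
```

Appending with length(out) + 1 rather than indexing by the value avoids gaps or collisions in the list when values are large or repeat across columns.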
Upvotes: 2