Reputation: 10131
I have two dataframes
#df1
type <- c("A", "B", "C")
day_start <- c(5,8,4)
day_end <- c(12,10,11)
df1 <- cbind.data.frame(type, day_start, day_end)
df1
type day_start day_end
1 A 5 12
2 B 8 10
3 C 4 11
#df2
value <- 1:10
day <- 4:13
df2 <- cbind.data.frame(day, value)
df2
day value
1 4 1
2 5 2
3 6 3
4 7 4
5 8 5
6 9 6
7 10 7
8 11 8
9 12 9
10 13 10
I would like to subset df2 such that each level of factor "type" in df1 gets its own dataframe, only including the rows/days between day_start and day_end of this factor level.
The desired outcome for "A" would be:
list_of_dataframes$df_A
day value
1 5 2
2 6 3
3 7 4
4 8 5
5 9 6
6 10 7
7 11 8
8 12 9
I found a question on SO whose answer suggests using mapply(), but I cannot figure out how to adapt the code given there to my data and desired outcome. Can someone help me out?
Upvotes: 2
Views: 1198
Reputation: 11617
Yes, you can use mapply:
Define a function that will do what you want:
fun <- function(x,y) df2[df2$day >= x & df2$day <= y,]
Then use mapply to apply this function with every element of day_start and day_end:
final.output <- mapply(fun,df1$day_start, df1$day_end, SIMPLIFY=FALSE)
This will give you a list with the outputs you want:
final.output
[[1]]
day value
2 5 2
3 6 3
4 7 4
5 8 5
6 9 6
7 10 7
8 11 8
9 12 9
[[2]]
day value
5 8 5
6 9 6
7 10 7
[[3]]
day value
1 4 1
2 5 2
3 6 3
4 7 4
5 8 5
6 9 6
7 10 7
8 11 8
You can name each data.frame of the list with setNames:
final.output <- setNames(final.output,df1$type)
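Putting the steps above together, here is a minimal self-contained sketch. The helper name subset_days and the "df_" prefix are my own choices for illustration (the prefix matches the list_of_dataframes$df_A access shown in the question); the data is rebuilt with data.frame() so the block runs on its own.

```r
# Rebuild the example data from the question
df1 <- data.frame(type = c("A", "B", "C"),
                  day_start = c(5, 8, 4),
                  day_end = c(12, 10, 11))
df2 <- data.frame(day = 4:13, value = 1:10)

# Hypothetical helper: rows of df2 whose day lies in [start, end]
subset_days <- function(start, end) df2[df2$day >= start & df2$day <= end, ]

# One data.frame per row of df1, named df_A, df_B, df_C
list_of_dataframes <- setNames(
  mapply(subset_days, df1$day_start, df1$day_end, SIMPLIFY = FALSE),
  paste0("df_", as.character(df1$type))
)

# Access a single subset by name
list_of_dataframes$df_B
```

The as.character() call is only a precaution for older R versions, where the type column may be stored as a factor.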
Alternatively, you can attach a type attribute to each data.frame in the list:
fun <- function(x,y, type){
df <- df2[df2$day >= x & df2$day <= y,]
attr(df, "type") <- as.character(type)
df
}
Then each data.frame of final.output will have an attribute so you know which type it is:
final.output <- mapply(fun,df1$day_start, df1$day_end,df1$type, SIMPLIFY=FALSE)
# check which type the first data.frame is
attr(final.output[[1]], "type")
[1] "A"
Finally, if you do not want a list with the 3 data.frames, you can create a function that assigns the 3 data.frames to the global environment:
fun <- function(x,y, type){
df <- df2[df2$day >= x & df2$day <= y,]
name <- as.character(type)
assign(name, df, pos=.GlobalEnv)
}
mapply(fun,df1$day_start, df1$day_end, type=df1$type, SIMPLIFY=FALSE)
This will create 3 separate data.frames in the global environment named A, B and C.
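If you already have the named list, base R's list2env() can do the same job without calling assign() inside the function. This is a sketch under my own assumptions, not part of the approach above: the names pieces and target are hypothetical, and a fresh environment is used so the example does not touch your workspace.

```r
# Rebuild the example data from the question
df1 <- data.frame(type = c("A", "B", "C"),
                  day_start = c(5, 8, 4),
                  day_end = c(12, 10, 11))
df2 <- data.frame(day = 4:13, value = 1:10)

# Build the list of subsets and name it by type
pieces <- Map(function(start, end) df2[df2$day >= start & df2$day <= end, ],
              df1$day_start, df1$day_end)
names(pieces) <- as.character(df1$type)

# Copy every list element into an environment as its own variable.
# Passing envir = .GlobalEnv instead would create A, B and C in the workspace.
target <- new.env()
list2env(pieces, envir = target)

ls(target)
```

Keeping the data.frames in a list is usually easier to loop over; copying them into an environment is mostly a convenience for interactive work.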
Upvotes: 2
Reputation: 44575
The following solution assumes that all day values are integers. If that assumption holds, it's an easy one-liner:
> apply(df1, 1, function(x) df2[df2$day %in% x[2]:x[3],])
[[1]]
day value
2 5 2
3 6 3
4 7 4
5 8 5
6 9 6
7 10 7
8 11 8
9 12 9
[[2]]
day value
5 8 5
6 9 6
7 10 7
[[3]]
day value
1 4 1
2 5 2
3 6 3
4 7 4
5 8 5
6 9 6
7 10 7
8 11 8
You can use setNames to name the dataframes in the list:
setNames(apply(df1, 1, function(x) df2[df2$day %in% x[2]:x[3],]),df1[,1])
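If the days are not all integers, a range comparison can replace %in%. The sketch below is an assumption-laden variant, not the one-liner above: df2_half is a hypothetical version of df2 with half-day steps, and Map() is used instead of apply() so the numeric columns are passed as numbers rather than going through apply()'s conversion of the mixed-type data frame to a character matrix.

```r
# Example data from the question
df1 <- data.frame(type = c("A", "B", "C"),
                  day_start = c(5, 8, 4),
                  day_end = c(12, 10, 11))

# Hypothetical variant of df2 with half-day steps (not in the original question)
df2_half <- data.frame(day = seq(4, 13, by = 0.5))
df2_half$value <- seq_along(df2_half$day)

# Compare against the interval bounds instead of matching an integer sequence
res <- setNames(
  Map(function(start, end) df2_half[df2_half$day >= start & df2_half$day <= end, ],
      df1$day_start, df1$day_end),
  as.character(df1$type)
)

# Number of rows kept for each type
sapply(res, nrow)
```

With %in% x[2]:x[3], the half-day rows would be dropped because the sequence only contains whole numbers.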
Upvotes: 3