Reputation: 47
I'm trying to apply a function to a list of two-dimensional data.
The data I am working on takes measurements over time from many probes. I apply a time index to the matrix that resets when the probe changes.
I have achieved this by transforming the list into individual data frames. However, as my dataset grows, I would like to use something from the lapply() family instead.
This is the approach that works for a single data frame:
source = c(1,1,1,2,2,2,3,3,3,4,4,4)
df1 = data.frame(source)
df1$elapsedTime <- (ave(df1$source, df1$source, FUN = seq_along))
df1
# source elapsedTime
# 1 1 1
# 2 1 2
# 3 1 3
# 4 2 1
# 5 2 2
# 6 2 3
# 7 3 1
# 8 3 2
# 9 3 3
# 10 4 1
# 11 4 2
# 12 4 3
I would like to use a function from the lapply()/Map() family to run this process over a list of similar data frames from different experiments.
Upvotes: 0
Views: 49
Reputation: 6969
I think this should give you a starting point for the lapply() approach you want.
Code:
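library(dplyr)  # provides bind_rows()
source = c(1,1,1,2,2,2,3,3,3,4,4,4)
df.in = data.frame(source)
# split into one data frame per probe
df.list <- split(df.in, f = df.in$source)
res <- lapply(df.list, function(df){
  # index restarts at 1 within each probe
  df$elapsedTime <- seq_along(df$source)
  return(df)
})
# stack the pieces back into a single data frame
df.out <- bind_rows(res)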
df.out
# source elapsedTime
# 1 1 1
# 2 1 2
# 3 1 3
# 4 2 1
# 5 2 2
# 6 2 3
# 7 3 1
# 8 3 2
# 9 3 3
# 10 4 1
# 11 4 2
# 12 4 3
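One thing to watch with the split/lapply/bind approach above: the rows come back grouped by probe, which only matches the input order when the data is already sorted by source. A minimal base R sketch, assuming the readings might be interleaved (the sample data here is made up for illustration), uses unsplit() to put the rows back in their original positions:

```r
# Hypothetical input: readings from three probes, interleaved
df.in <- data.frame(source = c(1, 2, 1, 2, 1, 3))

# One data frame per probe; order within each probe is preserved
pieces <- split(df.in, df.in$source)

# Add a counter that restarts at 1 inside each probe
pieces <- lapply(pieces, function(d) {
  d$elapsedTime <- seq_len(nrow(d))
  d
})

# unsplit() reverses split(), restoring the original row order
df.out <- unsplit(pieces, df.in$source)
df.out
```

This also avoids the dplyr dependency, at the cost of being a little less familiar than bind_rows().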
Note that the data.table package has dedicated functions for this as well, which can be handy for larger datasets. Also, if you just want to do a calculation within a group, it is simpler to use data.table:
library(data.table)
dt = data.table(source)
dt[, elapsedTime := 1:.N, by = source]
Upvotes: 1
Reputation: 19716
If I understand correctly, your data is a list of data frames like the one in the example posted. If that is the case:
Data:
lis = list(df1 = data.frame(source = c(1,1,1,2,2,2,3,3,3,4,4,4)),
df2 = data.frame(source = rep(1:5, each = 4)))
Function:
lapply(lis, function(x){
elapsedTime = ave(x[,1], x[,1], FUN = seq_along)
return(data.frame(x, elapsedTime))
}
)
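Since the question mentions the Map family, here is a sketch of the same idea with Map(), assuming you also want to record which experiment each row came from. The helper name add_elapsed and the experiment column are my own inventions for illustration, not part of the original data:

```r
lis <- list(df1 = data.frame(source = c(1,1,1,2,2,2,3,3,3,4,4,4)),
            df2 = data.frame(source = rep(1:5, each = 4)))

# Per-frame step from the lapply() call above, pulled into a helper
# (hypothetical name)
add_elapsed <- function(x) {
  x$elapsedTime <- ave(x$source, x$source, FUN = seq_along)
  x
}

# Map() walks the frames and their names in parallel, so each result
# can carry a label for its experiment (hypothetical column)
tagged <- Map(function(x, id) {
  x <- add_elapsed(x)
  x$experiment <- id
  x
}, lis, names(lis))

# Optional: stack everything into one data frame
combined <- do.call(rbind, tagged)
```

tagged is still a list with one data frame per experiment, so you can stop there if you prefer to keep the experiments separate.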
If I am mistaken please comment.
Upvotes: 1