Reputation: 25
I'm trying to find the model-predicted value closest to a real observed value within a large dataframe. I believe I need to use lapply, but I'm really not sure. Thanks in advance, SE, and sorry if this is a repeat of a previous post, I looked.
df <- data.frame(pred = rnorm(50, mean = 100, sd = 10),
                 cand = I(replicate(50, expr = I(list(rnorm(6, mean = 100, sd = 10))))))
So far, I've come up with a one-line function that works when run on a single row:
df$closest <- sapply( df, function(x) { which.min( abs( df$pred[x] - df$cand[[x]] ) ) } )
But I have two problems:
1. This function won't work on the full list, probably because I am new to the apply family. It fails with:
Error in df$cand[[x]] : no such index at level 1
2. This function returns a list position, not the actual value, which is what I need.
Upvotes: 2
Views: 179
Reputation: 1053
apply lets us operate over either the rows or the columns of a data frame. Because you want to loop through the rows, a margin of 1 (rows) should get the job done:
df$closest <- apply(df, MARGIN = 1, function(x) { which.min(abs(x$pred - x$cand)) })
Upvotes: 1
Reputation: 887128
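Here, we can use Map instead of sapply. Called on a data frame, sapply loops over each of the columns, so the x passed to the anonymous function is the whole column rather than a row number, and it cannot be used for indexing. Map instead walks over df$pred and df$cand in parallel: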
df$closest <- unlist(Map(function(x, y) which.min(abs(y - x)), df$pred, df$cand))
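The call above returns the position of the closest candidate within each vector. Since the question asks for the value itself, one possible variant is sketched below: it indexes each candidate vector by that position. The column name closest_val, the seed, and the simplified way of building the list column are assumptions made for illustration, not part of the original post.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Same shape as the question's data: one prediction and six candidates per row
df <- data.frame(pred = rnorm(50, mean = 100, sd = 10))
df$cand <- replicate(50, rnorm(6, mean = 100, sd = 10), simplify = FALSE)

# mapply walks pred (x) and cand (y) in parallel;
# y[which.min(...)] returns the candidate value, not its position
df$closest_val <- mapply(function(x, y) y[which.min(abs(y - x))],
                         df$pred, df$cand)
```

Because every call returns a single number, mapply simplifies the result to a plain numeric vector, so no unlist is needed here.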
Or else, with sapply, we have to loop over the row index.
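A minimal sketch of that row-index loop, assuming the same data layout as in the question (the seed and the simplified construction of the list column are assumptions for illustration):

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

df <- data.frame(pred = rnorm(50, mean = 100, sd = 10))
df$cand <- replicate(50, rnorm(6, mean = 100, sd = 10), simplify = FALSE)

# Loop over row numbers 1..nrow(df) instead of over the data frame's columns,
# so i is a valid index into both df$pred and df$cand
df$closest <- sapply(seq_len(nrow(df)),
                     function(i) which.min(abs(df$pred[i] - df$cand[[i]])))
```

seq_len(nrow(df)) is used rather than 1:nrow(df) so that an empty data frame yields an empty loop instead of iterating over c(1, 0).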
Upvotes: 0