MD_1977

Reputation: 81

R: How to sample a different column for each row of a dataframe?

I want to sample one column per row of a data frame, using different weights for each row. I have tried a few approaches and looked at similar questions, without success. A mock data frame and the expected output are shown below.

library(plyr)
set.seed(12345)
df1 <- mdply(data.frame(mean=c(10, 15, 12, 24)), rnorm, n = 5, sd = 1)
df1

I am hoping for a vectorized solution that samples one column from V1 to V5 for each row. The sampling weights for a row are that row's values in V1 to V5. The actual data frame could have a couple of million rows. A sample output is shown below.

f_col <- c(10,15,12,24)
sampled_column <- c("V3", "V1", "V5", "V5")

output_df1 <- data.frame("mean" = f_col, "result" = sampled_column)
output_df1

Upvotes: 0

Views: 783

Answers (1)

GKi

Reputation: 39657

sample() takes a prob argument that gives the weight of each element; the weights need not sum to one. To do this for every row, call it through apply() over the rows.

output_df1 <- data.frame(
  mean   = df1$mean,
  # For each row, draw one column name with probability proportional to its value
  result = apply(df1[, -1], 1, function(x) sample(names(x), 1, prob = x))
)
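Since the question asks for a vectorized approach for a couple of million rows, here is a minimal sketch of one alternative that avoids calling sample() once per row. It is not part of the answer above: it swaps in the exponential-race trick, dividing each weight by an independent Exp(1) draw and taking the row-wise maximum, which selects column j with probability proportional to its weight. The helper name sample_col_per_row is hypothetical, the mock data is rebuilt in base R instead of plyr (so the values may differ from the question's df1), and the weights are assumed to be non-negative with a positive sum in every row.

```r
# Hypothetical helper (not from the original answer): pick one column name per
# row, with probability proportional to that row's weights.
sample_col_per_row <- function(w) {
  w <- as.matrix(w)
  # Assumption: weights are non-negative and no row is all zero
  stopifnot(all(w >= 0), all(rowSums(w) > 0))
  # One independent Exp(1) draw per cell
  e <- matrix(rexp(length(w)), nrow = nrow(w))
  # argmax of w / e per row equals argmin of e / w, the winner of the race
  colnames(w)[max.col(w / e, ties.method = "first")]
}

# Mock data similar to the question, built with base R only
set.seed(12345)
means <- c(10, 15, 12, 24)
vals <- t(sapply(means, function(m) rnorm(5, mean = m, sd = 1)))
colnames(vals) <- paste0("V", 1:5)
df1 <- data.frame(mean = means, vals)

output_df1 <- data.frame(mean = df1$mean,
                         result = sample_col_per_row(df1[, -1]))
output_df1
```

The only per-row work left is max.col(), which runs in compiled code, so this should scale better than apply() on large inputs; it is worth benchmarking on your own data before relying on it.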

Upvotes: 1
