For each row, replace values from specific columns (defined by another dataframe), with a value from a vector

Question

Let's say we have:

set.seed(42)
df1 <- data.frame(v1=rnorm(10) , v2=rnorm(10), v3=rnorm(10), v4=rnorm(10))

as well as

df2 <- data.frame(v1=rnorm(10) , v2=rnorm(10), v3=rnorm(10), v4=rnorm(10))
vector <- c(17,21,33,41,50,63,72,81,91,10)

df1 and df2 have the same column names, and df2 is generated by processing df1.

For each row in df2, I would like to replace every value whose counterpart in df1 is < 0.5 with the corresponding element of the vector.

For example, if any of the columns of the first row in df1 has a value lower than 0.5, then the corresponding column(s) of the first row in df2 will have to be replaced with the first element of the vector, that is 17. For the second row, they will be replaced with 21 etc.

I imagine some apply call with a custom function would do the trick, but I am not able to figure it out. Thank you in advance for the solution.

markus · Accepted Answer

1)

My approach was:

idx <- df1 < .5
tmp <- idx * vector
df2[idx] <- tmp[idx]

2)

A second option provided by @MartinGal in the comments:

df2 <- df2 * (df1 >= 0.5) + (df1 < 0.5) * vector
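
To see what the arithmetic form in option 2 does, here is a minimal sketch on made-up data. The names a, b and vals are hypothetical stand-ins for df1, df2 and vector, and the values are invented for illustration. The expression builds a new data frame, so its result has to be stored to take effect; the sketch also compares it with option 1.

```r
# Toy stand-ins for df1, df2 and vector (hypothetical values)
a    <- data.frame(v1 = c(0.1, 0.9, 0.7), v2 = c(0.8, 0.2, 0.3))
b    <- data.frame(v1 = c(1, 2, 3),       v2 = c(4, 5, 6))
vals <- c(10, 20, 30)

# Option 2: keep b where a >= 0.5 (mask of 1s), zero it elsewhere,
# then add the row's element of vals where a < 0.5
opt2 <- b * (a >= 0.5) + (a < 0.5) * vals

# Option 1 on a copy of b, for comparison
opt1 <- b
idx  <- a < 0.5
opt1[idx] <- (idx * vals)[idx]

opt2
all.equal(opt1, opt2)
```

One difference worth keeping in mind: option 2 multiplies the cells to be replaced by 0, so an NA or Inf in such a cell of df2 would come out as NA or NaN instead of the vector value. The indexed assignment in option 1 overwrites the cell and does not have that issue.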

Result is

df2
#           v1            v2          v3         v4
#1  -1.4936251  5.676206e-01 -0.08610730 17.0000000
#2  21.0000000  2.100000e+01 -0.88767902 21.0000000
#3  33.0000000  6.288407e-05 33.00000000 33.0000000
#4  41.0000000  1.122890e+00 -0.02944488 41.0000000
#5  50.0000000  5.000000e+01 50.00000000 50.0000000
#6  -0.4282589  6.300000e+01 63.00000000 63.0000000
#7  72.0000000  7.200000e+01 72.00000000 72.0000000
#8  81.0000000  8.100000e+01 81.00000000 -0.8002822
#9  -1.2247480  9.100000e+01 91.00000000 91.0000000
#10  0.1795164 -5.246948e-02 10.00000000 10.0000000

To explain option 1: we first check at which positions df1 is < .5 and multiply the resulting logical matrix by vector, which gives this matrix:

idx <- df1 < .5
tmp <- (idx) * vector
tmp
#      v1 v2 v3 v4
# [1,]  0  0  0 17
# [2,] 21 21  0 21
# [3,] 33  0 33 33
# [4,] 41  0  0 41
# [5,] 50 50 50 50
# [6,]  0 63 63 63
# [7,] 72 72 72 72
# [8,] 81 81 81  0
# [9,]  0 91 91 91
#[10,]  0  0 10 10

The non-zero entries are the values you want to insert in df2 at the positions where idx is TRUE.
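
The multiplication lines up with the rows because R stores a matrix column by column and recycles the shorter operand in that order. Since the vector has one element per row, element i meets row i in every column. A small sketch with a hypothetical 3 x 2 logical matrix (m and vals are made-up names):

```r
# Hypothetical 3 x 2 logical matrix and a vector with one value per row
m    <- matrix(c(TRUE, FALSE, TRUE,
                 FALSE, FALSE, TRUE), nrow = 3)
vals <- c(10, 20, 30)

# TRUE counts as 1 and FALSE as 0; vals is recycled down each column
m * vals
#      [,1] [,2]
# [1,]   10    0
# [2,]    0    0
# [3,]   30   30
```

This only gives the intended result when the vector has exactly as many elements as the data frame has rows.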

So the next step is to replace those values in df2 using a logical matrix, i.e. idx:

df2[idx] <- tmp[idx]
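
As an alternative sketch (not part of the original answer), the intermediate matrix can be skipped by looking up the row number of each TRUE position with base R's row(). The toy data and the names a, b and vals are assumptions for illustration and stand in for df1, df2 and vector.

```r
# Toy stand-ins for df1, df2 and vector (hypothetical values)
a    <- data.frame(v1 = c(0.1, 0.9, 0.7), v2 = c(0.8, 0.2, 0.3))
b    <- data.frame(v1 = c(1, 2, 3),       v2 = c(4, 5, 6))
vals <- c(10, 20, 30)

idx  <- a < 0.5          # logical matrix, TRUE where a value must be replaced
rows <- row(idx)[idx]    # row number of each TRUE, in column-major order

# Assignment through a logical matrix fills positions in the same
# column-major order, so each cell receives the value for its own row
b[idx] <- vals[rows]
b
```

Because nothing is multiplied here, this variant does not depend on TRUE and FALSE being coerced to 1 and 0.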
