ana_gg

Reputation: 370

Insert zeros in a data frame based on values in another data frame in R

I have two large data frames in R that look like this:

1 1 1 1 1 2 2 2 2 2
2 2 2 3 3 3 3 3 3 3 
4 4 4 1 1 1 1 1 1 1
5 5 5 5 4 4 5 5 1 1

and

0.98 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
0.99 0.97 0.99 0.98 0.99 0.97 0.97 0.89 0.90 0.98
0.79 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
0.89 0.98 0.99 0.98 0.99 0.97 0.96 0.89 0.91 0.91

Both data frames have the same dimensions. I would like a 0 in the second data frame wherever the first has a 5 in the same position, so the result would look like:

0.98 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
0.99 0.97 0.99 0.98 0.99 0.97 0.97 0.89 0.90 0.98
0.79 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
0    0    0    0    0.99 0.97 0    0    0.91 0.91

But I cannot do it column by column because I have almost 50,000 columns. I tried ifelse but couldn't get it to work. Any ideas?

Thank you very much!

Upvotes: 0

Views: 153

Answers (1)

Ian Campbell

Reputation: 24790

This one is surprisingly easy:

df2[df1==5] <- 0
df2
    V1   V2   V3   V4   V5   V6   V7   V8   V9  V10
1 0.98 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
2 0.99 0.97 0.99 0.98 0.99 0.97 0.97 0.89 0.90 0.98
3 0.79 0.97 0.99 0.98 0.99 0.97 0.96 0.89 0.90 0.99
4 0.00 0.00 0.00 0.00 0.99 0.97 0.00 0.00 0.91 0.91

It works because df1==5 returns a logical matrix with the same dimensions as df2. Assigning through that matrix replaces only the cells of df2 where it is TRUE:

df1==5
        V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
[1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[4,]  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE

Data:

df1 <- structure(list(V1 = c(1L, 2L, 4L, 5L), V2 = c(1L, 2L, 4L, 5L), 
        V3 = c(1L, 2L, 4L, 5L), V4 = c(1L, 3L, 1L, 5L), V5 = c(1L, 
        3L, 1L, 4L), V6 = c(2L, 3L, 1L, 4L), V7 = c(2L, 3L, 1L, 5L
        ), V8 = c(2L, 3L, 1L, 5L), V9 = c(2L, 3L, 1L, 1L), V10 = c(2L, 
        3L, 1L, 1L)), row.names = c(NA, 4L), class = "data.frame")

# Fourth-row values restored from the question's input, so the
# replacement above has something to change
df2 <- structure(list(V1 = c(0.98, 0.99, 0.79, 0.89), V2 = c(0.97, 0.97, 
    0.97, 0.98), V3 = c(0.99, 0.99, 0.99, 0.99), V4 = c(0.98, 0.98, 0.98, 
    0.98), V5 = c(0.99, 0.99, 0.99, 0.99), V6 = c(0.97, 0.97, 0.97, 
    0.97), V7 = c(0.96, 0.97, 0.96, 0.96), V8 = c(0.89, 0.89, 0.89, 
    0.89), V9 = c(0.9, 0.9, 0.9, 0.91), V10 = c(0.99, 0.98, 0.99, 0.91
    )), row.names = c(NA, 4L), class = "data.frame")
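
To see the mechanism in isolation, here is a minimal sketch on toy data. The names codes, values, mask, zeroed and zeroed2 are made up for illustration and do not come from the question. It also shows one possible way to get the same result with ifelse(), which the question mentions trying, by pairing up the columns with Map(). That variant assumes both data frames have the same columns in the same order.

```r
# Hypothetical toy data (not the asker's): two data frames of equal shape.
codes  <- data.frame(a = c(1L, 5L, 3L), b = c(5L, 2L, 5L))
values <- data.frame(a = c(0.98, 0.97, 0.99), b = c(0.89, 0.90, 0.91))

# Comparing a data frame with a scalar yields a logical matrix of the same shape.
mask <- codes == 5
stopifnot(is.matrix(mask), is.logical(mask), all(dim(mask) == dim(values)))

# Approach from the answer: assign 0 wherever the mask is TRUE.
zeroed <- values
zeroed[mask] <- 0

# Alternative sketch with ifelse(): Map() walks over the two data frames
# column pair by column pair, and `[] <-` keeps the data frame structure.
zeroed2 <- values
zeroed2[] <- Map(function(code, val) ifelse(code == 5, 0, val), codes, values)

print(zeroed)
```

The logical-matrix assignment is usually the simpler choice. The Map() version may be handy if the replacement has to depend on more than a single comparison.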

Upvotes: 1
