William
William

Reputation: 164

Replace values of a column based on another column having as input a dataframe

With the base::replace function I want to change some values of column z based on the values of column y

library(tidyverse)

(df <- tibble(y = 10:13, z = 20:23))
#> # A tibble: 4 x 2
#>       y     z
#>   <int> <int>
#> 1    10    20
#> 2    11    21
#> 3    12    22
#> 4    13    23

I have the data.frame val where column a the value to be used as condition and column b will be the replacement value.

(val <- tibble(a = c(10, 12), b = c(100, 200)))
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1    10   100
#> 2    12   200

Using the following approach it is possible to get the desired result, but it only works if all the values inside val are inside df

df %>% mutate(z = replace(z, y %in% val$a, val$b))
#> # A tibble: 4 x 2
#>       y     z
#>   <int> <dbl>
#> 1    10   100
#> 2    11    21
#> 3    12   200
#> 4    13    23

For example, if I update val to have values that are not in df, then:

(val <- tibble(a = c(1, 10, 12), b = c(1, 100, 200)))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     1
#> 2    10   100
#> 3    12   200

and I run the code again ...

df %>% mutate(z = replace(z, y %in% val$a, val$b))
#> Warning in x[list] <- values: number of items to replace is not a multiple of
#> replacement length
#> # A tibble: 4 x 2
#>       y     z
#>   <int> <dbl>
#> 1    10     1
#> 2    11    21
#> 3    12   100
#> 4    13    23

There are errors ... How can I fix this?

Created on 2021-02-19 by the reprex package (v1.0.0)

Upvotes: 2

Views: 2312

Answers (3)

akrun
akrun

Reputation: 887108

We don't need any mapas 'a', 'b' and the number of rows of 'df' have the same length. So, an elementwise comparison with == can be done. Instead of replace, it may be better with ifelse/case_when etc as replace values should be of the same length as the list condition TRUE elements

library(dpyr)
df %>%
    mutate(z = case_when(a == y ~ as.integer(b), TRUE ~ z))

-output

# A tibble: 3 x 2
#      y     z
#  <int> <int>
#1    10   100
#2    11   200
#3    12    22

Or using base R

df$z <- with(df, ifelse(a == y, b, z))

In the OP's code, there is a difference in length when we do

 replace(x = z, y == .x, values = .y)

where 'z' will be the full column length, .x , .y will be each row element

Update

Based on the updated data, we could a join and then use coalesce

df %>% 
    left_join(val, by = c('y' = 'a')) %>%
    transmute(y, z = coalesce(b, z))
# A tibble: 4 x 2
#      y     z
#  <dbl> <dbl>
#1    10   100
#2    11    21
#3    12   200
#4    13    23

Upvotes: 1

ThomasIsCoding
ThomasIsCoding

Reputation: 101335

A base R option with replace

transform(
  df,
  z = replace(z, na.omit(match(y, a)), b[na.omit(match(y, a))])
)

gives

   y   z
1 10 100
2 11 200
3 12  22

Upvotes: 1

tmfmnk
tmfmnk

Reputation: 39858

One possibility might be:

df %>%
 mutate(z = coalesce(b[match(y, a)], z))

      y     z
  <int> <dbl>
1    10   100
2    11   200
3    12    22

Upvotes: 3

Related Questions