Adrian Mak

Reputation: 157

Mutate column of dataframe using values stored in different dataframe

I have created two dataframes, with df.1 containing my main data.

ID  A_ratio   B_ratio  C_ratio
1    0.9       7.6      3.5
2    3.1       4.4      0.7     
3    6.3       8.2      1.2

The dataframe cut only contains one row.

A_cut  B_cut  C_cut
 4.5    5.3    2.0

I now want to use the values stored in cut to binarize df.1, turning X_ratio <= X_cut into 1 and X_ratio > X_cut into 0. The new columns could be called X_bin. I've tried the following dplyr approach:

df.2 <- df.1 %>%
  mutate(across(ends_with("ratio"), ~if_else(. <= get(cut[str_replace(cur_column(),"ratio","cut")]), 1, 0)
            .names = "{.col}_bin"))%>%
  rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))
  select(ID, ends_with("bin"))

But I'm unfortunately getting an Error: unexpected symbol. Could someone point out my mistake? The desired output in df.2 would be

ID A_bin B_bin C_bin
1   1     0     0
2   1     1     1
3   0     0     1

Thanks a lot in advance!

Upvotes: 3

Views: 180

Answers (3)

Yuriy Saraykin

Reputation: 8880

A purrr approach:

df <- structure(list(ID = 1:3, A_ratio = c(0.9, 3.1, 6.3),
                     B_ratio = c(7.6, 4.4, 8.2), C_ratio = c(3.5, 0.7, 1.2)),
                class = "data.frame", row.names = c(NA, -3L))

cut <- structure(list(A_cut = 4.5, B_cut = 5.3, C_cut = 2),
                 class = "data.frame", row.names = c(NA, -1L))
library(purrr)
df[-1] <- +map2_dfc(df[-1], cut, ~.x <= .y)
df
#>   ID A_ratio B_ratio C_ratio
#> 1  1       1       0       0
#> 2  2       1       1       1
#> 3  3       0       0       1

Created on 2021-04-02 by the reprex package (v1.0.0)
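
One caveat: map2_dfc pairs the two inputs by position, so the call above relies on the columns of cut being in the same order as df[-1]. If that is not guaranteed, the columns can be aligned by name first. The following is only a minimal base R sketch under the X_ratio / X_cut naming from the question; cutoffs is a hypothetical name chosen here so that base::cut is not masked, and its columns are shuffled on purpose.

```r
df <- data.frame(ID = 1:3,
                 A_ratio = c(0.9, 3.1, 6.3),
                 B_ratio = c(7.6, 4.4, 8.2),
                 C_ratio = c(3.5, 0.7, 1.2))

# Hypothetical cutoff table, deliberately out of order
cutoffs <- data.frame(C_cut = 2, A_cut = 4.5, B_cut = 5.3)

# Derive the matching cutoff column name for every ratio column
ratio_cols <- grep("_ratio$", names(df), value = TRUE)
cut_cols   <- sub("_ratio$", "_cut", ratio_cols)
stopifnot(all(cut_cols %in% names(cutoffs)))

# Compare each ratio column with its own cutoff and convert TRUE/FALSE to 1/0
bins <- Map(function(x, y) as.integer(x <= y),
            df[ratio_cols], cutoffs[cut_cols])
names(bins) <- sub("_ratio$", "_bin", ratio_cols)

res <- cbind(df["ID"], as.data.frame(bins))
res
```

Selecting cutoffs[cut_cols] reorders the cutoffs to match the ratio columns, so the result no longer depends on how the cutoff table happens to be laid out.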

Upvotes: 2

Ronak Shah

Reputation: 388862

Base R answer:

df.1[-1] <- +(sweep(df.1[-1], 2, unlist(cut), `<=`))
df.1

#  ID A_ratio B_ratio C_ratio
#1  1       1       0       0
#2  2       1       1       1
#3  3       0       0       1
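
For readers unfamiliar with sweep: with MARGIN = 2 it applies `<=` between each column and the corresponding element of the cutoff vector, and the unary + turns the logical result into 0/1. The output above keeps the _ratio column names, so here is a step-by-step sketch of the same idea that also renames the columns to X_bin. The names cutoffs, ratios and flags are hypothetical, introduced only for this illustration.

```r
df.1 <- data.frame(ID = 1:3,
                   A_ratio = c(0.9, 3.1, 6.3),
                   B_ratio = c(7.6, 4.4, 8.2),
                   C_ratio = c(3.5, 0.7, 1.2))
cutoffs <- data.frame(A_cut = 4.5, B_cut = 5.3, C_cut = 2)

# Work on a plain numeric matrix of the ratio columns
ratios <- as.matrix(df.1[-1])

# Column j of ratios is compared with element j of the cutoff vector
flags <- sweep(ratios, 2, unlist(cutoffs), `<=`)

# Logical -> integer while keeping the matrix shape and column names
storage.mode(flags) <- "integer"
colnames(flags) <- sub("_ratio$", "_bin", colnames(flags))

df.2 <- cbind(df.1["ID"], as.data.frame(flags))
df.2
```

Like the one-liner, this matches columns by position, so it assumes the cutoffs are stored in the same order as the ratio columns.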

Upvotes: 3

akrun

Reputation: 887008

There is a comma missing before .names. Also, since we are extracting a column from cut, get is not needed; [[ is enough. Finally, using transmute instead of mutate returns only the columns we need, so the last step with select can be removed.

library(dplyr)
library(stringr)
df.1 %>%
  transmute(ID, across(ends_with("ratio"), 
      ~if_else(. <=  cut[[str_replace(cur_column(),"ratio","cut")]], 
            1, 0),
        .names = "{.col}_bin")) %>% 
   rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))

-output

#  ID A_bin B_bin C_bin
#1  1     1     0     0
#2  2     1     1     1
#3  3     0     0     1

As we are returning binary columns, if_else is not really needed. The logical vector can be coerced to binary with as.integer or by wrapping it in +( ):

df.1 %>%
  transmute(ID, across(ends_with("ratio"), 
      ~as.integer(. <=  cut[[str_replace(cur_column(),"ratio","cut")]]),
        .names = "{.col}_bin")) %>% 
   rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))

Note: cut is a function name, so it is better not to name objects with function names
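
To illustrate that note, here is a sketch of the same pipeline with the threshold data frame stored under a different name. cutoffs is a hypothetical name picked for this example, and base R's sub() is swapped in for str_replace so that only dplyr needs to be loaded.

```r
library(dplyr)

df.1 <- data.frame(ID = 1:3,
                   A_ratio = c(0.9, 3.1, 6.3),
                   B_ratio = c(7.6, 4.4, 8.2),
                   C_ratio = c(3.5, 0.7, 1.2))

# Hypothetical replacement name for the 'cut' data frame
cutoffs <- data.frame(A_cut = 4.5, B_cut = 5.3, C_cut = 2)

df.2 <- df.1 %>%
  transmute(ID, across(ends_with("ratio"),
                       # look up the cutoff whose name matches the current column
                       ~ as.integer(.x <= cutoffs[[sub("ratio", "cut", cur_column())]]),
                       .names = "{.col}_bin")) %>%
  # A_ratio_bin -> A_bin, and so on
  rename_with(~ sub("_ratio", "", .x), contains("_ratio_"))
df.2
```

With a distinct name, a typo or a missing object produces an "object not found" error instead of R silently picking up the base function cut.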

data

df.1 <- structure(list(ID = 1:3, A_ratio = c(0.9, 3.1, 6.3), B_ratio = c(7.6, 
4.4, 8.2), C_ratio = c(3.5, 0.7, 1.2)), class = "data.frame", row.names = c(NA, 
-3L))

cut <- structure(list(A_cut = 4.5, B_cut = 5.3, C_cut = 2), class = "data.frame",
row.names = c(NA, 
-1L))

Upvotes: 3
