Reputation: 163
df1:
a = c(2, 3, 5, 8, 10, 12)
b = c("NA", "bb", "cc", "aa", "bb", "aa")
c = c("bb", "aa", "bb", "cc", "aa", "aa")
d = c("aa", "cc", "bb", "aa", "aa", "aa")
e = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
df1 = data.frame(a, b, c, d, e)
df2:
f = c("aa", "bb")
df2 = data.frame(f)
I want to evaluate df1 columns b,c,d to see if the value is not in df2. If it is, then keep the value. If it's not then input "Rare". Needs to ignore NAs.
Output:
a b c d e
2 NA bb aa true
3 bb aa rare false
5 rare bb bb true
8 aa rare aa false
10 bb aa aa true
12 aa aa aa false
Upvotes: 3
Views: 73
Reputation: 51592
Make sure that your columns are as.character()
, then (and considering your NA is a string just as you define it)
df1[2:4] <- lapply(df1[2:4], function(i)replace(i, !i %in% c('NA', df2$f), 'rare'))
df1
# a b c d e
#1 2 NA bb aa TRUE
#2 3 bb aa rare FALSE
#3 5 rare bb bb TRUE
#4 8 aa rare aa FALSE
#5 10 bb aa aa TRUE
#6 12 aa aa aa FALSE
Upvotes: 4
Reputation: 7818
Another solution with dplyr
, but it uses where
instead.
library(dplyr) # version >=1.0.0
df1 %>% mutate(across(where(is.character), ~ifelse((. %in% df2$f )| is.na(.), ., "rare")))
#> a b c d e
#> 1 2 <NA> bb aa TRUE
#> 2 3 bb aa rare FALSE
#> 3 5 rare bb bb TRUE
#> 4 8 aa rare aa FALSE
#> 5 10 bb aa aa TRUE
#> 6 12 aa aa aa FALSE
Upvotes: 1
Reputation: 39595
Try this:
library(dplyr)
#Data
a = c(2, 3, 5, 8, 10, 12)
b = c(NA, "bb", "cc", "aa", "bb", "aa")
c = c("bb", "aa", "bb", "cc", "aa", "aa")
d = c("aa", "cc", "bb", "aa", "aa", "aa")
e = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
df1 = data.frame(a, b, c, d, e, stringsAsFactors = F)
#Data2
f = c("aa", "bb")
df2 = data.frame(f,stringsAsFactors = F)
#Code
df1 %>% mutate(across(c(b:d), ~ ifelse(.%in% df2$f | is.na(.), ., 'rare')))
Output:
a b c d e
1 2 <NA> bb aa TRUE
2 3 bb aa rare FALSE
3 5 rare bb bb TRUE
4 8 aa rare aa FALSE
5 10 bb aa aa TRUE
6 12 aa aa aa FALSE
Upvotes: 3