prog
prog

Reputation: 1073

Aggregate and find match

How to use dplyr after grouping the column A so that if one among them eg: 'a1' contains A22 or R22 it have to say Yes for all 'a1' else No in a new column C.

Please help me to achieve this.

Dataframe:

df1 <- data.frame(A= c("a1","a1","a1","b1","b1","b1","c1","c1"),
                  B= c("A22","B2","C2","R22","G2","C2","G2","O2"))
A           B            
a1         A22
a1         B2
a1         C2
b1         R22
b1         G2
b1         C2
c1         G2
c1         O2

Expected Result:

A           B       C
a1         A22      Yes
a1         B2       Yes
a1         C2       Yes 
b1         R22      Yes
b1         G2       Yes
b1         C2       Yes
c1         G2       No
c1         O2       No

Upvotes: 2

Views: 62

Answers (3)

akrun
akrun

Reputation: 887118

We can group by 'A', then if any of the strings 'A22', 'R22' are there in% 'B' to create the 'Yes', 'No' values in 'C'

library(dplyr)
df1 %>%
   group_by(A) %>%
   mutate(C = c("No", "Yes")[1+ (any(c("A22", "R22") %in% B))])
# A tibble: 8 x 3
# Groups:   A [3]
#  A     B     C    
#  <fct> <fct> <chr>
#1 a1    A22   Yes  
#2 a1    B2    Yes  
#3 a1    C2    Yes  
#4 b1    R22   Yes  
#5 b1    G2    Yes  
#6 b1    C2    Yes  
#7 c1    G2    No   
#8 c1    O2    No   

Or we can use base R

with(df1, ave(as.character(B), A, FUN = function(x) 
       c("No", "Yes")[1 + any(c("A22", "R22") %in% x)]))
#[1] "Yes" "Yes" "Yes" "Yes" "Yes" "Yes" "No"  "No" 

Upvotes: 2

Bulat
Bulat

Reputation: 6969

Here is the data.table option

library(data.table)
library(dplyr)

dt <- data.table(
  A= c("a1","a1","a1","b1","b1","b1","c1","c1"),
 B= c("A22","B2","C2","R22","G2","C2","G2","O2"))

dt[, 
    flag := case_when(
      B %in% c("A22", "R22") ~ "Yes",
      TRUE ~ "No"
    )
]
dt[, flag := factor(flag, levels = c("No", "Yes"), ordered = TRUE)]

dt[, flag := max(flag), by = A]

Upvotes: 0

d.b
d.b

Reputation: 32548

df1 %>%
    group_by(A) %>%
    mutate(C = factor(max(B %in% c("A22", "R22")),
                      levels = 0:1,
                      labels = c("No", "Yes"))) %>%
    ungroup()
#> # A tibble: 8 x 3
#>   A     B     C    
#>   <fct> <fct> <fct>
#> 1 a1    A22   Yes  
#> 2 a1    B2    Yes  
#> 3 a1    C2    Yes  
#> 4 b1    R22   Yes  
#> 5 b1    G2    Yes  
#> 6 b1    C2    Yes  
#> 7 c1    G2    No   
#> 8 c1    O2    No

Upvotes: 2

Related Questions