cajt

Reputation: 49

Creating new variable based on more than one condition

I'm trying to create a new variable based on some conditions. I have the following data:

df <- data.frame(ID = c("A1","A1","A2","A2","A3","A4","A4"),
                 type = c("small","large","small","large","large","small","large"),
                 code = c("B9", "[0,20]","B9","[20,40]","[0,20]","B9","[40,60]" ))

Which gives:

    ID  type   code
1   A1  small   B9
2   A1  large   [0,20]
3   A2  small   B9
4   A2  large   [20,40]
5   A3  large   [0,20]
6   A4  small   B9
7   A4  large   [40,60]

I want to create a new variable (code2) that holds, for each ID, the value of code from the row where type == "large". So ID A1 should have [0,20] as its code2. I'd like to achieve the following:

    ID  type   code       code2
1   A1  small   B9        [0,20]    
2   A1  large   [0,20]    [0,20] 
3   A2  small   B9        [20,40]
4   A2  large   [20,40]   [20,40]
5   A3  large   [0,20]    [0,20] 
6   A4  small   B9        [40,60]
7   A4  large   [40,60]   [40,60]

With my limited knowledge, I've tried using dplyr and ifelse, but have had no luck.

Upvotes: 2

Views: 52

Answers (2)

ThomasIsCoding

Reputation: 102890

A data.table option

> library(data.table)
> setDT(df)[, code2 := code[type == "large"], by = ID][]
   ID  type    code   code2
1: A1 small      B9  [0,20]
2: A1 large  [0,20]  [0,20]
3: A2 small      B9 [20,40]
4: A2 large [20,40] [20,40]
5: A3 large  [0,20]  [0,20]
6: A4 small      B9 [40,60]
7: A4 large [40,60] [40,60]
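
For comparison, the same per-ID lookup can be sketched in base R without any extra packages. This is only an illustration and not part of the answer above: the helper name large is made up here, and stringsAsFactors = FALSE is set explicitly so that code is a character column regardless of R version.

```r
df <- data.frame(
  ID   = c("A1", "A1", "A2", "A2", "A3", "A4", "A4"),
  type = c("small", "large", "small", "large", "large", "small", "large"),
  code = c("B9", "[0,20]", "B9", "[20,40]", "[0,20]", "B9", "[40,60]"),
  stringsAsFactors = FALSE
)

# Keep only the "large" rows; they act as an ID -> code lookup table.
# (`large` is a hypothetical helper name used for this sketch.)
large <- df[df$type == "large", ]

# For every row of df, find the first "large" row with the same ID
# and take its code. IDs without a "large" row would get NA.
df$code2 <- large$code[match(df$ID, large$ID)]

df
```

Because match returns only the first hit, this version picks the first "large" row if an ID has several, and yields NA when an ID has none.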

Upvotes: 0

akrun

Reputation: 887951

We can use a grouped operation in dplyr: group by 'ID', then extract the 'code' where the 'type' value is "large" (assuming there are no duplicate values for 'type' within each 'ID').

library(dplyr)
df <- df %>% 
   group_by(ID) %>%
   # within each ID, take the code from the row whose type is "large"
   mutate(code2 = code[type == 'large']) %>%
   ungroup()

-output

df
# A tibble: 7 x 4
  ID    type  code    code2  
  <chr> <chr> <chr>   <chr>  
1 A1    small B9      [0,20] 
2 A1    large [0,20]  [0,20] 
3 A2    small B9      [20,40]
4 A2    large [20,40] [20,40]
5 A3    large [0,20]  [0,20] 
6 A4    small B9      [40,60]
7 A4    large [40,60] [40,60]

If there are duplicates, use match, which returns the index of the first match

df <- df %>%
   group_by(ID) %>%
   mutate(code2 = code[match('large', type)]) %>%
   ungroup()
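
To see how the match version behaves on awkward input, here is a small sketch on made-up data (df2 is hypothetical and not from the question): A1 has two "large" rows and A2 has none. With the plain logical subset, code[type == 'large'] would return two values for A1 and zero for A2, which mutate would be expected to reject because the result is neither length 1 nor the group size.

```r
library(dplyr)

# Hypothetical data: A1 has two "large" rows, A2 has none.
df2 <- data.frame(
  ID   = c("A1", "A1", "A1", "A2"),
  type = c("small", "large", "large", "small"),
  code = c("B9", "[0,20]", "[20,40]", "B9"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  group_by(ID) %>%
  # match() gives the position of the first "large" in the group,
  # or NA if the group has no "large" row; code[NA] is then NA.
  mutate(code2 = code[match("large", type)]) %>%
  ungroup()

res$code2
# "[0,20]" "[0,20]" "[0,20]" NA
```

So the first "large" code wins when there are duplicates, and an ID without a "large" row ends up with NA rather than an error.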

Upvotes: 3
