Reputation: 49
I'm trying to create a new variable based on some conditions. I have the following data:
df <- data.frame(ID = c("A1","A1","A2","A2","A3","A4","A4"),
type = c("small","large","small","large","large","small","large"),
code = c("B9", "[0,20]","B9","[20,40]","[0,20]","B9","[40,60]" ))
Which gives:
ID type code
1 A1 small B9
2 A1 large [0,20]
3 A2 small B9
4 A2 large [20,40]
5 A3 large [0,20]
6 A4 small B9
7 A4 large [40,60]
I want to create a new variable (code2) that, within each ID, takes the value of code from the row where type == "large". So ID A1 should have [0,20] as its code2. I'd like to achieve the following:
ID type code code2
1 A1 small B9 [0,20]
2 A1 large [0,20] [0,20]
3 A2 small B9 [20,40]
4 A2 large [20,40] [20,40]
5 A3 large [0,20] [0,20]
6 A4 small B9 [40,60]
7 A4 large [40,60] [40,60]
With my limited knowledge, I've been trying to use dplyr and ifelse, but have had no luck.
Upvotes: 2
Views: 52
Reputation: 102890
A data.table option
> library(data.table)
> setDT(df)[, code2 := code[type == "large"], ID][]
ID type code code2
1: A1 small B9 [0,20]
2: A1 large [0,20] [0,20]
3: A2 small B9 [20,40]
4: A2 large [20,40] [20,40]
5: A3 large [0,20] [0,20]
6: A4 small B9 [40,60]
7: A4 large [40,60] [40,60]
Upvotes: 0
Reputation: 887951
We can use a grouped operation in dplyr, i.e. grouped by 'ID', extract the 'code' where the 'type' value is "large" (assuming each 'ID' has exactly one "large" row):
library(dplyr)
df <- df %>%
  group_by(ID) %>%
  mutate(code2 = code[type == 'large']) %>%
  ungroup()
-output
df
# A tibble: 7 x 4
ID type code code2
<chr> <chr> <chr> <chr>
1 A1 small B9 [0,20]
2 A1 large [0,20] [0,20]
3 A2 small B9 [20,40]
4 A2 large [20,40] [20,40]
5 A3 large [0,20] [0,20]
6 A4 small B9 [40,60]
7 A4 large [40,60] [40,60]
If there are duplicates, use match, which returns the index of the first match only (and NA when a group has no "large" row):
df <- df %>%
  group_by(ID) %>%
  mutate(code2 = code[match('large', type)]) %>%
  ungroup()
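To illustrate how the match version behaves on less tidy input, here is a minimal sketch. The data frame df2 is hypothetical (it is not from the question): A1 has two "large" rows and A5 has none. The result is held in res, also a name chosen just for this example.

```r
library(dplyr)

# Hypothetical data: A1 has two "large" rows, A5 has none
df2 <- data.frame(
  ID   = c("A1", "A1", "A1", "A5"),
  type = c("small", "large", "large", "small"),
  code = c("B9", "[0,20]", "[20,40]", "B9")
)

res <- df2 %>%
  group_by(ID) %>%
  # match() returns the position of the first "large" in the group,
  # or NA if there is none, so code2 is always a single value per group
  mutate(code2 = code[match("large", type)]) %>%
  ungroup()

res$code2
# "[0,20]" "[0,20]" "[0,20]" NA
```

With this input, the code[type == 'large'] version would be expected to fail, because for A1 it yields two values for a group of three rows, which mutate cannot recycle.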
Upvotes: 3