Reputation: 2253
Here is my data:
a <- data.frame(x=c('A','A','A','B','B','B'),
y=c('Yes','No','No','Yes','No','No'),
z=c(1,2,3,4,5,6))
I want to generate a new column this way:
Group by x, so all the As will be in one group and all the Bs in another.
If y=Yes, then keep the z value in the new column. If y=No, then use the z value from the row with y=Yes in the same group.
So, the new data should look like this:
x y z z1
A Yes 1 1
A No 2 1
A No 3 1
B Yes 4 4
B No 5 4
B No 6 4
I can do it this way (using dplyr):
a1 <- a %>%
filter(y=='Yes') %>%
distinct(x,y,z)
a2 <- a %>%
left_join(a1,by='x') %>%...
But this way I have to generate a1 as an intermediate. How can I do this in a single pipeline, without creating a new variable like a1?
Upvotes: 0
Views: 64
Reputation: 155
You could combine both pipelines and do the same thing in one shot by nesting the filter step inside the join, i.e.:
library(dplyr)
a <- data.frame(x=c('A','A','A','B','B','B'),
y=c('Yes','No','No','Yes','No','No'),
z=c(1,2,3,4,5,6))
# join a to its own y=='Yes' rows, then drop the redundant y.y column
a %>% left_join(a %>% filter(y=='Yes') %>% distinct(x,y,z), by='x') %>% select(-y.y)
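Because y and z exist in both tables, the join tags the duplicated columns with .x and .y. After dropping y.y, the result has the columns x, y.x, z.x and z.y, where z.y holds the value you want as z1.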
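If the suffixed columns are a nuisance, here is a minimal sketch of two variations that produce z1 directly. It assumes dplyr is loaded and that each x group has at most one y=='Yes' row, as in the sample data. The names res_join and res_group are only illustrative. The second variant swaps the join for a grouped mutate, which is a different technique from the one above.

```r
library(dplyr)

a <- data.frame(x = c('A','A','A','B','B','B'),
                y = c('Yes','No','No','Yes','No','No'),
                z = c(1,2,3,4,5,6))

# Variant 1: same self-join idea, but keep only the key and the value
# (renamed to z1) on the right-hand side, so no .x/.y suffixes appear.
res_join <- a %>%
  left_join(a %>% filter(y == 'Yes') %>% select(x, z1 = z) %>% distinct(),
            by = 'x')

# Variant 2: no join at all. Within each x group, take the z value of the
# first row where y == 'Yes' (NA if the group has no such row).
res_group <- a %>%
  group_by(x) %>%
  mutate(z1 = z[y == 'Yes'][1]) %>%
  ungroup()

print(res_join)
print(res_group)
```

With the sample data both variants should give the same z1 column. If a group could contain several 'Yes' rows with different z values, variant 1 would duplicate rows and variant 2 would silently use the first one, so check which behaviour you want.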
Upvotes: 1