Reputation: 2267
I have a df
ID <- c('DX154','DX154','DX155','DX155','DX156','DX157','DX158','DX159')
Country <- c('US','US','US','US')
Level <- c('Level_1A','Level_1A','Level_1B','Level_1B','Level_1A','Level_1B','Level_1B','Level_1A')
Type_A <- c('Iphone','Iphone','Android','Android','aaa','bbb','ccc','ddd')
Type_B <- c("Iphone,Ipad,Ipod,Mac","Gmail,Android,Drive,Maps","Iphone,Ipad,Ipod,Mac","Gmail,Android,Drive,Maps","ALL","ALL","ALL","ALL")
df <- data.frame(ID ,Country ,Level ,Type_A,Type_B)
df
     ID Country    Level  Type_A                   Type_B
1 DX154      US Level_1A  Iphone     Iphone,Ipad,Ipod,Mac
2 DX154      US Level_1A  Iphone Gmail,Android,Drive,Maps
3 DX155      US Level_1B Android     Iphone,Ipad,Ipod,Mac
4 DX155      US Level_1B Android Gmail,Android,Drive,Maps
5 DX156      US Level_1A     aaa                      ALL
6 DX157      US Level_1B     bbb                      ALL
7 DX158      US Level_1B     ccc                      ALL
8 DX159      US Level_1A     ddd                      ALL
I am trying to filter this data frame by matching the column Type_A against Type_B, but I don't know how to parse the comma-separated values. Could someone please help me with this?
My Desired output is
     ID Country    Level  Type_A                   Type_B
1 DX154      US Level_1A  Iphone     Iphone,Ipad,Ipod,Mac
2 DX155      US Level_1B Android Gmail,Android,Drive,Maps
3 DX156      US Level_1A     aaa                      ALL
4 DX157      US Level_1B     bbb                      ALL
5 DX158      US Level_1B     ccc                      ALL
6 DX159      US Level_1A     ddd                      ALL
Upvotes: 1
Views: 198
Reputation: 887691
We group by 'ID' and use grepl, building the pattern by pasting the 'Type_A' values together with '|' (in this example, using Type_A[1L] should also work, as the 'Type_A' elements are duplicated within each ID; a better example would be nice), and use this to filter the rows. We also use grepl to keep those elements of 'Type_B' that have no , anywhere from the start (^) to the end ($) of the string.
library(dplyr)
df %>%
  group_by(ID) %>%
  filter(grepl(paste(Type_A, collapse = '|'), Type_B) |
           grepl('^[^,]+$', Type_B))
#     ID Country    Level  Type_A                   Type_B
#1 DX154      US Level_1A  Iphone     Iphone,Ipad,Ipod,Mac
#2 DX155      US Level_1B Android Gmail,Android,Drive,Maps
#3 DX156      US Level_1A     aaa                      ALL
#4 DX157      US Level_1B     bbb                      ALL
#5 DX158      US Level_1B     ccc                      ALL
#6 DX159      US Level_1A     ddd                      ALL
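One caveat: grepl does substring matching, so a 'Type_A' of "Mac" would also match a 'Type_B' containing "Macbook". If whole-token matching matters, a possible variant is to split 'Type_B' on the comma and test membership. This is only a sketch in base R: the helper name has_token is made up for illustration, and it assumes that "ALL" is the only value that should always be kept.

```r
# Data from the question (Country written out in full).
df <- data.frame(
  ID      = c('DX154','DX154','DX155','DX155','DX156','DX157','DX158','DX159'),
  Country = rep('US', 8),
  Level   = c('Level_1A','Level_1A','Level_1B','Level_1B',
              'Level_1A','Level_1B','Level_1B','Level_1A'),
  Type_A  = c('Iphone','Iphone','Android','Android','aaa','bbb','ccc','ddd'),
  Type_B  = c("Iphone,Ipad,Ipod,Mac", "Gmail,Android,Drive,Maps",
              "Iphone,Ipad,Ipod,Mac", "Gmail,Android,Drive,Maps",
              "ALL", "ALL", "ALL", "ALL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where `a` equals one of the
# comma-separated tokens in the corresponding element of `b`.
has_token <- function(a, b) {
  tokens <- strsplit(b, ",", fixed = TRUE)  # one character vector per row
  mapply(function(x, tok) x %in% trimws(tok), a, tokens, USE.NAMES = FALSE)
}

# Keep rows whose Type_A is a token of Type_B, plus the "ALL" rows (assumption).
keep   <- has_token(df$Type_A, df$Type_B) | df$Type_B == "ALL"
result <- df[keep, ]
rownames(result) <- NULL
result
```

On the example data this keeps the same six rows as the grepl version; the two would only differ when one 'Type_A' value is a substring of a longer token.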
Upvotes: 2
Reputation: 17279
Here's one solution. It's kind of gimmicky, but someone will be along to give you the super clever and speedy version soon. This does it row-wise, but Akrun's answer shows you how to do it by id only.
library(dplyr)
df <- df %>%
  mutate(row_id = 1:n()) %>%   # one group per row, so grepl() sees a single pattern
  group_by(row_id) %>%
  filter(grepl(Type_A, Type_B) | Type_B == "ALL")
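The per-row grouping is needed only because grepl takes a single pattern. A possible alternative, sketched here on the assumption that a plain substring match is enough, pairs each 'Type_A' with its own 'Type_B' through mapply; fixed = TRUE treats 'Type_A' as literal text rather than a regular expression. The small df below is a cut-down copy of the question's data, used only to keep the example short.

```r
library(dplyr)

# Cut-down copy of the question's data (illustration only).
df <- data.frame(
  ID     = c('DX154', 'DX154', 'DX155', 'DX155', 'DX156'),
  Type_A = c('Iphone', 'Iphone', 'Android', 'Android', 'aaa'),
  Type_B = c("Iphone,Ipad,Ipod,Mac", "Gmail,Android,Drive,Maps",
             "Iphone,Ipad,Ipod,Mac", "Gmail,Android,Drive,Maps", "ALL"),
  stringsAsFactors = FALSE
)

result <- df %>%
  filter(
    # grepl(pattern = Type_A[i], x = Type_B[i], fixed = TRUE) for each row i
    mapply(grepl, Type_A, Type_B,
           MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE) |
      Type_B == "ALL"
  )
result
```

This avoids the helper row_id column and the leftover grouping, at the cost of calling grepl once per row.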
Upvotes: 3