Reputation: 191
I have a category column whose values are separated by ";". For example:
value <- "A > B > C; A > B > D; A > B > C > C1"
It means:
The current product belongs to category "A > B > C", to category "A > B > D" and to category "A > B > C > C1"
If a category is already contained in another one, it should be removed. So the goal is:
expectedResult <- "A > B > D; A > B > C > C1"
because "A > B > C > C1" contains "A > B > C".
How can I solve this?
Note: I know that there are hundreds of questions that seem similar. But I just couldn't find a solution.
Upvotes: 3
Views: 62
Reputation: 101335
Perhaps you can try the code below
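v <- unlist(strsplit(value, ";\\s+"))
# Column j flags the entries of v that contain v[j] as a proper substring;
# a category is kept when no other entry contains it (column sum of 0).
idx <- colSums(`diag<-`(sapply(v, function(x) {
  p <- gsub(x, "", v, fixed = TRUE)
  p != v & nchar(p) > 0
}), FALSE)) == 0
paste0(names(idx)[idx], collapse = "; ")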
which gives
[1] "A > B > D; A > B > C > C1"
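One caveat: plain substring matching also treats "A > B > C" as contained in "A > B > C1", because the text matches even though the last level differs. If that matters for your data, a rough sketch of a level-aware variant in base R could look like the following. It assumes that "contained" means "is an ancestor path of another category", and the names `drop_contained`, `sep` and `level_sep` are made up for illustration:

```r
# Hypothetical helper: drop every category that is an ancestor path of
# another category in the same string.
drop_contained <- function(value, sep = ";", level_sep = " > ") {
  # Split into individual category paths and trim surrounding whitespace.
  v <- unique(trimws(strsplit(value, sep, fixed = TRUE)[[1]]))
  # A path is redundant if another path starts with it followed by level_sep.
  redundant <- vapply(
    v,
    function(x) any(startsWith(v, paste0(x, level_sep))),
    logical(1)
  )
  paste(v[!redundant], collapse = "; ")
}

drop_contained("A > B > C; A > B > D; A > B > C > C1")
# [1] "A > B > D; A > B > C > C1"
```

Appending the level separator before the prefix test is what keeps "A > B > C" and "A > B > C1" apart.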
Upvotes: 0
Reputation: 5429
This ought to work:
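library(stringr)  # provides str_detect() and fixed()

value <- "A > B > C; A > B > D; A > B > C > C1"
els <- strsplit(value, "; ")[[1]]
my_reducer <- function(a, b) {
  # drop the categories kept so far that occur inside the new category b
  v <- str_detect(b, fixed(a))
  a <- a[!v]
  append(a, b)
}
paste(Reduce(my_reducer, els), collapse = "; ")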
Output:
> paste(Reduce(my_reducer, els), collapse = "; ")
[1] "A > B > D; A > B > C > C1"
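Note that the reducer only removes earlier categories that are contained in a later one, so the result depends on the input order: with "A > B > C > C1" listed before "A > B > C", both would be kept. One possible workaround, sketched here under the assumption that the output order does not matter, is to sort the categories by length first. `reducer_base` is a hypothetical base-R stand-in for the reducer above, so this version does not need stringr:

```r
value <- "A > B > C > C1; A > B > D; A > B > C"
els <- strsplit(value, "; ", fixed = TRUE)[[1]]

# Process shorter categories first, so a contained category is always seen
# before the longer category that contains it.
els <- els[order(nchar(els))]

# Hypothetical base-R version of the reducer: drop the categories kept so
# far that occur as a substring of the new category b, then add b.
reducer_base <- function(a, b) {
  contained <- vapply(a, function(p) grepl(p, b, fixed = TRUE), logical(1))
  c(a[!contained], b)
}

result <- paste(Reduce(reducer_base, els), collapse = "; ")
result
# [1] "A > B > D; A > B > C > C1"
```

The categories come back sorted by length rather than in their original order.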
Upvotes: 1