anderwyang

Reputation: 2421

In a string, how to remove a part with a known 'start' and 'end'?

There is a sku_name column in the data frame below. I want to remove the part that starts with 'V' and ends with 'b', but my code str_remove_all(sku_name,"^(V).*?(\\b)$") doesn't work.

Can anyone help?

library(dplyr)    # %>% and mutate()
library(stringr)  # str_remove_all()
mydata <- data.frame(sku_name=c('wk0001 V1b','123780 PRO V326b','ttttt V321b'))
mydata %>% mutate(sku_name_new=str_remove_all(sku_name,"^(V).*?(\\b)$"))

Upvotes: 2

Views: 1032

Answers (3)

AugtPelle

Reputation: 549

You were actually really close.

Fix the regex using one of the alternatives mentioned by @r2evans and it's done!

I'm sharing the code as a dplyr pipeline because that may suit you better.

library(dplyr)    # %>% and mutate()
library(stringr)  # str_remove_all()

mydata <- data.frame(sku_name=c('wk0001 V1b','123780 PRO V326b','ttttt V321b'))

mydata %>% mutate(sku_name_new=str_remove_all(sku_name,"V.*b$"))

          sku_name sku_name_new
1       wk0001 V1b      wk0001 
2 123780 PRO V326b  123780 PRO 
3      ttttt V321b       ttttt 
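
The results above keep the space that preceded the V. If that space should go too, one option is to let the pattern consume any whitespace in front of the V. A minimal sketch in base R (sub with perl = TRUE; the variable names are only illustrative, and the same pattern should behave the same way in str_remove_all):

```r
vec <- c('wk0001 V1b', '123780 PRO V326b', 'ttttt V321b')

# "\\s*" also swallows any whitespace directly before the V
trimmed <- sub("\\s*V.*b$", "", vec, perl = TRUE)
trimmed
# [1] "wk0001"     "123780 PRO" "ttttt"
```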


Upvotes: 0

Alvaro Morales

Reputation: 1925

You can do it with this pattern:

vector <- c('wk0001 V1b','123780 PRO V326b','ttttt V321b')

# If only digits can appear between the "V" and the "b".
stringr::str_remove(vector , "V\\d+b")

# If any characters can appear between the "V" and the "b": at least one, and none of them "V" or "b".
stringr::str_remove(vector , "V[^Vb]+b")
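
The patterns in this thread agree on the sample data but can diverge on messier strings. The sketch below uses a made-up input (not from the question) that contains more than one "V" and more than one "b", and base R's sub with perl = TRUE, to show what each pattern would remove:

```r
# Hypothetical input, chosen only to make the patterns disagree
x <- "V8 turbo V12b"

# Digits only between V and b: skips "V8 turb", removes "V12b"
digits_only <- sub("V\\d+b", "", x, perl = TRUE)

# Anything except V or b in between: the first match is "V8 turb"
no_v_or_b <- sub("V[^Vb]+b", "", x, perl = TRUE)

# Greedy and anchored at the end: removes from the first V onwards
greedy_end <- sub("V.*b$", "", x, perl = TRUE)

digits_only  # "V8 turbo "
no_v_or_b    # "o V12b"
greedy_end   # ""
```

Which behaviour is right depends on what real sku names can look like, so it is worth checking against a few awkward cases.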

Upvotes: 1

r2evans

Reputation: 160417

vec <- c('wk0001 V1b','123780 PRO V326b','ttttt V321b')
sub("V.*b$", "", vec)
# [1] "wk0001 "     "123780 PRO " "ttttt "     
stringr::str_remove(vec, "V.*b$")
# [1] "wk0001 "     "123780 PRO " "ttttt "     

This also works with the non-greedy "V.*?b$"; it's up to you whether that's necessary.
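
As a rough illustration of why greedy and non-greedy give the same result here: with the trailing $, the match has to reach the end of the string either way. The difference only shows up once the $ is dropped. The input below is hypothetical, and perl = TRUE is used so that the lazy quantifier is handled by PCRE:

```r
# Hypothetical string with two V...b parts
x <- "V1b V2b"

greedy_anchored <- sub("V.*b$",  "", x, perl = TRUE)  # "$" forces the match to the end
lazy_anchored   <- sub("V.*?b$", "", x, perl = TRUE)  # same result because of "$"
greedy_free     <- sub("V.*b",   "", x, perl = TRUE)  # greedy runs to the last b
lazy_free       <- sub("V.*?b",  "", x, perl = TRUE)  # lazy stops at the first b

greedy_anchored  # ""
lazy_anchored    # ""
greedy_free      # ""
lazy_free        # " V2b"
```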

BTW: \\b is a word boundary, not the literal b. (V) captures the V as a group, which is not necessary (and looks a little confusing). The real culprit is that you included ^, which anchors the match to the start of the string (as you mentioned), so the pattern can only match a string that itself starts with V, as in "Vsomethingb". The current vec strings start with "w", "1", and "t"; none of them starts with V.
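
These points can be checked directly with grepl. This is a small sketch in base R with perl = TRUE; the extra test strings "V1b" and "V123" are made up for the illustration:

```r
vec <- c('wk0001 V1b', '123780 PRO V326b', 'ttttt V321b')
original <- "^(V).*?(\\b)$"

# "^" requires the string to start with V, so none of the sku names match
matches_vec <- grepl(original, vec, perl = TRUE)

# A string that does start with V matches
matches_v1b <- grepl(original, "V1b", perl = TRUE)

# "\\b" is a word boundary, so a string with no literal b at the end matches too
matches_v123 <- grepl(original, "V123", perl = TRUE)

matches_vec   # FALSE FALSE FALSE
matches_v1b   # TRUE
matches_v123  # TRUE
```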

If you need a guide for regex, https://stackoverflow.com/a/22944075/3358272 is a good guide of many components (and links to questions/answers about them).

Upvotes: 5
