Replace string with empty string except certain word using R

Question

Good day, I want to use gsub to remove everything from each string except the word INDIVIDUAL or BUSINESS, then store the result with mutate in a new column called business_type. I've tried many approaches, but none of them worked. Thanks in advance.

text <- c("|Name:James Indiana|type:INDIVIDUAL|Id::G123456789&M|Location:Indonesia|", "|Name:James Bond|type:BUSINESS|Id::G&987654321M|Location:Indonesia|")

The expected output looks like this:

business_type    
INDIVIDUAL    
BUSINESS

I am using

mutate(business_type = gsub("[^(\\bINDIVIDUAL\\b)(\\bBUSINESS\\b)]+"," ",x))

This removes most of the other text, but it also leaves behind some uppercase letters from the other fields.

mutate(business_type = gsub("^/(?!INDIVIDUAL$)(?!BUSINESS$)[a-z0-9A-Z:&|]+=$"," ",x))

does not work either. I also tried the regex ^/(?!ignoreme)([a-z0-9]+)$, but it did not work.

Wiktor Stribiżew · Accepted Answer

You can use

mutate(business_type = gsub("\\b(?:INDIVIDUAL|BUSINESS)\\b(*SKIP)(*F)|(?s).", "", x, perl=TRUE))

See the regex demo.

Regex details:

\b(?:INDIVIDUAL|BUSINESS)\b - match either an INDIVIDUAL or BUSINESS as whole words and
(*SKIP)(*F) - skip the match and go on matching from the failure location
| - or
(?s). - match any char including line break chars ((?s) is a singleline flag that makes . match any chars in a PCRE regex).
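
As a quick check of the pattern described above, here is a minimal base R sketch run on the sample vector from the question. It assumes an empty-string replacement so that only the kept word remains, and it writes \b as \\b because R string literals need the backslash escaped. The data frame df and its column x are illustrative names, not part of the original post.

```r
# Sample data from the question
text <- c("|Name:James Indiana|type:INDIVIDUAL|Id::G123456789&M|Location:Indonesia|",
          "|Name:James Bond|type:BUSINESS|Id::G&987654321M|Location:Indonesia|")

# Keep INDIVIDUAL/BUSINESS as whole words, delete every other character.
# perl = TRUE is required: (*SKIP)(*F) and (?s) are PCRE features.
pattern <- "\\b(?:INDIVIDUAL|BUSINESS)\\b(*SKIP)(*F)|(?s)."
business_type <- gsub(pattern, "", text, perl = TRUE)

# Hypothetical data frame standing in for the asker's data
df <- data.frame(x = text, stringsAsFactors = FALSE)
df$business_type <- gsub(pattern, "", df$x, perl = TRUE)

print(business_type)
# [1] "INDIVIDUAL" "BUSINESS"
```

With dplyr loaded, the same gsub call should work unchanged inside mutate(), as in the answer. Note that the match is case-sensitive, which is why "Indiana" and "Indonesia" are removed along with the rest of the text.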
