Cyan0121

Reputation: 31

Replace words with stringi

I am trying to use stringi to replace certain words using stri_replace_all_regex, but I have run into an issue when one pattern is part of another. In the example below I am fixing misspellings of 'triangle', but it seems to get confused because 'tri' is part of 'trian', which is part of 'triangle', and the result comes out like 'trianglegle'. I'm not that familiar with the stri_replace functions; is there some argument I am missing? Thanks for your help.

library(stringi)

stri_replace_all_regex("The quick brown tri jumped over the lazy trian.",
                       c("tri", "trian", "fox"),
                       c("triangle", "triangle", "bear"),
                       vectorize_all = FALSE)

## [1] "The quick brown trianglegle jumped over the lazy triangleglean."

Upvotes: 1

Views: 210

Answers (2)

IRTFM

Reputation: 263481

If you do not want partial matches, terminate some (or maybe even all) of your pattern arguments with a space, and include the space in the replacement as well:

stri_replace_all_regex("The quick brown tri jumped over the lazy trian.",
                       pattern = "tri ", replacement = "triangle ",
                       vectorize_all = FALSE)

stri_replace_all_regex("The quick brown tri jumped over the lazy trian.",
       c("tri ", "trian", "fox "), c("triangle ",  "triangle", "bear "), 
          vectorize_all=TRUE)
[1] "The quick brown triangle jumped over the lazy trian."
[2] "The quick brown tri jumped over the lazy triangle."  
[3] "The quick brown tri jumped over the lazy trian."     
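To make the difference between the two vectorize_all settings concrete, here is a minimal sketch on the same sentence. The variable names (x, step1, step2, sequential) are only illustrative. With vectorize_all = FALSE the patterns are applied one after another to a single string, so a later pattern can still match inside text produced by an earlier replacement:

```r
library(stringi)

x <- "The quick brown tri jumped over the lazy trian."

# Apply the two patterns by hand, one after the other.
step1 <- stri_replace_all_regex(x, "tri ", "triangle ")
step2 <- stri_replace_all_regex(step1, "trian", "triangle")

# vectorize_all = FALSE does the same thing in a single call.
sequential <- stri_replace_all_regex(x,
                                     c("tri ", "trian"),
                                     c("triangle ", "triangle"),
                                     vectorize_all = FALSE)

identical(sequential, step2)
## [1] TRUE
sequential
## [1] "The quick brown trianglegle jumped over the lazy triangle."
```

So when the replacements are chained, the trailing space protects 'tri ' from matching inside 'trian', but 'trian' can still match inside an already inserted 'triangle'.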

Upvotes: 0

HubertL

Reputation: 19544

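You may want to match whole words only, so that the patterns no longer overlap. \\W matches any non-word character; capturing it with (\\W) and writing $1 in the replacement puts that character back. You can try something like this: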

stri_replace_all_regex("The quick brown tri jumped over the lazy trian.",
                   paste0(c("trian", "tri",  "fox"), "(\\W)"), 
                   paste0(c("triangle","triangle", "bear"),"$1"),
                   vectorize_all = FALSE)
[1] "The quick brown triangle jumped over the lazy triangle."
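A possible variation, offered as a sketch rather than as part of the answer above: the regex engine used by stringi (ICU) also supports the word-boundary assertion \\b, which consumes no characters. Assuming the misspellings always appear as whole words, this avoids the capture group and also matches a word at the very end of the string, where there is no following character for (\\W) to match. The names x, words and fixes are only illustrative.

```r
library(stringi)

x     <- "The quick brown tri jumped over the lazy trian."
words <- c("tri", "trian", "fox")
fixes <- c("triangle", "triangle", "bear")

# Wrap each word in \\b ... \\b so it only matches as a whole word;
# the order of the patterns then no longer matters.
result <- stri_replace_all_regex(x,
                                 paste0("\\b", words, "\\b"),
                                 fixes,
                                 vectorize_all = FALSE)
result
## [1] "The quick brown triangle jumped over the lazy triangle."

# A misspelling at the end of the string is handled too.
stri_replace_all_regex("lazy tri", "\\btri\\b", "triangle")
## [1] "lazy triangle"
```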

Upvotes: 3
