Daniel Tan

Reputation: 181

Cleaning misspelled names using stri_replace_all_regex

I have a character vector of names, some of which are misspelled.

library(stringi)
fruits <- c("apple", "two pears", "three bananas", "appl3")

I would like to use the stri_replace_all_regex function from the stringi package to find any string beginning with 'a' and replace it with the string 'apple'.

However, I think my use of regular expressions does not capture what I am trying to achieve.

Where I am stuck:

stri_replace_all_regex(fruits,"^a","apple")
[1] "applepple"     "two pears"     "three bananas" "appleppl3"

stri_replace_all_regex(fruits,"^a(pple)?$","apple")
[1] "apple"         "two pears"     "three bananas" "appl3"

Upvotes: 0

Views: 189

Answers (2)

zyurnaidi

Reputation: 2273

You can do this in base R with sub or gsub:

fruits <- c("apple", "two pears", "three bananas", "appl3")
gsub("^a.*", "apple", fruits)

[1] "apple"         "two pears"     "three bananas" "apple"
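
As a small sketch of why either function works here: the pattern is anchored with ^ and the .* consumes the rest of the string, so there is at most one match per element. The names res_sub and res_gsub are just illustrative.

```r
fruits <- c("apple", "two pears", "three bananas", "appl3")

# "^a.*" can match at most once per string (it is anchored at the start
# and consumes everything after the 'a'), so sub() and gsub() agree here.
res_sub  <- sub("^a.*", "apple", fruits)
res_gsub <- gsub("^a.*", "apple", fruits)

identical(res_sub, res_gsub)
# [1] TRUE
```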

Upvotes: 0

Lorenzo Rossi

Reputation: 1481

Your pattern "^a" matches only the leading 'a', so only that one character is replaced with 'apple'. That is why "apple" becomes "applepple".

What you want is to match the entire string that starts with 'a' and replace all of it with 'apple':

stri_replace_all_regex(fruits, "^a.*", "apple")
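
For reference, here is a sketch of what that call returns on the vector from the question, plus one possible alternative that skips the regex and uses stringi's stri_startswith_fixed to pick which elements to overwrite. The variable names fixed_regex and cleaned are only illustrative.

```r
library(stringi)

fruits <- c("apple", "two pears", "three bananas", "appl3")

# Match the whole string, not just the first character
fixed_regex <- stri_replace_all_regex(fruits, "^a.*", "apple")
fixed_regex
# [1] "apple"         "two pears"     "three bananas" "apple"

# Alternative without a regex: overwrite every element that starts with "a"
cleaned <- fruits
cleaned[stri_startswith_fixed(cleaned, "a")] <- "apple"
cleaned
# [1] "apple"         "two pears"     "three bananas" "apple"
```

Keep in mind that both versions replace anything beginning with 'a', so a correctly spelled name such as "apricot" would also become "apple".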

Upvotes: 1
