Reputation: 1753
Hi, I need a regex that extracts numbers and alphanumeric tokens (numbers + letters) if present in a string.
Ex: "4596 2B FC JAIN BHAWAN" --> I want "4596 2B" as my output
> gsub("\\S([a-zA-Z])+\\S", "", "4596 2B FC JAIN BHAWAN")
[1] "4596 2B FC  "
I do not understand why the above regex did not replace FC with ""
Any help is appreciated. Thanks
Upvotes: 2
Views: 1373
Reputation: 16090
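You are using \\S (capital), which matches any non-whitespace character. Your pattern therefore needs at least three consecutive non-space characters to match, so the two-letter FC can never be matched. Use the lowercase \\s instead, and only use it once (because your string doesn't end with a space):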
gsub("\\s([a-zA-Z])+", "", "4596 2B FC JAIN BHAWAN")
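To see why FC survives the pattern from the question, it can help to list what that pattern actually matches. A minimal sketch using base R's regmatches and gregexpr (the names x, pat and hits are only illustrative):

```r
x <- "4596 2B FC JAIN BHAWAN"
pat <- "\\S([a-zA-Z])+\\S"

# \S, then one or more letters, then \S again: a match needs at least
# three consecutive non-space characters, so "2B" and "FC" cannot match.
hits <- regmatches(x, gregexpr(pat, x))[[1]]
hits
# [1] "JAIN"   "BHAWAN"
```

Only the longer words are matched and removed, which is why FC is left behind.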
Using Simon's suggestion allows us to see the woods for the trees:
gsub("\\b[a-zA-Z]+\\b", "", "aa 4592 2B FC JAIN BHAWAN")
[1] " 4592 2B"
though I might need some help to get rid of the initial space. (I could just nest gsub calls, but that seems like cheating.)
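One possible way around the leftover spaces is to trim the result, or to extract the wanted tokens directly instead of deleting the unwanted ones. This is only a sketch: it assumes the wanted tokens are the ones that start with a digit (inferred from the example, not stated in the question), trimws needs R 3.2.0 or later, and perl = TRUE is passed only to get PCRE's handling of \\b. The names cleaned, tokens and joined are illustrative.

```r
x <- "aa 4592 2B FC JAIN BHAWAN"

# Option 1: delete letter-only words, then trim whitespace at both ends.
cleaned <- trimws(gsub("\\b[a-zA-Z]+\\b", "", x, perl = TRUE))

# Option 2: pull out tokens that start with a digit and rejoin them.
tokens <- regmatches(x, gregexpr("\\b[0-9]+[a-zA-Z]*\\b", x, perl = TRUE))[[1]]
joined <- paste(tokens, collapse = " ")

cleaned
# [1] "4592 2B"
joined
# [1] "4592 2B"
```

Option 1 is still a wrapper around the deletion approach, so it only tidies the ends of the string. Option 2 never produces stray spaces, because it builds the output from the matched tokens.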
Upvotes: 5