Reputation: 15
I would like to remove all numbers and commas from a string except any number that is immediately preceded by $ and immediately followed by a comma.
For example, I have:
str = "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"
I would like to obtain the following:
"$100-$1,000 $1001-$10,000 $10,001-$100,000"
I have tried to use gsub with a negative lookbehind:
new_str = gsub("(?<!\\$)[0-9]*,", "", str)
However, this gives the following error message:
Error in gsub("(?<!\\$)[0-9]*,", "", str) : invalid regular expression '(<!\$)[0-9]*,', reason 'Invalid regexp'
It seems that the negative lookbehind is incorrectly coded, but I can't seem to figure out why. Any help is much appreciated!
Upvotes: 0
Views: 246
Reputation: 269854
1) This gives the desired answer in the case of the sample string:
gsub("\\d+, ", "", str)
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
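The regular expression \d+, followed by a space matches a run of digits, a comma, and the space after it. Amounts such as $1,000 are left alone because their commas are followed by a digit rather than a space.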
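On the error in the question: by default gsub uses the TRE engine, which does not support lookbehind; passing perl = TRUE switches to PCRE, which does. Note that the original pattern would still strip the comma inside $1,000 even then, because [0-9]* can match nothing and the comma itself is not preceded by $. Below is a minimal sketch of a lookbehind that does work on the sample string. The pattern is my own variant, not the one from the question, and it assumes the unwanted numbers are always followed by a comma and a space:

```r
str <- "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"

# perl = TRUE selects PCRE, which supports lookbehind (the default engine does not).
# (?<![$\\d,]) : the digit run must not directly follow "$", a digit, or a comma
# \\d+,\\s     : a run of digits, a comma, and one whitespace character
new_str <- gsub("(?<![$\\d,])\\d+,\\s", "", str, perl = TRUE)
new_str
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
```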
2) Here is a second approach:
library(gsubfn)
paste(strapplyc(str, "(\\$\\S+)", simplify = c), collapse = " ")
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
The regular expression (\$\S+) captures each run that starts with $ and continues up to the next whitespace character.
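If installing gsubfn is not an option, the same extract-and-paste idea can probably be done in base R. This is a sketch using regmatches and gregexpr rather than strapplyc, and it assumes each amount range contains no whitespace, as in the sample string:

```r
str <- "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"

# Find every run that starts with "$" and continues up to the next whitespace.
matches <- gregexpr("\\$\\S+", str, perl = TRUE)

# regmatches returns a list with one character vector per input string.
pieces <- regmatches(str, matches)[[1]]

# Join the extracted ranges back together with single spaces.
result <- paste(pieces, collapse = " ")
result
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
```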
Upvotes: 1
Reputation: 7948
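You could use this pattern:
(\$[0-9,-]+)|\d+,\s
and replace each match with \1, so that dollar amounts are put back unchanged and the bare "number, " prefixes are dropped.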
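In R this might look like the sketch below. The backslashes are doubled for the R string literal, and perl = TRUE is my assumption for consistent handling of \d and \s. When the second alternative matches, group 1 is empty, so the match is replaced with nothing:

```r
str <- "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"

# First alternative: a dollar amount, captured as group 1 and written back as-is.
# Second alternative: digits, a comma, and one whitespace character, which is removed
# because group 1 is empty for that match.
result <- gsub("(\\$[0-9,-]+)|\\d+,\\s", "\\1", str, perl = TRUE)
result
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
```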
Upvotes: 0