Reputation: 91
I am aware that the issue of the dollar sign "$" in regex (here: in both PHP and JavaScript) has been discussed numerous times before. Yes, I know that I need to add a backslash "\" in front of it (depending on the string processing, even two), so the correct way to match a dollar sign is "\$". ... Been there, done that, works fine.
But here's my new problem: Dollar signs "\$" next to word boundaries marked with "\b". ... My following examples can easily be reproduced on e.g. regexpal.com.
Let's start with the following text to search in:
Dollar 50
Dollars 50
$ 50
USD 50
My regex should find either "USD", "Dollar", or "$". Easy enough: Let's try
(USD|Dollar|\$)
Success: It finds the "$", the "USD", and both "Dollar" occurrences, including in "Dollars".
But let's try to skip the "Dollars" by adding word boundaries after the multiple choice:
(USD|Dollar|\$)\b
And this is trouble: "USD" is matched, "Dollar" is matched, "Dollars" is rejected ... But the single, properly backslashed (or escaped) "$" is rejected as well, although that worked just a second before.
It's not related to the multiple choice inside the brackets: Try just
\$
vs.
\$\b
and it's just the same: The first one matches the dollar sign, the second one doesn't.
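For anyone who prefers to reproduce this outside regexpal.com, here is a minimal JavaScript sketch of the behaviour described so far. The variable names are only illustrative, and the sample text is assumed to be the four lines above joined with newlines.

```javascript
// Sample text from above, one entry per line (assumed newline-separated).
const text = "Dollar 50\nDollars 50\n$ 50\nUSD 50";

// Without the trailing \b: every line yields a match, including "Dollars".
const withoutBoundary = text.match(/(USD|Dollar|\$)/g);

// With the trailing \b: "Dollars" is rejected, but so is "$".
const withBoundary = text.match(/(USD|Dollar|\$)\b/g);

console.log(withoutBoundary); // [ 'Dollar', 'Dollar', '$', 'USD' ]
console.log(withBoundary);    // [ 'Dollar', 'USD' ]

// The same effect without the alternation.
const plainDollar = /\$/.test("$ 50");      // true
const dollarBoundary = /\$\b/.test("$ 50"); // false
console.log(plainDollar, dollarBoundary);
```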
Another finding:
(USD|Dollar|\$) \b
with a blank " " between the ")" and the "\b" actually works. But that workaround might not be viable under all circumstances (in case there should be a non-whitespace word boundary).
It seems that the escaped dollar sign refuses to be found when word boundaries are involved.
I'd love to hear your suggestions to solve this mystery. -- Thanks a lot in advance!
Upvotes: 9
Views: 3133
Reputation: 28040
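It doesn't match because there isn't a word boundary immediately after the $. A \b only matches between a word character (a letter, digit, or underscore) and a non-word character or the edge of the string, and both "$" and the space that follows it are non-word characters. There would be a boundary, however, if a word started immediately after the $ - for example, $Millions will match.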
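As a rough illustration of where \b does and does not find a boundary after the dollar sign, here is a small JavaScript sketch. The test strings other than "$ 50" and "$Millions" are my own additions for comparison.

```javascript
// \b only matches between a word character ([A-Za-z0-9_]) and a
// non-word character (or the start/end of the string).
const dollarThenBoundary = /\$\b/;

// "$" followed by a space: both are non-word characters, so no boundary.
const beforeSpace = dollarThenBoundary.test("$ 50");

// "$" followed by a letter: non-word then word character, so a boundary exists.
const beforeWord = dollarThenBoundary.test("$Millions");

// Digits count as word characters too (illustrative extra case).
const beforeDigit = dollarThenBoundary.test("$50");

// "$" at the very end of the string: no word character on either side.
const atEnd = dollarThenBoundary.test("$");

console.log(beforeSpace, beforeWord, beforeDigit, atEnd); // false true true false
```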
What you probably want to do is to make the \b apply only to those cases where you really do want to match a word boundary - for example
(USD\b|Dollar\b|\$)
This will insist on there being a word boundary after "USD" and after "Dollar", but not after "$".
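A quick JavaScript check of this pattern against the sample text from the question might look like the sketch below. The variable names are illustrative, and the text is assumed to be newline-separated.

```javascript
// Sample text from the question (assumed newline-separated).
const sampleText = "Dollar 50\nDollars 50\n$ 50\nUSD 50";

// \b is attached only to the alternatives that end in a word character.
const fixedPattern = /(USD\b|Dollar\b|\$)/g;

const fixedMatches = sampleText.match(fixedPattern);
console.log(fixedMatches); // [ 'Dollar', '$', 'USD' ]
```

If repeating \b feels noisy, grouping the word alternatives as (?:(?:USD|Dollar)\b|\$) should behave the same way.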
Upvotes: 6