GerZah

Reputation: 91

Dollar Sign "\$" in Regular Expressions with word boundaries "\b" (PHP / JavaScript)

I am aware that the dollar sign "$" in regex (here: in both PHP and JavaScript) has been discussed numerous times before. Yes, I know that I need to put a backslash "\" in front of it (or even two, depending on the string processing), so the correct way to match a dollar sign is "\$". Been there, done that, works fine.


But here's my new problem: dollar signs "\$" next to word boundaries marked with "\b". The following examples can easily be reproduced on, e.g., regexpal.com.

Let's start with the following text to search in:

Dollar 50

Dollars 50

$ 50

USD 50

My regex should find either "USD", "Dollar", or "$". Easy enough: Let's try

(USD|Dollar|\$)

Success: It finds the "$", the "USD", and both "Dollar" occurrences, including in "Dollars".

But let's try to skip the "Dollars" by adding a word boundary after the alternation:

(USD|Dollar|\$)\b

And this is trouble: "USD" is matched, "Dollar" is matched, "Dollars" is rejected ... But the single, properly backslashed (or escaped) "$" is rejected as well, although that worked just a second before.

It's not related to the alternation inside the parentheses: Try just

\$

vs.

\$\b

and it's just the same: The first one matches the dollar sign, the second one doesn't.
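The same behaviour can be reproduced outside regexpal. Here is a minimal JavaScript sketch that runs both patterns over the four sample lines from above; the names `lines` and `matchingLines` are only illustrative.

```javascript
// Sample lines from the question; the variable names are illustrative only.
const lines = ["Dollar 50", "Dollars 50", "$ 50", "USD 50"];

// Return the lines in which the pattern matches somewhere.
function matchingLines(pattern) {
  return lines.filter((line) => pattern.test(line));
}

const withoutBoundary = matchingLines(/(USD|Dollar|\$)/);
const withBoundary = matchingLines(/(USD|Dollar|\$)\b/);

console.log(withoutBoundary); // all four lines
console.log(withBoundary);    // only "Dollar 50" and "USD 50"
```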


Another finding:

(USD|Dollar|\$) \b

with a blank " " between the ")" and the "\b" actually works. But that workaround might not be viable under all circumstances (in case there should be a non-whitespace word boundary).
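To make the limitation concrete, here is a rough JavaScript sketch of the blank workaround. The extra input "USD-50" is a made-up example of a non-whitespace word boundary and is not part of the original test text.

```javascript
// Workaround pattern: a literal blank, then a word boundary.
const spaced = /(USD|Dollar|\$) \b/;

const spacedDollar = spaced.test("Dollar 50");   // true
const spacedDollars = spaced.test("Dollars 50"); // false, "Dollar" is followed by "s"
const spacedSign = spaced.test("$ 50");          // true, the boundary sits between " " and "5"
const spacedUsd = spaced.test("USD 50");         // true

// Hypothetical input: the boundary after "USD" is a hyphen, not a blank,
// so the workaround misses it although USD\b alone would match.
const spacedHyphen = spaced.test("USD-50");      // false
const plainHyphen = /USD\b/.test("USD-50");      // true

console.log(spacedDollar, spacedDollars, spacedSign, spacedUsd, spacedHyphen, plainHyphen);
```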


It seems that the escaped dollar sign refuses to be found when word boundaries are involved.

I'd love to hear your suggestions to solve this mystery. -- Thanks a lot in advance!

Upvotes: 9

Views: 3133

Answers (1)

psmears

Reputation: 28040

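It doesn't match because in "$ 50" there isn't a word boundary immediately after the $: \b only matches between a word character (letter, digit, or underscore) and a non-word character, and both the "$" and the blank after it are non-word characters. There would be a boundary, however, if a word started immediately after the $. For example,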

$Millions

will match.
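One way to see where the boundaries actually fall is to mark each \b position in the text. The following is only a sketch in JavaScript; `markBoundaries` is a hypothetical helper name, not a library function.

```javascript
// Mark every \b position in a string with "|" so the boundaries become visible.
// markBoundaries is a hypothetical helper, not part of any library.
function markBoundaries(text) {
  return text.replace(/\b/g, "|");
}

const marked50 = markBoundaries("$ 50");            // "$ |50|"
const markedMillions = markBoundaries("$Millions"); // "$|Millions|"

// \$\b needs a boundary directly after the dollar sign.
const matches50 = /\$\b/.test("$ 50");             // false
const matchesMillions = /\$\b/.test("$Millions");  // true

console.log(marked50, markedMillions, matches50, matchesMillions);
```

In "$ 50" the first boundary appears only after the blank, which is why \$\b fails there but succeeds on "$Millions".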

What you probably want to do is to make the \b apply only to those cases where you really do want to match a word boundary - for example

(USD\b|Dollar\b|\$)

This will insist on there being a word boundary after "USD" and after "Dollar", but not after "$".
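As a quick check, here is a small JavaScript sketch that applies this pattern to the four sample lines from the question; the variable names are illustrative only.

```javascript
// The four sample lines from the question.
const sampleLines = ["Dollar 50", "Dollars 50", "$ 50", "USD 50"];

// \b is attached only to the alternatives that end in a word character.
const currency = /(USD\b|Dollar\b|\$)/;

const accepted = sampleLines.filter((line) => currency.test(line));
console.log(accepted); // "Dollar 50", "$ 50", "USD 50"; "Dollars 50" is rejected
```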

Upvotes: 6
