Mr. Llama

Reputation: 20899

Regex failing when pattern involves dollar sign ($)

I'm running into a bit of an issue when it comes to matching subpatterns that involve the dollar sign. For example, consider the following chunk of text:

Regular Price: $20.50       Final Price: $15.20
Regular Price: $18.99       Final Price: $2.25
Regular Price: $11.22       Final Price: $33.44
Regular Price: $55.66       Final Price: $77.88

I was attempting to match the Regular/Final price sets with the following regex, but it simply wasn't working (no matches at all):
preg_match_all("/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U", $data, $matches);

I escaped the dollar sign, so what gives?

Upvotes: 26

Views: 67465

Answers (2)

shmeeps

Reputation: 7833

I know this question is a little old, but I found it while looking for the answer to the same problem. Since it sits at the top of the search results, I figured it would be worth explaining a simple alternative, and why this happens with double-quoted strings ( " ).

The regular expression I was using contained plenty of single quote characters ( ' ) in it, so I wasn't too keen on wrapping the expression with them, since I didn't want to escape all of those.

My solution was to "double escape" the dollar sign. In your example, the pattern would look like this:

"/Regular Price: \\\$(\d+\.\d{2}).*Final Price: \\\$(\d+\.\d{2})/U";

Note that each dollar sign is now preceded by three backslashes (\\\).

Basically, there are two "levels" of interpretation: PHP's string parser and the regex engine. With a single backslash, PHP reads \$ as an escaped literal dollar sign (as opposed to the start of a variable), so it consumes the backslash, as outlined in Mark's answer, and passes a bare $ to the regex engine. The regex engine then treats that $ as an end-of-string anchor, not as a character to match.

By "double escaping" the dollar sign, we make PHP read \\\$ as \\ followed by \$. The first pair is an escaped backslash and the second is an escaped dollar sign, so PHP produces just \$. The regex engine therefore receives the pattern

/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U

and interprets \$ as a literal $ rather than as an anchor. It is important to keep both layers of interpretation in mind, since PHP and the regex engine each have their own escaping rules and some characters need several backslashes. Matching a literal backslash from a double-quoted string, for example, takes four of them (\\\\).
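To make the PHP layer visible on its own, here is a minimal sketch that compares the strings PHP produces before any regex function is involved. The variable names are made up for illustration.

```php
<?php
// What PHP hands on for each way of writing the dollar sign.
$single = '\$';    // single quotes: backslash and dollar sign are both kept
$one    = "\$";    // double quotes, one backslash: PHP consumes the backslash
$three  = "\\\$";  // double quotes, three backslashes: \\ becomes \ and \$ becomes $

var_dump($single === $three); // the same two characters: a backslash and a dollar sign
var_dump($one === '$');       // only a bare dollar sign is left
var_dump(strlen($three));     // length of the backslash plus the dollar sign
```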

Single quote strings don't have this problem, since to use a variable $foo in a string, we would have to write

'Hello '. $foo .'!';

instead of

"Hello $foo!";

as we can in double-quoted strings. Single-quoted strings never interpolate variables (they have to be concatenated, as in the example above), so a $ inside them is plain text. PHP also leaves \$ untouched there, because it is not an escape sequence in a single-quoted string, so we can get away with just

'/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U'

which sends \$ to the regex engine, exactly as \\\$ does in a double-quoted string.

Which style you use is a matter of personal preference, or of whichever is easier for the pattern at hand.

TL;DR: Use \$ for single-quoted strings like '/Hello \$bob/is', and \\\$ for double quoted strings like "/Hello \\\$bob/is".
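As a quick check of that rule, the sketch below runs a shortened version of the pattern against one sample line in all three spellings. The shortened pattern and the variable names are assumptions made for illustration, not part of the original question.

```php
<?php
$line = 'Regular Price: $20.50       Final Price: $15.20';

// Single quotes, one backslash: the regex engine receives \$ (a literal dollar sign).
$single = preg_match('/Final Price: \$(\d+\.\d{2})/', $line, $m1);

// Double quotes, one backslash: PHP strips the backslash, so the regex engine
// receives a bare $ (an anchor) and the digits after it can never match.
$oneSlash = preg_match("/Final Price: \$(\d+\.\d{2})/", $line, $m2);

// Double quotes, three backslashes: the regex engine receives \$ again.
$threeSlash = preg_match("/Final Price: \\\$(\d+\.\d{2})/", $line, $m3);

var_dump($single, $oneSlash, $threeSlash); // number of matches for each spelling
var_dump($m3[1]);                          // the captured final price
```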

Upvotes: 11

Mark Byers

Reputation: 838416

Inside a double quoted string the backslash is treated as an escape character for the $. The backslash is removed by the PHP parser even before the preg_match_all function sees it:

$r = "/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U";
var_dump($r);

Output (ideone):

"/Regular Price: $(\d+\.\d{2}).*Final Price: $(\d+\.\d{2})/U"
                 ^                           ^
              the backslashes are no longer there

To fix this use a single quoted string instead of a double quoted string:

preg_match_all('/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U',
               $data,
               $matches);
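For completeness, here is a self-contained sketch of the fixed call run against two of the sample lines from the question. The nowdoc used to build $data is only an assumption about how the input might be supplied; it keeps the dollar signs literal without any escaping.

```php
<?php
// Nowdoc (quoted identifier): no variable interpolation, so $ needs no escaping.
$data = <<<'TXT'
Regular Price: $20.50       Final Price: $15.20
Regular Price: $18.99       Final Price: $2.25
TXT;

$count = preg_match_all(
    '/Regular Price: \$(\d+\.\d{2}).*Final Price: \$(\d+\.\d{2})/U',
    $data,
    $matches
);

// With the default PREG_PATTERN_ORDER, $matches[1] holds every regular price
// and $matches[2] holds every final price.
var_dump($count);
print_r($matches[1]);
print_r($matches[2]);
```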

See it working online: ideone

Upvotes: 48
