user2268997

Reputation: 1391

Backreferencing regex in apache

I was setting up the Symfony framework for my system and configuring the .htaccess file to reroute requests when I came across these two lines:

RewriteCond %{REQUEST_URI}::$1 ^(/.+)/(.*)::\2$
RewriteRule ^(.*) - [E=BASE:%1]

This is supposed to give us the rewrite base for the subsequent rewrites. How is $1 processed here, given that no RewriteRule precedes it? Some clarification on how this code works would be greatly appreciated.

Upvotes: 2

Views: 571

Answers (1)

Jon Lin

Reputation: 143946

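The $1 refers to the first capture group of the RewriteRule that the condition is attached to, which is the rule on the line after it. It's a little confusing because of the order in which mod_rewrite processes rules. Given your rule: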

RewriteCond %{REQUEST_URI}::$1 ^(/.+)/(.*)::\2$
RewriteRule ^(.*) - [E=BASE:%1]

This is what mod_rewrite does:

  1. I have a URI, let's apply this rule
  2. I look at the RewriteRule line and check the pattern: the URI matches ^(.*), ok good, let's continue
  3. Now let's check the conditions
  4. The %{REQUEST_URI}::$1 string gets expanded to the request URI, then ::, then my first capture group, which was set in step 2
  5. The pattern in the condition matches, so apply the target of the rule
  6. The target is -, so pass the URI through unchanged and apply the flags

So the rule gets "halfway" applied first, that's how the $1 capture group gets set, then the conditions get checked.
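The ordering above can be sketched in Python, whose `re` module handles the `\2` backreference in this pattern the same way. This is only a simulation under stated assumptions, not mod_rewrite itself: `apply_rule` and its parameters are hypothetical names, and `per_dir_path` stands for the path the rule pattern sees inside the directory that holds the .htaccess file.

```python
import re

def apply_rule(per_dir_path, request_uri):
    """Hypothetical helper that mimics the evaluation order of the two lines."""
    # Step 2: the RewriteRule pattern is matched first, which sets $1.
    rule_match = re.match(r"^(.*)", per_dir_path)
    if rule_match is None:
        return None
    dollar_1 = rule_match.group(1)

    # Step 4: the condition's test string is expanded using $1 from the rule.
    test_string = f"{request_uri}::{dollar_1}"

    # Step 5: the condition pattern is matched; its first group becomes %1.
    cond_match = re.match(r"^(/.+)/(.*)::\2$", test_string)
    if cond_match is None:
        return None

    # Step 6: the target is "-", so only the flag E=BASE:%1 has an effect.
    return {"BASE": cond_match.group(1)}

env = apply_rule("foo/bar.html", "/abc/foo/bar.html")
print(env)
```

The point of the sketch is that `dollar_1` has to exist before `test_string` can be built, which is why the rule pattern is evaluated before its conditions.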

Note that if you had:

RewriteCond %{REQUEST_URI}::$1 ^(/.+)/(.*)::\2$
RewriteRule ^.* - [E=BASE:%1]

$1 would be blank because the pattern in the rule doesn't have a capture group.
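A quick way to see the consequence is to test the condition pattern in Python against both test strings. This is a sketch that assumes, as stated above, that a missing $1 expands to an empty string; the request URI and the captured path are made-up example values.

```python
import re

COND_PATTERN = re.compile(r"^(/.+)/(.*)::\2$")

request_uri = "/abc/foo/bar.html"  # hypothetical request

# With RewriteRule ^(.*) the rule captures the per-directory path into $1.
with_group = f"{request_uri}::foo/bar.html"
# With RewriteRule ^.* there is no group, so $1 expands to nothing.
without_group = f"{request_uri}::"

matched_with = COND_PATTERN.match(with_group) is not None
matched_without = COND_PATTERN.match(without_group) is not None
print(matched_with, matched_without)
```

For this URI the condition no longer matches once $1 is empty, so the rule would not set BASE.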

The \2 is an inline backreference, so it refers to the second group of the condition's own pattern:

this----v
^(/.+)/(.*)::\2$
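To see the inline backreference on its own, here is a small Python check with the same pattern. The input strings are invented for illustration.

```python
import re

pattern = re.compile(r"^(/.+)/(.*)::\2$")

# The text after "::" repeats the tail of the path, so \2 can match it.
same_tail = pattern.match("/abc/foo::foo")
# The text after "::" differs from every possible tail, so the match fails.
different_tail = pattern.match("/abc/foo::bar")

print(same_tail.group(2))
print(different_tail)
```

The second call returns `None` because \2 must repeat exactly what the second group captured.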

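As an example of how that condition works, say the %{REQUEST_URI} is /abc/foo/bar.html and the rules are in the /abc/ directory. In a per-directory (.htaccess) context the rule pattern is matched against the path with the directory prefix stripped, so $1 is foo/bar.html. That means the condition string: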

%{REQUEST_URI}::$1

is

/abc/foo/bar.html::foo/bar.html

The \2 backreference then requires the part after the :: to be the same text that the second group captured just before the ::

this bit ----------v__________v
/abc/foo/bar.html::foo/bar.html
     ^-----------^____must match this grouping

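And thus, what is left is the first grouping (/.+), which is /abc. The slash that follows it is consumed by the literal / between the two groups, so it is not part of the capture. That value is the relative URI base, and the E=BASE:%1 flag stores it in the environment variable "BASE".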
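The same condition can be tried against a few request URIs in Python to see which base falls out. This is a minimal sketch: `rewrite_base` is a hypothetical helper, and the second and third URIs are invented examples rather than anything from the question.

```python
import re

COND_PATTERN = re.compile(r"^(/.+)/(.*)::\2$")

def rewrite_base(request_uri, rule_capture):
    """Return what %1 would hold for the condition, or None if it fails."""
    match = COND_PATTERN.match(f"{request_uri}::{rule_capture}")
    return match.group(1) if match else None

# (request URI, per-directory path captured by the rule as $1)
examples = [
    ("/abc/foo/bar.html", "foo/bar.html"),
    ("/abc/index.php", "index.php"),
    ("/sites/app/web/a/b", "a/b"),
]

bases = [rewrite_base(uri, capture) for uri, capture in examples]
for (uri, _), base in zip(examples, bases):
    print(uri, "->", base)
```

In each case the captured base is the directory prefix without a trailing slash, because the literal / in the pattern sits outside the first group.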

Upvotes: 1
