I have written a regular expression to be used to match all of the general pages on my web application. The regular expression was working absolutely fine when the application was running on IIS with a web.config file, but I have since moved the site to a Linux server and am now running under Apache.
The strings I am trying to match are as follows:
section 1/
section 1/section 2/
section 1/section 2/section 3/
I want each match to be captured by the pattern, with the following limitations:
This is what I have tried:
^([^(?!_)\/]+)\/?([^(?!_)\/]+)?\/?([^\/]+)?\/?$
Whilst the above works in a regex tester, it causes my server to produce an 'Internal Server Error' when I put it in my .htaccess file. It did not cause any error when I ran it in my web.config.
Can anyone suggest a new pattern to use?
Here are a few examples of other requests:
test/testing
SOME/REQUEST/to a page
anything can/be matched/
unless it has/an underscore_/in the/
first_section/or_second_section/
Please note that the 'Internal Server Error' is not being caused by other errors in my .htaccess; everything works fine until I uncomment my rewrite rule with this particular regex.
Just to be clearer, these are further examples of rewrites that I would like my regex to match:
http://example.com/property/ -> http://example.com/property.php
http://example.com/property/manage/ -> http://example.com/property.php?request=manage
http://example.com/property/edit/1234/ -> http://example.com/property.php?request=edit&id=1234
http://example.com/_property/ Does NOT Match
http://example.com/property/_edit/ Does NOT Match
The following is working but I don't like that I have specified the allowed characters:
^([a-z0-9\s]+)\/?([a-z0-9\s]+)?\/?([a-z0-9\s\_]+)?\/?$
The problem with your regex is that it matches its own results.
For example:
http://example.com/property/
Will be rewritten to:
http://example.com/property.php
Which will be matched again by the rewrite engine and rewritten to:
http://example.com/property.php.php
Which will be rewritten to:
http://example.com/property.php.php.php
and so on.
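The loop can be reproduced outside Apache. The following is only a rough sketch in Python, not how mod_rewrite is implemented: it assumes the rule's substitution is simply $1.php (the question does not show the actual rule), and the names rewrite_once and simulate are hypothetical helpers.

```python
import re

# The pattern from the question, copied unchanged.
PATTERN = re.compile(r"^([^(?!_)\/]+)\/?([^(?!_)\/]+)?\/?([^\/]+)?\/?$")


def rewrite_once(path):
    """Apply the assumed substitution '$1.php', or return None if no match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return match.group(1) + ".php"


def simulate(path, max_rounds=3):
    """Feed each result back into the rule, as a re-run of the rules would."""
    history = [path]
    for _ in range(max_rounds):
        rewritten = rewrite_once(history[-1])
        if rewritten is None:
            break  # the rule no longer matches, so the loop ends
        history.append(rewritten)
    return history


for step in simulate("property/"):
    print(step)
```

With this pattern the loop only stops because of max_rounds: every rewritten path still matches, which is the recursion described above.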
Solution:
Make the trailing slash mandatory, forbid a character in the URL, or add an underscore (_) to the rewritten URL.
Using your regex:
^([^(?!_)\/]+)\/?([^(?!_)\/]+)?\/?([^\/]+)?\/?$
Change it to:
^([^(?!_)\/]+)\/?([^(?!_)\/]+)?\/?([^\/]+)?\/$
As in the comments, I have proposed the following one as solving the issue:
^(?!(?:\.\.\/?)+)([^_\/]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/?$
It does not solve the issue on its own, but it is a simplified version.
To solve this, apply the solutions described above to my regex:
Disallowing the characters . and ? in the first section:
^(?!(?:\.\.\/?)+)([^_\/\.\?]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/?$
Making / mandatory as the last character:
^(?!(?:\.\.\/?)+)([^_\/]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/$
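As a quick sanity check, the two variants above can be tried against a few of the sample paths from the question. This is a sketch using Python's re module rather than Apache's PCRE engine, so behaviour may differ in edge cases; the names NO_DOT, TRAILING_SLASH and groups are made up for illustration.

```python
import re

# Variant A: '.' and '?' are forbidden in the first section.
NO_DOT = re.compile(
    r"^(?!(?:\.\.\/?)+)([^_\/\.\?]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/?$"
)
# Variant B: the trailing slash is mandatory.
TRAILING_SLASH = re.compile(
    r"^(?!(?:\.\.\/?)+)([^_\/]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/$"
)


def groups(pattern, path):
    """Return the captured sections, or None when the path does not match."""
    match = pattern.match(path)
    return None if match is None else match.groups()


for path in ("property/", "property/edit/1234/", "property.php",
             "_property/", "property/_edit/"):
    print(path, groups(NO_DOT, path), groups(TRAILING_SLASH, path))
```

Both variants reject property.php, so the rewritten target is not picked up again. Note that variant B also rejects a request without the trailing slash, such as property.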
A full rewrite appending _ to the file path, in a non-obstructive, hacky way (untested):
RewriteEngine on
RewriteRule ^(?!(?:\.\.\/?)+)([^_\/]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/?$ $1.php/_?request=$2&id=$3
This has the side effect of setting $_SERVER['PATH_INFO'] to /_.
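To see why the appended _ breaks the loop, the rule can be imitated in Python. This is only an approximation under stated assumptions: the pattern is tested against the path alone, unmatched groups are treated as empty strings, and apply_rule is a hypothetical helper, not part of Apache.

```python
import re

# The pattern from the RewriteRule above, copied unchanged.
RULE = re.compile(r"^(?!(?:\.\.\/?)+)([^_\/]+)(?:\/([^_\/]+)(?:\/([^_\/]+))?)?\/?$")


def apply_rule(path):
    """Imitate '$1.php/_?request=$2&id=$3'; return (path, query) or None."""
    match = RULE.match(path)
    if match is None:
        return None
    # Assumption: groups that did not take part in the match become "".
    first, second, third = (group or "" for group in match.groups())
    return f"{first}.php/_", f"request={second}&id={third}"


print(apply_rule("property/edit/1234/"))
print(apply_rule("property.php/_"))
```

The second call returns None: the rewritten path contains a section that starts with _, which the pattern refuses, so the rule does not fire a second time.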
This solution was untested!
If anything is inaccurate, please leave a comment.
Edit:
The reasoning behind this is that we must break the matching loops.
The engine tests http://example.com/property/ against the rule and succeeds, rewriting it to http://example.com/property.php.
If the rewritten URL matches too, every subsequent pass over the rules will match as well, and you have recursion.
The idea is that the rule must not match http://example.com/property.php, so that the engine continues evaluating the other rules.