P. Wormer

Reputation: 387

Removal of trailing dot in RewriteRule of .htaccess

The following .htaccess rewrite rule, used in a RESTful database application:

RewriteRule ^author/([A-z.]+)/([A-z]+)$ get_author.php?first_name=$1&last_name=$2

applied to

http://localhost:8080/API/author/J./Doe

removes the period from "J.", and the resulting name "J Doe" is not in the database (while "J. Doe" is). The rule removes only a trailing period; for example, "J.O" is passed through correctly as "J.O". I use XAMPP 7.0.6 with Apache under Windows 10. How can I keep the trailing dot on the initial?

Update: Apparently my question wasn't clear, so I'll give it another try.

  1. The regexp in the RewriteRule above is supposed to assign "J." to the variable $1. Instead it assigns "J" to $1; in other words, the trailing dot is dropped. The regex also assigns "Doe" to the variable $2, which is as expected and correct. The variables $1 (with the incorrect value "J") and $2 (with the correct value "Doe") are used in a database search. This search fails because of the missing dot: the database contains "J. Doe", but not "J Doe".

  2. When a dot is not trailing, as in "J.O", the variable $1 gets the correct value "J.O". In other words, the regex does not remove all dots, only the trailing ones.

My question is: how can I tell (the rewrite engine of) .htaccess to apply the regexp correctly?

For comparison, the following piece of JS code does what I want:

var regexp = "^author/([A-z.]+)/([A-z]+)$";
var result = "author/J./Doe".match(regexp);
alert(result[1] + " " + result[2]);

Upvotes: 2

Views: 391

Answers (1)

AyrA

Reputation: 863

This is apparently (still) a "feature": https://bz.apache.org/bugzilla/show_bug.cgi?id=20036

Problem: Apache strips all trailing dots and spaces from a path segment unless the segment is exactly "." or "..".

I ran into the problem when I tried to map a URL of the form get/a/b/c to get.php?param1=a&param2=b&param3=c, where c can legitimately have trailing dots. The issue is not actually related to mod_rewrite; it happens with regular URLs too. Example URL of a file that is definitely not named this way: Example favicon file. Other servers don't do this (example: Stack Overflow favicon file), which turns this behavior into a way to detect an Apache server when the HTTP Server header is stripped.

To work around this problem, I still map the URL using mod_rewrite, but then in the PHP script, I use the exact same regex to manually map the parameters:

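// Match against the raw request URI, where trailing dots are still present.
if(preg_match('#/get/([^/]+)/([^/]+)/(.+)$#',$_SERVER['REQUEST_URI'],$matches)){
    // $matches[0] is the whole match; the capture groups start at index 1.
    $param1=$matches[1];
    $param2=$matches[2];
    $param3=$matches[3];
}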

Instead of PATH_INFO, I use REQUEST_URI because it is left untouched. This means that if you absolutely need to pass trailing dots in a path string to a backend behind Apache, your best bet right now is to write an intermediate script that extracts the proper parameters and then makes the proxy request for you.
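Applied to the author URL from the question, the same idea might look roughly like the sketch below. The function name parse_author_uri is hypothetical, and the pattern assumes the path ends in /author/<first>/<last> as in the question's RewriteRule. Because REQUEST_URI includes the query string and is not URL-decoded, the sketch strips the former and decodes the captured segments:

```php
<?php
// Hypothetical helper (not part of the original answer): extracts the
// first and last name from the raw request URI, e.g. $_SERVER['REQUEST_URI'].
function parse_author_uri(string $requestUri)
{
    // REQUEST_URI includes the query string, so keep only the path part.
    $path = explode('?', $requestUri, 2)[0];

    // Assumed route shape: .../author/<first>/<last>
    if (!preg_match('#/author/([^/]+)/([^/]+)$#', $path, $m)) {
        return null; // not an author URL
    }

    // REQUEST_URI is not URL-decoded, so decode each captured segment.
    return [
        'first_name' => rawurldecode($m[1]),
        'last_name'  => rawurldecode($m[2]),
    ];
}
```

In get_author.php this could be called as parse_author_uri($_SERVER['REQUEST_URI']) in place of reading $_GET['first_name'], so that the trailing dot in "J." is preserved.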

Upvotes: 1
