Danny Franklin

Reputation: 27

URL rewrite with HTACCESS for multiple language support

I am looking to add multiple language support to my website. Is it possible to use the .htaccess file to change something like:

  • example.com/dir/?lang=en to example.com/en/dir/

This should work with any possible directory, so if, for instance, I later made a new directory, I wouldn't need to add a rule for it. In the above example, I want the latter to be what the user types in/is in the address bar, and the former to be how it is used internally.

Upvotes: 1

Views: 1473

Answers (1)

MrWhite

Reputation: 45829

Is it possible to use the .htaccess file to change something like:

  • example.com/dir/?lang=en to example.com/en/dir/

Yes, except that you don't change the URL to example.com/en/dir/ in .htaccess. You change the URL to example.com/en/dir/ in your internal links in your application, before you change anything in .htaccess. This is the canonical URL and is "what the user types in/is in the address bar" - as you say.

You then use .htaccess to internally rewrite the request from example.com/en/dir/, back into the URL your application understands, ie. example.com/dir/?lang=en (or rather example.com/dir/index.php?lang=en - see below). This is entirely hidden from the user. The user only ever sees example.com/en/dir/ - even when they look at the HTML source.

So, we need to rewrite /<lang>/<url-path>/ to /<url-path>/?lang=<lang>, where <lang> is assumed to be a 2 character lowercase language code. If you are offering only a small selection of languages then these should be listed explicitly in the regex to avoid conflicts. We can also handle any additional query string on the original request (if this is required), eg. /<lang>/<url-path>/?<query-string> to /<url-path>/?lang=<lang>&<query-string>. Note that the rules below do not do this as written: because the substitution contains its own query string, the query string on the original request is discarded unless the QSA flag is added.

A slight complication here is that a URL of the form /dir/?lang=en is not strictly a valid endpoint and requires further rewriting. I expect you are relying on mod_dir to issue an internal subrequest for the DirectoryIndex, eg. index.php? So, really, this should be rewritten directly to /dir/index.php?lang=en - or whatever the DirectoryIndex document is defined as.

For example, in your root .htaccess file:

RewriteEngine On

# Rewrite "/<lang>/<directory>/" to "/<directory>/index.php?lang=<lang>"
RewriteCond %{DOCUMENT_ROOT}/$2/index.php -f
RewriteRule ^([a-z]{2})/(.*?)/?$ $2/index.php?lang=$1 [L]

# Rewrite "/<lang>/<directory>/<file>" to "/<directory>/<file>?lang=<lang>"
RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^([a-z]{2})/(.+) $2?lang=$1 [L]
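
If the public URLs can carry their own query string, one way to keep it is the QSA (Query String Append) flag. The following is a sketch of the same two rules with only the flags changed, not a tested drop-in; the "page" parameter in the comment is a made-up example.

```apache
RewriteEngine On

# eg. /en/dir/?page=2 -> /dir/index.php?lang=en&page=2  ("page" is hypothetical)
RewriteCond %{DOCUMENT_ROOT}/$2/index.php -f
RewriteRule ^([a-z]{2})/(.*?)/?$ $2/index.php?lang=$1 [QSA,L]

# eg. /en/dir/file.php?page=2 -> /dir/file.php?lang=en&page=2
RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^([a-z]{2})/(.+) $2?lang=$1 [QSA,L]
```

With QSA the original query string is appended after lang=<lang>, so a request such as /en/dir/?lang=de reaches the application with two lang parameters. Which one wins depends on how the application parses duplicate parameters.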

If you have just two languages (as in your example), or a small subset of known languages then change the ([a-z]{2}) subpattern to use alternation and explicitly identify each language code, eg. (en|de|ab|cd).
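
As a sketch of that change, assuming the site offers only "en" and "de" (the second code is just an illustration), the two rules might become:

```apache
RewriteEngine On

# Only the listed language codes are treated as a language prefix
RewriteCond %{DOCUMENT_ROOT}/$2/index.php -f
RewriteRule ^(en|de)/(.*?)/?$ $2/index.php?lang=$1 [L]

RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^(en|de)/(.+) $2?lang=$1 [L]
```

A request such as /js/app.js no longer matches either pattern, since "js" is not one of the listed codes.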

This does assume you don't have physical directories in the document root that consist of 2 lowercase letters (or match the specific language codes).
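
If that assumption cannot be guaranteed, one possible guard (an addition of mine, not part of the rules above) is an extra condition that skips the rewrite whenever the first path segment is a real directory in the document root. A rough sketch:

```apache
RewriteEngine On

# Skip the rewrite when "/<lang>" exists as a physical directory
RewriteCond %{DOCUMENT_ROOT}/$1 !-d
RewriteCond %{DOCUMENT_ROOT}/$2/index.php -f
RewriteRule ^([a-z]{2})/(.*?)/?$ $2/index.php?lang=$1 [L]

RewriteCond %{DOCUMENT_ROOT}/$1 !-d
RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^([a-z]{2})/(.+) $2?lang=$1 [L]
```

The trade-off is that a physical directory named after a language code (eg. /en/) would then disable that language prefix entirely, so this only helps for directories that are not also language codes.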

Only URLs where the destination directory (that contains index.php) or file exists are rewritten.

This will also rewrite requests for the document root (not explicitly stated in your examples). eg. example.com/en/ (trailing slash required here) is rewritten to /index.php?lang=en.

The regex could be made slightly more efficient if requests for directories always contain a trailing slash. In the above I've assumed the trailing slash is optional, although this does potentially create a duplicate content issue unless you resolve this in some other way (eg. rel="canonical" link element). So, in the code above, example.com/en/dir/ (trailing slash) and example.com/en/dir (no trailing slash) are both accessible and return the same resource, ie. /dir/index.php?lang=en.
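
Another way to resolve the duplicate is an external redirect that appends the missing slash. This is only a sketch, assuming the trailing-slash form is the one you treat as canonical; it would need to go before the internal rewrites above.

```apache
RewriteEngine On

# Redirect "/<lang>/<directory>" (no trailing slash) to "/<lang>/<directory>/"
# Only applies when the target directory contains index.php
RewriteCond %{DOCUMENT_ROOT}/$2/index.php -f
RewriteRule ^([a-z]{2})/(.*[^/])$ /$1/$2/ [R=301,L]
```

Requests for files (eg. /en/dir/file.php) are not affected, because the condition only succeeds when the path is a directory containing index.php. Since 301 responses are cached by browsers, it is safer to test with R=302 first.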

Upvotes: 3
