Reputation: 15800
Could you please explain why my .htaccess code does not work? Whatever the right code is, I'm trying to better understand URL Rewriting and Redirecting and I would appreciate a more verbose/detailed explanation of all syntax and code. Most answers on SO simply state the answer with very little explanation.
# Hypertext Access Directives by Govind Rai
# First rewrite to HTTPS:
# Don't put www. here. If it is already there it will be included, if not
# the subsequent rule will catch it.
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# Now, rewrite any request to the wrong domain to use www.
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule .* https://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
###############last two directives that don't work#######################
# hide .html extension govie v1
RewriteCond %{THE_REQUEST} \.html$
RewriteRule ^/[^.]+\.html$ /$1 [NC,R=301,L]
#internal redirect to the right .html file
RewriteCond %{THE_REQUEST} !\.html$
RewriteRule ^/([^.]+)$ /$1.html [L]
I want to understand why the last two rules are not working. When I access a URL without the .html extension I get a 404 Page Not Found error, and a URL with the extension does not redirect to the version without it. I've posted the entire file in case there are conflicting rules.
Upvotes: 2
Views: 203
Reputation: 785196
The problem is this condition:
RewriteCond %{THE_REQUEST} \.html$
That condition will never succeed, because %{THE_REQUEST} holds the raw HTTP request line as received by Apache, for example GET /index.php?id=123 HTTP/1.1. The value always ends with the protocol version, so \.html$ can never match at the end of it.
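To make the difference concrete, here is a minimal sketch that contrasts an anchor that cannot match with one that can. The request line in the comments is only an illustrative example, not something taken from the question.

```apache
# Example value of %{THE_REQUEST} for a browser asking for /about.html:
#   GET /about.html HTTP/1.1

# Never matches: "$" anchors at the end of the whole request line,
# and that line ends with "HTTP/1.1", not ".html".
RewriteCond %{THE_REQUEST} \.html$

# Can match: ".html" followed by whitespace (before "HTTP/1.1")
# or by "?" (the start of a query string).
RewriteCond %{THE_REQUEST} \.html[\s?]
```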
You can use these rules to fix your issue:
RewriteEngine On
## add www and turn on https in same rule
# if HOST name doesn't start with www. - OR
RewriteCond %{HTTP_HOST} !^www\. [NC,OR]
# if HTTPS is off
RewriteCond %{HTTPS} off
# *capture* hostname part after www in %1
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
# redirect with https://www.%1/... to always apply https and www
RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]
## hide .html extension
# if original request is ending with .html then capture part before .html in %1
RewriteCond %{THE_REQUEST} \s/+(.+?)\.html[\s?] [NC]
# and redirect to %1 (part without .html)
RewriteRule ^ /%1 [R=302,NE,L]
# internally add .html if there a matching .html file in your web root
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+?)/?$ $1.html [L]
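One detail worth spelling out is why the redirect tests %{THE_REQUEST} rather than the current path. In .htaccess context the rule set is processed again after an internal rewrite, whereas %{THE_REQUEST} keeps the line the client originally sent. The sketch below contrasts the two approaches; the commented-out rule is a hypothetical variant shown only for comparison.

```apache
# Loop-prone variant (left commented out on purpose): once the internal
# rewrite has turned "about" into "about.html", the rules run again and this
# pattern would see "about.html", redirecting the client back to /about.
# RewriteRule ^(.+)\.html$ /$1 [R=302,L]

# Loop-safe variant, as in the rules above: the condition only looks at what
# the client originally sent, which is not changed by internal rewrites.
RewriteCond %{THE_REQUEST} \s/+(.+?)\.html[\s?] [NC]
RewriteRule ^ /%1 [R=302,NE,L]
```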
Upvotes: 2
Reputation: 10879
%{THE_REQUEST} contains the full HTTP request line sent by the browser to the server (e.g., GET /index.html HTTP/1.1), so it will never match \.html$ (since it never ends with .html). Perhaps you can try:
RewriteCond %{THE_REQUEST} \.html\sHTTP
RewriteRule ^([^.]+)\.html$ /$1 [NC,R=301,L]
RewriteCond %{REQUEST_URI} !\.html$
RewriteRule ^ %{REQUEST_URI}.html [L]
Upvotes: 0
Reputation: 42925
The issue is most likely a simple one: when rewrite rules are used inside .htaccess style files, the path that a RewriteRule pattern is matched against is relative to that directory, so it does not start with a leading slash. That means you have to modify your rule patterns slightly:
#enable rewriting
Options -MultiViews
RewriteEngine on
RewriteBase /
#internal redirect to the right .html file
RewriteCond %{REQUEST_URI} !\.html$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^.]+)$ /$1.html [END]
# hide .html extension govie v1
RewriteCond %{REQUEST_URI} \.html$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^/?([^.]+)\.html$ /$1 [NC,R=301,END]
Instead of removing the leading slash completely, I personally prefer adding a question mark after it, which makes it optional. This allows the same rules to be used inside the HTTP server's host configuration without modification.
I also added the well-known pair of conditions that check that the request does not address a physically existing file or folder. This is typically desired, but you obviously have to decide that yourself.
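As a small illustration of the leading-slash point, the sketch below shows three pattern variants for a request to /about. The path and the target about.html are made up for the example, and the first two rules are commented out because they only serve as a comparison.

```apache
# Request URL path: /about
# In a .htaccess file in the document root the pattern is matched against
# "about"; in the server or virtual host configuration it is matched
# against "/about".

# Matches only in server or virtual host context:
# RewriteRule ^/about$ /about.html [L]

# Matches only in .htaccess (per-directory) context:
# RewriteRule ^about$ /about.html [L]

# Matches in both contexts, because the leading slash is optional:
RewriteRule ^/?about$ /about.html [L]
```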
A general hint: you should always prefer to place such rules inside the HTTP server's real host configuration. These .htaccess style files are notoriously error-prone, hard to debug, and they slow down the server, often without reason. They are only provided for situations where you do not have access to that configuration (read: really cheap hosting providers) or where your application needs to write its own rewriting rules (an obvious security nightmare).
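A rough sketch of what that could look like follows. It assumes a document root of /var/www/example, which is a made-up path, and it reuses the internal rewrite rule from one of the other answers. Treat it as a starting point to adapt and test, not as a drop-in configuration.

```apache
# Goes into the virtual host or main server configuration, not into .htaccess.
# "/var/www/example" is a hypothetical document root.
<Directory "/var/www/example">
    # Stop Apache from looking for .htaccess files in this tree
    AllowOverride None
    Options -MultiViews

    RewriteEngine On
    # Internally map /about to /about.html when that file exists
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^(.+?)/?$ /$1.html [L]
</Directory>
```

Because the rules sit inside a Directory block, they are still evaluated in per-directory context, so the patterns are matched without the leading slash, just as in a .htaccess file.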
Upvotes: 0