Dhon Collera

Reputation: 1

htaccess rewrite condition: 404 all query strings only on the index page

I am trying to protect the main page because my Google console report shows query-string URLs on it, like this example:

https://example.com/?s=something.g

I would like to return a 404 for any query string only on the main page "example.com/", while everything else, such as the JavaScript/CSS files, folders and wp-admin, can still use query strings.

This is not allowed (only on the main page):

https://example.com/?anything=something
https://example.com/?anythingnew=something&anotherone=something
https://example.com/index.php?anything=something

But these URLs should be allowed (everything else should work):

https://example.com/something.js?anything=something
https://example.com/folder/?anything=something
https://example.com/folder/anotherfolder/anyfile.php?anything=something

I tried this:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?]*)\?
RewriteRule (.*) /$1? [R=404,L]

It appears that all query strings are disallowed, including those on the files and folders inside.

I also tried this:

RewriteCond %{QUERY_STRING} .+
RewriteRule (.*) /$1? [R=404,L]

Same thing, nothing worked. The rule should only apply to the main page. Thanks in advance.

Upvotes: 0

Views: 105

Answers (1)

Patrick Janser

Reputation: 4273

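You were not far from the solution. Your condition on the query string was fine, but the pattern (.*) matches every path, so the 404 applied to every URL. Restrict the RewriteRule pattern to the homepage instead: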

RewriteCond %{QUERY_STRING} ^.+$
RewriteRule ^(?:index\.php)?$ - [R=404,L]

Explained

  1. The RewriteRule will take the path (without the query string) as input. So, if you want to apply this rule only for the homepage (with or without index.php) then you have to write a regular expression such as ^(?:index\.php)?$ :

    • ^ matches the beginning of the string, meaning "it should start with" instead of just "it should contain".
    • $ matches the end of the string, meaning "it should finish with".
    • (?:) is a non-capturing group. A plain () would be a capturing group, which fills the backreference $1. We don't need to capture this part to reuse it in the rewritten URL, because the substitution is just -, which means "nothing to change", and the R=404 flag generates the 404 error. The question mark after the group makes it optional. I've put index\.php inside it so that it may or may not be present in the URL. The dot has to be escaped because an unescaped . means "any char" in a regular expression pattern.

    You might also see someone write ^/?(?:index\.php)?$ so that the pattern matches with or without a leading slash. In a .htaccess file (per-directory context) Apache strips the directory prefix, including the leading slash, before testing the RewriteRule pattern, so the extra /? is not needed here. It only matters if the rule is moved into the server or virtual host configuration, where the path still starts with a slash.

  2. The RewriteCond is only evaluated once the RewriteRule pattern has matched. Here, we want to test whether the query string is non-empty. This is easily done by matching any character one or more times with .+. It would work with or without the ^ and $ around it; I prefer adding them to show that the full query string must not be empty.
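
Put together, the two points above could look like the following sketch. It assumes that the rule lives in the .htaccess file of the document root, that mod_rewrite is enabled, and that the rule is placed before any other rewrite rules in the file (for example a CMS front-controller block, which the wp-admin mention in the question hints at but does not confirm). The comments are only illustrative.

```apache
# Assumption: this is the .htaccess file in the document root.
RewriteEngine On

# Condition: the query string contains at least one character.
RewriteCond %{QUERY_STRING} ^.+$
# Pattern: empty path (the homepage) or exactly "index.php".
# "-" leaves the URL untouched; R=404 answers with "404 Not Found".
RewriteRule ^(?:index\.php)?$ - [R=404,L]
```

With this in place, https://example.com/?anything=something and https://example.com/index.php?anything=something should return a 404, while https://example.com/folder/?anything=something is left alone because its path (folder/) does not match the pattern.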

Upvotes: 2
