Alex Regan

Reputation: 499

Apache RewriteRule avoiding loops with regex

I know similar questions have been asked before, but I've been unable to locate one similar to my situation.

Our Joomla site is receiving requests of the form:

/news/privacy/how-2018-became-facebook%C3%A2%C2%80%C2%99s-worst-year-in-privacy-and-security

First, is this a valid URL? If not, then my interest here would only be in figuring out how to avoid a redirect loop in general with a URL involving a regex.

This appears to be due to Microsoft special characters in the title of the article. I'd like to create a RewriteRule with a regex that redirects the user to the proper URL.

RewriteRule /news/privacy/how-2018-became-facebook.*s-worst-year-in-privacy-and-security /news/privacy/how-2018-became-facebooks-worst-year-in-privacy-and-security [L,R=301]

However, the above just causes a redirect loop. I've also tried replacing each of the encoded characters with a dot, and the browser just reports that the URL is invalid. I thought the L flag was enough to stop it from processing any further rules, including itself. Perhaps I need a RewriteCond?

These URLs are also mostly generated by bots. I think they are guessing the logical URL based on the title of the article, while the actual URL is what appears in the substitution provided above. We're working on eliminating these titles with Microsoft special characters in them, but for the time being, we'd like to create an appropriate substitution. It's also a learning exercise for me.

These RewriteRules would be created based on entries from the access_log, so we'll have the exact pattern for each, but I would like some general guidelines I can follow to prevent RewriteRule loops such as the one above.

The actual URL is here:

https://linuxsecurity.com/news/privacy/how-2018-became-facebooks-worst-year-in-privacy-and-security

You can see that when the article was created, its title used a Microsoft special quote character.

Upvotes: 1

Views: 64

Answers (1)

anubhava

Reputation: 785196

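The loop happens because .* can match zero characters, so the redirect target itself matches your pattern. The L flag only ends the current pass; the 301 makes the browser send a new request, which is evaluated from scratch. Requiring at least one character with .+ avoids that. You may be able to use this rule for this redirect: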

RewriteRule ^(news/privacy/how-2018-became-facebook).+s(-worst-year-in-privacy-and-security)/?$ /$1s$2 [L,R=301]
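
To check how the two patterns behave outside Apache, here is a minimal sketch in Python. It assumes that Python's re module treats these simple constructs the same way Apache's PCRE engine does, that the question's rule runs in server context (leading slash present), and that the rule above runs in per-directory context (leading slash stripped). The helper name redirect and the constants are hypothetical, for illustration only.

```python
import re
from urllib.parse import unquote

# Rule from the question: ".*" is allowed to match zero characters.
LOOPING = re.compile(
    r"/news/privacy/how-2018-became-facebook.*s-worst-year-in-privacy-and-security"
)
# Rule from the answer: ".+" needs at least one character before the "s".
FIXED = re.compile(
    r"^(news/privacy/how-2018-became-facebook).+s(-worst-year-in-privacy-and-security)/?$"
)

TARGET = "/news/privacy/how-2018-became-facebooks-worst-year-in-privacy-and-security"
# mod_rewrite sees the URL-decoded path, so decode the escapes before matching.
BAD = unquote(
    "/news/privacy/how-2018-became-facebook%C3%A2%C2%80%C2%99"
    "s-worst-year-in-privacy-and-security"
)


def redirect(pattern, template, path):
    """Return the redirect location if the pattern matches, else None."""
    match = pattern.search(path)
    return match.expand(template) if match else None


# The question's rule matches its own target, so every redirect triggers another.
print(redirect(LOOPING, TARGET, TARGET))
# The answer's rule rewrites the bad path once...
print(redirect(FIXED, r"/\1s\2", BAD.lstrip("/")))
# ...and then leaves the corrected path alone.
print(redirect(FIXED, r"/\1s\2", TARGET.lstrip("/")))
```

Since the rules will be generated from access_log entries, the same check (does the pattern match its own substitution?) could be run on each generated rule before it is deployed.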

Upvotes: 1
