adamdehaven

Reputation: 5920

.htaccess Allow All from Specific User Agent

I have a website I am developing that is also going to be pulled into a web app. I have the following code in my .htaccess file to prevent access from ANYONE that is not on my allowed IP:

Order deny,allow
Deny from all
AuthName "Restricted Area - Authorization Required" 
AuthUserFile /home/content/html/.htpasswd 
AuthType Basic
Require valid-user
Allow from 12.34.567.89 
Satisfy Any

QUESTION: I would like to add an Allow from rule that will ALSO allow a specific HTTP user agent access to the site.

I found this code, which redirects when the user agent does not match:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !=myuseragent
RewriteRule ^files/.*$ / [R=302,L]

But I can't seem to figure out how to turn this into an Allow from rule. Help?

UPDATE

I found the code below to block specific user agents... I would instead like to say "if NOT myuseragent, then block."

<IfModule mod_rewrite.c>
SetEnvIfNoCase ^User-Agent$ .*(libwww-perl|aesop_com_spiderman) HTTP_SAFE_BADBOT
Deny from env=HTTP_SAFE_BADBOT
</ifModule>

Upvotes: 16

Views: 50039

Answers (6)

Edward Ruchevits

Reputation: 6696

    SetEnvIfNoCase User-Agent .*google.* search_robot
    SetEnvIfNoCase User-Agent .*yahoo.* search_robot
    SetEnvIfNoCase User-Agent .*bot.* search_robot
    SetEnvIfNoCase User-Agent .*ask.* search_robot
     
    Order Deny,Allow
    Deny from All
    Allow from env=search_robot

Htaccess SetEnvIf and SetEnvIfNoCase Examples
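
Adapted to the setup in the question, this might look like the sketch below (Apache 2.2 syntax, or 2.4 with mod_access_compat loaded). The variable name allowed_agent is made up for illustration, myuseragent is the placeholder from the question, and 203.0.113.10 stands in for the real allowed IP:

```apache
# Flag requests whose User-Agent contains "myuseragent" (case-insensitive regex match).
# "allowed_agent" is a hypothetical variable name.
SetEnvIfNoCase User-Agent myuseragent allowed_agent

Order deny,allow
Deny from all
AuthName "Restricted Area - Authorization Required"
AuthUserFile /home/content/html/.htpasswd
AuthType Basic
Require valid-user
# Placeholder address; substitute the real allowed IP.
Allow from 203.0.113.10
# Also let in any request flagged above.
Allow from env=allowed_agent
Satisfy Any
```

Because of Satisfy Any, a request gets through if it comes from the allowed IP, carries the matching User-Agent, or supplies a valid login. Keep in mind that the User-Agent header is easy to spoof, so this is a convenience rather than real protection.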

Upvotes: 22

sys0dm1n

Reputation: 879

If you don't want to use mod_rewrite, with Apache 2.4 you can use something similar to this:

<Location />
    AuthType Basic
    AuthName "Enter Login and Password to Enter"
    AuthUserFile /home/content/html/.htpasswd
    <If "%{HTTP_USER_AGENT} == 'myuseragent'">
        Require all granted
    </If>
    <Else>
        Require valid-user
        Require ip 12.34.567.89
    </Else>
</Location>
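
One caveat: <Location> is only valid in the server or virtual host config, not in .htaccess, so in an .htaccess file the wrapper has to go. As an alternative to <If>/<Else>, a sketch using <RequireAny> with Require expr (both provided by mod_authz_core in 2.4) could look like this. The address 203.0.113.10 is a placeholder, not taken from the question:

```apache
AuthType Basic
AuthName "Enter Login and Password to Enter"
AuthUserFile /home/content/html/.htpasswd

# Access is granted as soon as any one of these succeeds.
<RequireAny>
    # Exact match on the placeholder user agent from the question.
    Require expr "%{HTTP_USER_AGENT} == 'myuseragent'"
    # Placeholder address; substitute the real allowed IP.
    Require ip 203.0.113.10
    # Everyone else has to log in.
    Require valid-user
</RequireAny>
```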

Upvotes: 4

Nick O

Reputation: 1

I used a version like sys0dm1n's answer.

This is my .htaccess file. It allows Google Sheets to access a directory on my server.

AuthType Basic
AuthName "Password Protected Area"
AuthUserFile /var/tools/.htpasswd
<If "%{HTTP_USER_AGENT} == 'Mozilla/5.0 (compatible; GoogleDocs; apps-spreadsheets; +http://docs.google.com)'">
Require all granted
</If>
<Else>
Require valid-user
</Else>

Go to your access.log file in your apache folder to see which User-Agent you need to allow or block.
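
The exact == comparison above breaks as soon as the client changes any character of its User-Agent string. A looser variant, assuming it is acceptable to match on a distinctive fragment, would use the regex operator =~ of the Apache 2.4 expression syntax. The fragment GoogleDocs is just what I would pick from the string above:

```apache
AuthType Basic
AuthName "Password Protected Area"
AuthUserFile /var/tools/.htpasswd

# Regex match on a fragment instead of comparing the whole string.
<If "%{HTTP_USER_AGENT} =~ /GoogleDocs/">
    Require all granted
</If>
<Else>
    Require valid-user
</Else>
```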

Upvotes: 0

Alexandre Mazel

Reputation: 2558

I just want to allow ONE SPECIFIC user agent rather than trying to block all

Here's my config to allow only wget:

# Unanchored regex: matches any User-Agent containing "Wget"
SetEnvIf User-Agent Wget wget

Order deny,allow
Deny from all
Allow from env=wget
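
On Apache 2.4 without mod_access_compat, Order/Deny/Allow are not available. A rough equivalent, assuming mod_setenvif and mod_authz_core are loaded, would be:

```apache
# Set the "wget" variable when the User-Agent contains "Wget".
SetEnvIf User-Agent Wget wget

# Grant access only to requests that have the variable set.
Require env wget
```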

Upvotes: 7

Igal Zeifman

Reputation: 1146

I just want to allow ONE SPECIFIC user agent rather than trying to block all

Hi

What you need to consider here is that some bots (especially the larger, more prominent ones) use several user agents to access your site. For example, Googlebot (the crawler) can use all of these different user agents:

Googlebot-Image/1.0 
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
GoogleProducer 
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Google-Site-Verification/1.0
Google-Test
Googlebot/2.1 (+http://www.google.com/bot.html) 

and I'm not even talking about Google Plus and the many other bots used by Google.

Same goes for Yahoo and others.
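
If you do go the allow-list route, one way to cover several signatures at once is a single regex with alternation. This is only a sketch: the variable name google_agent is made up, and the pattern covers just the strings listed above (Googlebot also matches Googlebot-Image and Googlebot-Mobile):

```apache
# One variable for every listed Google signature (case-insensitive).
# "google_agent" is a hypothetical variable name.
SetEnvIfNoCase User-Agent "(Googlebot|Google-Site-Verification|GoogleProducer|Google-Test)" google_agent

Order deny,allow
Deny from all
Allow from env=google_agent
```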

Just this week our company (Incapsula) launched Botopedia.org, a community-sourced bot directory. It's 100% free and open to all, and you can use it to find a complete user-agent list for all the bots you'll want to allow.

If needed, it also has reverse IP functionality for bot verification because, as our recent study of fake Googlebot visits has shown, some spammers and even cyber-attackers will use legitimate bot signatures to ease their way into your site.

Hope this helps.

Upvotes: -2

InternetSeriousBusiness

Reputation: 2635

Allow from and Rewrite* are directives from two different Apache modules.

The first is from mod_authz_host and the others are from mod_rewrite.

You can use mod_rewrite to do what you want:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !=myuseragent
RewriteRule .* - [F,L]
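
If the allowed IP from the question should still get through, a second condition can exempt it. This is a sketch that assumes the rewrite rules are the only access control in play (no Basic auth alongside). The address 203.0.113.10 is a placeholder, and I switched to a case-insensitive substring match, which is looser than the exact != comparison above:

```apache
RewriteEngine on
# Do not block the allowed IP (placeholder address; "=" is an exact string compare).
RewriteCond %{REMOTE_ADDR} !=203.0.113.10
# Block when the User-Agent does not contain "myuseragent" (case-insensitive).
RewriteCond %{HTTP_USER_AGENT} !myuseragent [NC]
# Consecutive RewriteCond lines are ANDed; [F] returns 403 Forbidden.
RewriteRule ^ - [F]
```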

Upvotes: 5
