Christo

Reputation: 2370

Identifying web crawlers

Is the following property reliable enough to identify search engine web crawlers?

Request.Browser.Crawler

My site creates a new guest user on a page request if the visitor hasn't been to the site before, and I'm getting a lot more hits than my analytics suggest.

I use the property above to create guest accounts only for legitimate users, but I think some crawlers are getting through.

Perhaps I could use the HttpRequest UserAgent property to identify them. If so, can someone please suggest a list of current crawler names? I believe the Bing bot, for instance, is called bingbot, as mentioned here.

Request.UserAgent

UPDATE:

I know for sure that some crawlers are not being identified by Request.Browser.Crawler, because requests from 65.52.110.143, which I believe is a bingbot, are a serial offender.

Upvotes: 2

Views: 1980

Answers (1)

Anirudh Ramanathan

Reputation: 46728

Request.Browser.Crawler is sadly out of date, so it misses many current crawlers.

You could manually add detection of other user agents as bots. Use the browser element in a browser definition file rather than browserCaps, which has been deprecated since .NET 2.0.

Example:

<browsers>
    <browser id="Googlebot" parentID="Mozilla">
        <identification>
            <userAgent match="^Googlebot(\-Image)?/(?'version'(?'major'\d+)(?'minor'\.\d+)).*" />
        </identification>
        <capabilities>
            <capability name="crawler" value="true" />
        </capabilities>
    </browser>
    .
    .
    .
</browsers>

This must be saved with a .browser extension under the App_Browsers directory in your application.
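
Since the question mentions bingbot specifically, a similar entry for it might look like the sketch below. It assumes Bing's crawler sends a user agent that starts with Mozilla and contains a token such as bingbot/2.0. The id "Bingbot" and the regex are illustrative choices, not taken from an official list, so check them against the user agents in your own logs. The pattern is left unanchored because the token appears in the middle of the string rather than at the start.

```xml
<browsers>
    <!-- Hypothetical entry: the id and the regex are assumptions for illustration -->
    <browser id="Bingbot" parentID="Mozilla">
        <identification>
            <!-- Matches e.g. "bingbot/2.0" anywhere in the user agent string -->
            <userAgent match="bingbot/(?'version'(?'major'\d+)(?'minor'\.\d+))" />
        </identification>
        <capabilities>
            <!-- Makes Request.Browser.Crawler return true for matching requests -->
            <capability name="crawler" value="true" />
        </capabilities>
    </browser>
</browsers>
```

With an entry like this in place, the existing Request.Browser.Crawler check should not need to change, because the capability is set by the browser definition itself.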

(List of Regexes to Match)

Upvotes: 2
