Harry

Reputation: 54939

Express.js (Node.js): serve different data to bots (Google, etc.) and human traffic

I want to determine whether an incoming request comes from a bot (e.g. Google, Bing) or a human, and serve different data to each: for example, JSON data for client-side JavaScript to construct the site, or preprocessed HTML.

Using expressjs, is there an easy way to do this? Thanks.

Upvotes: 5

Views: 1974

Answers (3)

Checking the User-Agent request header or the requested MIME type, as suggested in the other answers, is not reliable, because any client can set the User-Agent and other headers at will.

The most reliable and secure approach is to check by IP.

So I developed an npm package that does exactly that. At startup it loads all known IP ranges used by Google bots and crawlers into memory, which keeps the middleware check very fast.

const express = require('express')
const isGCrawler = require('express-is-googlecrawler')

const app = express()
app.use(isGCrawler)

app.get('/', (req, res) => {
  res.send(res.locals.isGoogleCrawler) // Boolean
})

app.listen(3000)
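If you would rather not depend on a package, an IP-based check can also be sketched with Node's built-in dns module: do a reverse lookup on the caller's IP, check that the hostname belongs to Google, then resolve that hostname forward and confirm it maps back to the same IP. This is only a minimal sketch, not how the package above works internally. The function name isVerifiedGooglebot, the injectable resolver parameter, and the suffix list are my own assumptions; check Google's current crawler verification guidance before relying on the suffixes.

```javascript
const dns = require('dns').promises;

// Assumed hostname suffixes for Google crawlers; verify against Google's docs.
const GOOGLE_SUFFIXES = ['.googlebot.com', '.google.com'];

// Resolves to true only when the IP's reverse-DNS name is a Google hostname
// AND that hostname resolves forward to the same IP.
// `resolver` is injectable (hypothetical parameter) so the logic can be
// exercised without network access; it defaults to Node's dns.promises.
async function isVerifiedGooglebot(ip, resolver = dns) {
  try {
    const hostnames = await resolver.reverse(ip);
    for (const host of hostnames) {
      if (!GOOGLE_SUFFIXES.some((suffix) => host.endsWith(suffix))) continue;
      const addresses = await resolver.lookup(host, { all: true });
      if (addresses.some((entry) => entry.address === ip)) return true;
    }
  } catch (err) {
    // Lookup failures (no PTR record, timeouts) count as "not verified".
  }
  return false;
}
```

In an Express middleware you could call it with req.ip and store the result on res.locals. DNS lookups are slow compared with an in-memory range check, so you would probably want to cache results per IP.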

Upvotes: 0

Ryan Doherty

Reputation: 38740

You can check req.header('User-Agent') for 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'. If it matches, the request identifies itself as Googlebot and you can send it different data.

http://www.google.com/support/webmasters/bin/answer.py?answer=1061943

How to get headers http://expressjs.com/4x/api.html#req.get
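For illustration, here is a rough sketch of how that check might look as Express-style middleware. The names BOT_PATTERN, isBotUserAgent, and detectBot are hypothetical, and the token list is just an example you would extend yourself. It matches on a crawler token rather than the full string, since the exact User-Agent can vary between crawler variants.

```javascript
// Hypothetical list of crawler tokens; extend it for the bots you care about.
const BOT_PATTERN = /googlebot|bingbot/i;

// Returns true when the User-Agent string contains a known crawler token.
function isBotUserAgent(userAgent) {
  return typeof userAgent === 'string' && BOT_PATTERN.test(userAgent);
}

// Express-style middleware: records the result on res.locals for later handlers.
function detectBot(req, res, next) {
  res.locals.isBot = isBotUserAgent(req.get('User-Agent'));
  next();
}

// Usage sketch (assumes an Express app named `app`, a hypothetical `data`
// object, and a 'view' template; which audience gets which format is up to you):
// app.use(detectBot);
// app.get('/', (req, res) => {
//   if (res.locals.isBot) res.render('view', {});
//   else res.json(data);
// });
```

Keep in mind this only tells you what the client claims to be, so treat it as a hint rather than proof.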

Upvotes: 4

Rubén Norte

Reputation: 416

I recommend responding according to the requested MIME type (which is present in the "Accept" header). You can do this in Express like so:

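app.get('/route', function (req, res) {
    // req.accepts() negotiates on the Accept header and returns the best match
    // (or false); req.is() would test the request's own Content-Type instead.
    var type = req.accepts(['html', 'json']);
    if (type === 'json') res.json(data);
    else if (type === 'html') res.render('view', {});
    else ...
});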

Upvotes: 3
