Reputation: 54939
I want to determine whether incoming requests come from a bot (e.g. Google, Bing) or a human, and serve different data to each: for example, JSON data for client-side JavaScript to build the page, or preprocessed HTML.
Is there an easy way to do this with Express.js? Thanks.
Upvotes: 5
Views: 1974
Reputation: 16233
Checking the User-Agent request header or the MIME type, as suggested in other answers, is not reliable, since any HTTP client can set User-Agent and other headers at will.
The most reliable and secure approach is to check by IP.
Therefore I developed an NPM package that does exactly that. At startup it loads all known IP ranges used by Google bots and crawlers into memory, which keeps the middleware check very fast.
const express = require('express')
const isGCrawler = require('express-is-googlecrawler')
const app = express()
app.use(isGCrawler)
app.get('/', (req, res) => {
  res.send(res.locals.isGoogleCrawler) // Boolean
})
app.listen(3000)
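If you would rather not pull in a package, the general idea of an IP range check can be sketched in plain Node. This is only an illustration of the technique, not how the package above works internally: the helper names (ipv4ToInt, inCidr, isKnownCrawlerIp) are made up for this example, and the range in CRAWLER_RANGES is a documentation placeholder (192.0.2.0/24), not a real crawler range. You would have to fill it with the ranges the search engine actually publishes.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true if the IPv4 address falls inside the CIDR block (e.g. '192.0.2.0/24').
function inCidr(ip, cidr) {
  const [base, bits] = cidr.split('/');
  const prefix = Number(bits);
  // Build the network mask; a /0 prefix matches everything.
  const mask = prefix === 0 ? 0 : (~0 << (32 - prefix)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// ASSUMPTION: placeholder range (TEST-NET-1), NOT a real crawler range.
const CRAWLER_RANGES = ['192.0.2.0/24'];

// Hypothetical helper: true if the address is in any configured range.
function isKnownCrawlerIp(ip) {
  return CRAWLER_RANGES.some((range) => inCidr(ip, range));
}
```

In an Express handler you would call isKnownCrawlerIp(req.ip). Note that this sketch handles plain IPv4 only (not IPv6 or IPv4-mapped addresses like ::ffff:192.0.2.1), and behind a reverse proxy req.ip is only meaningful if the 'trust proxy' setting is configured.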
Upvotes: 0
Reputation: 38740
You can check req.header('User-Agent') for 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'. If it matches, the request identifies itself as Googlebot and you can send it different data.
http://www.google.com/support/webmasters/bin/answer.py?answer=1061943
How to get headers: http://expressjs.com/4x/api.html#req.get
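Comparing against the full string is brittle, because crawlers send several User-Agent variants. A looser check, as a rough sketch, is to look for the crawler token anywhere in the header. The names isBotUserAgent, detectBot and res.locals.isBot are made up for this example, and the token list is illustrative, not exhaustive.

```javascript
// ASSUMPTION: illustrative token list; extend it for the crawlers you care about.
const BOT_PATTERN = /googlebot|bingbot/i;

// Hypothetical helper: true if the User-Agent string contains a known crawler token.
function isBotUserAgent(userAgent) {
  return typeof userAgent === 'string' && BOT_PATTERN.test(userAgent);
}

// Express-style middleware: records the result on res.locals.isBot (an assumed name)
// so that later route handlers can branch on it.
function detectBot(req, res, next) {
  res.locals.isBot = isBotUserAgent(req.get('User-Agent'));
  next();
}
```

Register it with app.use(detectBot) and then choose between HTML and JSON in your route based on res.locals.isBot. Keep in mind that this only tells you what the client claims to be.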
Upvotes: 4
Reputation: 416
I recommend responding according to the requested MIME type (which is present in the "Accept" header). You can do this with Express this way:
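app.get('/route', function (req, res) {
  // req.accepts() inspects the Accept header and returns the best match
  // (req.is() would test the Content-Type of the request body instead).
  var type = req.accepts(['html', 'json']);
  if (type === 'json') res.json(data);
  else if (type === 'html') res.render('view', {});
  else ...
});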
Upvotes: 3