or123456

Reputation: 2199

Detect a web crawler request for a page in Angular Universal

I want to detect whether the current request for my server-side rendered (SSR) page comes from a web crawler or a normal user.
I want to do some things for a web crawler that I don't do for a user.
Specifically, I want to run ng-in-viewport when a web crawler accesses the page, so that it gets the complete data,
but not when a user is using my page.
How can I detect this?

Upvotes: 1

Views: 2372

Answers (1)

Sergii L.

Reputation: 101

I recently faced the same challenge. I have SSR for my Angular app, and I use TransferState to avoid repeating the same API calls in the browser after the user has already seen the content rendered on the server.

I have a view counter, and since the page is rendered on the server side, I want to avoid counting views made by crawlers.

So, here is what I ended up with:

In server.ts (the standard generated file for Angular SSR), make sure you pass the request object from Express. I have it like this:

server.get('*', (req, res) => {
  res.render(indexHtml, {
    req, // Express request
    res, // Express response
    providers: [
      { provide: APP_BASE_HREF, useValue: req.baseUrl },
    ],
  });
});

Then, in constants.ts or wherever you prefer to keep your constants, add a VIEWER injection token and a list of known crawlers:

import { InjectionToken } from '@angular/core';

export const VIEWER = new InjectionToken<string>('viewer');
export const CRAWLER_AGENTS = [
  'googlebot', 'yandexbot', 'yahoo', 'bingbot',
  'baiduspider', 'facebookexternalhit', 'twitterbot', 'rogerbot',
  'linkedinbot', 'embedly', 'quora link preview', 'showyoubot', 'outbrain',
  'pinterest/0.', 'developers.google.com/+/web/snippet',
  'slackbot', 'vkshare', 'w3c_validator', 'redditbot', 'applebot',
  'whatsapp', 'flipboard', 'tumblr', 'bitlybot', 'skypeuripreview',
  'nuzzel', 'discordbot', 'google page speed'
];

Then, in the providers of my app.module.ts, I added a new provider that holds whether the viewer is a bot or a user:

import { NgModule, Optional, PLATFORM_ID } from '@angular/core';
import { isPlatformBrowser } from '@angular/common';
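import { REQUEST } from '@nguniversal/express-engine/tokens';
import { Request } from 'express'; // Express request type; the global Request is the Fetch API one and has no get()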
import { CRAWLER_AGENTS, VIEWER } from './constants';

@NgModule({
  imports: [ /* ... */ ],
  declarations: [ /* ... */ ],
  providers: [
    {
      provide: VIEWER,
      useFactory: viewerFactory,
      deps: [PLATFORM_ID, [new Optional(), REQUEST]],
    },
  ],
})
export class AppModule {}

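export function viewerFactory(platformId: Object, req: Request | null): string {
  // In the browser there is no Express request, so the viewer is always a user.
  if (isPlatformBrowser(platformId)) {
    return 'user';
  }
  // REQUEST is optional, so guard against a missing request and a missing header.
  const userAgent = (req?.get('user-agent') || '').toLowerCase();
  const isCrawler = CRAWLER_AGENTS.some(crawlerAgent => userAgent.indexOf(crawlerAgent) !== -1);
  return isCrawler ? 'bot' : 'user';
}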

Yes, according to the Angular documentation, a provider dependency can be passed as an array. We need Optional to keep the app from crashing on the client side, where the Express request object obviously doesn't exist.
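If you want to check the matching logic without Angular or Express, the core of viewerFactory can be pulled into a plain function. This is only a minimal sketch: classifyUserAgent, Viewer and SAMPLE_CRAWLER_AGENTS are hypothetical names that are not part of my app, and the sample list is a shortened stand-in for CRAWLER_AGENTS.

```typescript
// Hypothetical standalone helper (not in the original app): the same
// substring check as viewerFactory, without any Angular or Express types.
type Viewer = 'bot' | 'user';

// Shortened stand-in for CRAWLER_AGENTS, only for this sketch.
const SAMPLE_CRAWLER_AGENTS: readonly string[] = ['googlebot', 'bingbot', 'twitterbot'];

function classifyUserAgent(
  userAgent: string | null | undefined,
  crawlerAgents: readonly string[] = SAMPLE_CRAWLER_AGENTS,
): Viewer {
  // A missing header is treated as an empty string, which matches no crawler.
  const normalized = (userAgent || '').toLowerCase();
  const isCrawler = crawlerAgents.some(agent => normalized.indexOf(agent) !== -1);
  return isCrawler ? 'bot' : 'user';
}

console.log(classifyUserAgent('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // bot
console.log(classifyUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36'));     // user
console.log(classifyUserAgent(undefined));                                                                   // user
```

Keep in mind that this is a plain substring match on a header the client controls, so it only recognizes crawlers that identify themselves honestly.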

Then, in your page component, you can check the viewer like so:

import { Component, Inject, Optional } from '@angular/core';
import { VIEWER } from './constants'; // adjust the path to wherever your constants file lives

@Component({
  selector: 'app-article-page',
  templateUrl: './article-page.component.html',
  styleUrls: ['./article-page.component.scss'],
})
export class ArticlePageComponent {
  constructor(
    @Optional() @Inject(VIEWER) private viewer: string | null,
  ) {
    const countViews = this.viewer !== 'bot';
  }
}
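For the case in the question (do extra work for crawlers, skip it for users), the decision can be derived from the same viewer value. The following is just a sketch under my own assumptions: behaviorFor, ViewerBehavior and renderAllSections are hypothetical names, and how you actually trigger ng-in-viewport from the flag depends on your component.

```typescript
// Hypothetical helper: turns the injected VIEWER value into behavior flags.
interface ViewerBehavior {
  countViews: boolean;        // same rule as above: count everything except bots
  renderAllSections: boolean; // assumption: crawlers should get the complete data
}

function behaviorFor(viewer: string | null | undefined): ViewerBehavior {
  // With @Optional() the value may be null; anything other than 'bot' is treated as a user.
  const isBot = viewer === 'bot';
  return { countViews: !isBot, renderAllSections: isBot };
}

console.log(behaviorFor('bot'));  // { countViews: false, renderAllSections: true }
console.log(behaviorFor('user')); // { countViews: true, renderAllSections: false }
console.log(behaviorFor(null));   // { countViews: true, renderAllSections: false }
```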

If you find my answer helpful, please don't forget to mark it as accepted.

Upvotes: 9
