David

Reputation: 4818

How does a crawler/search engine traverse the web?

How does the crawler of a commercial search engine traverse the web: by "identifying seed pages and finding other pages through connected links", or by "indexing every file under a website's wwwroot directory"?

If it is the latter, would the search engine also index things that are not referenced by any other page?

Upvotes: 0

Views: 475

Answers (1)

Maksym Polshcha

Reputation: 18358

A reference must exist. It can be:

  • a regular HTML href that is allowed for indexing
  • a link in an XML sitemap (e.g. sitemap.xml)
  • a link in robots.txt (for example, a Sitemap: entry) that the crawler is allowed to follow
  • a reference submitted by the webmaster through the search engine's webmaster console
  • etc.

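It could be any other link as well. A crawler only sees what it can reach over HTTP, so a file that nothing references is not discovered just because it sits in the web root.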
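To make the "a reference must exist" point concrete, here is a minimal sketch of seed-based traversal. It is not how any particular search engine is implemented; the `crawl` function, the `LinkExtractor` class, the `PAGES` dictionary, and the `example.test` URLs are all hypothetical names made up for illustration, and the "web" is an in-memory dictionary instead of real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch):
    """Breadth-first traversal from the seed URLs.

    fetch(url) is assumed to return the page's HTML, or None if the
    page cannot be retrieved. Returns the set of discovered URLs.
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: orphan.html exists on the server,
# but no other page links to it.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a>',
    "http://example.test/a.html": '<a href="b.html">B</a>',
    "http://example.test/b.html": '<a href="/">home</a>',
    "http://example.test/orphan.html": "<p>nothing links here</p>",
}

discovered = crawl(["http://example.test/"], PAGES.get)
print(sorted(discovered))
```

In this sketch `orphan.html` never shows up in `discovered`, because the crawler has no way to list the server's directory. It would only be found if its URL were added to the seeds, which is roughly the role a sitemap entry or a webmaster submission plays.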

Upvotes: 1
