Do web crawlers rely ONLY on links from homepage to do their crawling?

Question

My homepage has links to pages a.html and b.html. In the same directory with these 2 pages, I have pages c.html and d.html which are not linked to by any other pages.

My question is: do web crawlers also index c.html and d.html just because they are in the directory? Or do they only follow the links starting from the home page, indexing only the homepage plus pages a and b? Thanks.

Kiril · Accepted Answer

Web crawlers only know about links, so if nobody in the world has a link to pages c.html and d.html, then the likelihood that a crawler will find them is pretty close to 0.

Let's see how a crawler might find those:

Your home page only points to a.html and b.html, but if those pages have links to c/d.html, then a crawler will eventually find them.
If the above is not true, but you've given somebody links to c/d.html and they posted those links on some website online, then a crawler will eventually find them.
If you have a sitemap that lists c/d.html, then a crawler that reads sitemaps might eventually find them.

This assumes that the crawler is "good" and it's crawling long enough to get to a page which contains links to your c/d.html pages.
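
To make the point concrete, here is a minimal sketch of a link-following crawler run against an in-memory copy of the site from the question. Everything in it is an assumption made for illustration: the `example.com` URLs, the `SITE` dictionary standing in for real HTTP fetches, and the `LinkExtractor` and `crawl` names. Real crawlers are far more involved, but the reachability logic is roughly the same idea.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical site: URL -> HTML body. c.html and d.html exist on the
# server, but no page links to them.
SITE = {
    "http://example.com/": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "http://example.com/a.html": '<a href="/">Home</a>',
    "http://example.com/b.html": "<p>No links here</p>",
    "http://example.com/c.html": "<p>Orphan page</p>",
    "http://example.com/d.html": "<p>Orphan page</p>",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(seeds, site):
    """Breadth-first crawl from the seed URLs; returns the pages fetched."""
    seen = set(seeds)
    queue = deque(seeds)
    fetched = set()
    while queue:
        url = queue.popleft()
        body = site.get(url)  # stands in for an HTTP GET
        if body is None:
            continue  # URL does not exist on the site
        fetched.add(url)
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.hrefs:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return fetched


# Starting from the homepage only: the crawler never learns about c/d.html.
print(sorted(crawl(["http://example.com/"], SITE)))

# If c.html's URL is supplied some other way (a sitemap entry or a link
# posted elsewhere), it becomes a seed and does get fetched.
print(sorted(crawl(["http://example.com/", "http://example.com/c.html"], SITE)))
```

The first call reaches only the homepage, a.html and b.html. The crawler never lists the directory, so being in the same folder does nothing for c.html and d.html. The second call picks up c.html only because its URL was handed to the crawler directly.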
