badc0re

Reputation: 3523

Is there any way to find a site's URL folders?

This may be a weird question, but I am writing a spider and I am wondering whether there is any way to discover the folders under a site's URL, such as:

   mysite.com/drupal
   mysite.com/wordpress
   mysite.com/abc

Is there any way to find this kind of information?

Upvotes: 0

Views: 82

Answers (2)

Ned Batchelder

Reputation: 375754

Web sites don't typically advertise their entire set of URLs. You can try a few things:

  1. Read the main page, and follow the links on the page. Each leads to another page, which contains links, and so on.

  2. Guess at common folder names.

  3. Examine the robots.txt file if the site has one. You should be a good citizen and not retrieve pages it forbids you to fetch.

  4. Try to get the site's sitemap, as described here: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156184
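As a rough illustration of points 3 and 4, here is a minimal sketch using only the Python standard library. The function names `allowed` and `folders_from_sitemap` and the user agent `my-spider` are made up for this example, and it assumes you have already downloaded the text of robots.txt and of a sitemap that uses the standard sitemaps.org namespace. Note that the first path segment of a URL may be a file rather than a folder.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def allowed(robots_txt, url, user_agent="my-spider"):
    """Return True if the given robots.txt text lets user_agent fetch url.

    "my-spider" is a hypothetical user agent name.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def folders_from_sitemap(sitemap_xml):
    """Return the sorted first path segments of every <loc> URL in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    folders = set()
    for loc in root.iter(SITEMAP_NS + "loc"):
        path = urlparse((loc.text or "").strip()).path
        segments = [s for s in path.split("/") if s]
        if segments:
            folders.add(segments[0])
    return sorted(folders)
```

A spider could call `allowed(...)` before every request and seed its queue with the URLs listed in the sitemap. Folders that appear in neither place still have to be found by following links or by guessing.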

Upvotes: 1

bluevector

Reputation: 3493

If you implement a traditional spider, it will only traverse URLs it finds in the content as it goes along. You could try a dictionary check, or an every-string-in-the-universe check, at every directory level, but that wouldn't be playing nice.

So, the short answer is "no".
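To make the limitation concrete, here is a minimal sketch of such a traditional spider, assuming a caller-supplied `fetch(url)` function that returns a page's HTML as a string. `LinkCollector`, `crawl`, `fetch` and `max_pages` are names invented for this example. It does a breadth-first walk over same-host links, so a folder that no page links to is never visited.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl of same-host pages reachable from start_url.

    fetch is a hypothetical callable: it takes a URL and returns HTML text.
    Returns the list of URLs visited, in visiting order.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        collector = LinkCollector()
        collector.feed(fetch(url))
        for href in collector.links:
            # Resolve relative links and drop any #fragment.
            absolute = urldefrag(urljoin(url, href)).url
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

In a real spider, `fetch` would wrap an HTTP request and should respect robots.txt. Passing it in as a parameter also lets you try the crawl against an in-memory dictionary of pages.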

Upvotes: 0
