Jim

Reputation: 53

Wget spider a website to collect all links

I'm trying to spider this website to depth=2 and collect all the links (URLs). It seems like a simple task, but it appears to be impossible and I must be missing something: I get no URLs, just an empty text file. Here is the latest command I'm using (messy, I know):

wget --spider --force-html --span-hosts --user-agent="Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0" -np --limit-rate=20k -e robots=off --wait=3 --random-wait -r -l2 https://en.wikibooks.org/wiki/C%2B%2B_Programming 2>&1 | grep '^--' | awk '{ print $3 }' | grep -v '.(css\|js\|png\|gif\|jpg)$' | sort | uniq > urls.txt

Any ideas?

Upvotes: 4

Views: 5988

Answers (1)

Prateek Paranjpe

Reputation: 543

I would suggest doing it in two steps, for better readability and less clutter:

  1. Do the spidering and capture the output in a log file.
  2. Parse the log file to extract the URLs you are looking for.

For #1 -

wget --spider --force-html --span-hosts --user-agent="Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0" -np --limit-rate=20k -e robots=off --wait=3 --random-wait -r -l2 https://en.wikibooks.org/wiki/C%2B%2B_Programming -o wget.log &

Once #1 is done, you can go for #2.
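Because the command in #1 is backgrounded with &, make sure the crawl has actually finished before parsing the log. A minimal sketch, assuming both steps run from the same shell session:

# block until the backgrounded wget job finishes
wait
# optionally check the end of the log to confirm the crawl completed
tail -n 5 wget.log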

For #2 -

grep http wget.log | grep -vF '[following]' | awk '{print $3}' | grep -vE "\.css|\.js|\.png|\.gif|\.jpg" | sort -u > urls.txt

This will give you what you are looking for.
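To sanity-check the result, you can count and preview the collected URLs in urls.txt, for example:

# count the collected URLs
wc -l urls.txt
# preview the first few entries
head urls.txt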

Note: #1 will fetch everything it finds, and since you are going 2 levels deep, that might be a whole lot of data. You could add the "--delete-after" option to wget if you don't want to keep any of it on disk (e.g., if you only plan to use urls.txt to download things later), as sketched below.
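A minimal sketch of the #1 command with --delete-after simply added to the existing flags (same options and log file name as above; adjust to taste):

wget --spider --delete-after --force-html --span-hosts --user-agent="Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0" -np --limit-rate=20k -e robots=off --wait=3 --random-wait -r -l2 https://en.wikibooks.org/wiki/C%2B%2B_Programming -o wget.log &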

Upvotes: 6
