Reputation: 8268
I am trying to download all the images from this link. I only want images from the hydraulics section, so I used --no-parent.
When I run the command
wget -r --no-parent -e robots=off --user-agent="Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0" -A png http://indiabix.com/civil-engineering/hydraulics/
it only downloads the index.html.
I searched for this issue on the web, and Stack Overflow already has two questions about it:
but they did not help. I also started a bounty on the latter question. Can anyone suggest a workaround for my case?
Upvotes: 1
Views: 2414
Reputation: 5463
The answer depends on knowing the path to the images folder, so that it can be added to the list of directories to include (without the --include-directories
option the whole site will be fetched).
wget 'http://indiabix.com/civil-engineering/hydraulics/' --convert-links --adjust-extension --recursive --page-requisites --no-directories --directory-prefix=output --include-directories='/civil-engineering/hydraulics,/_files/images'
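If the same fetch is needed for other sections, the command above could be wrapped in a small function. This is only a sketch: the function name fetch_section and the WGET override variable are my own inventions for illustration, not part of wget, and the flags are simply the ones from the command above.

```shell
# Hypothetical wrapper around the wget invocation above.
# $1 = start URL, $2 = comma-separated list of directories to include.
# WGET can be overridden (e.g. WGET=echo) to print the arguments
# instead of hitting the network.
fetch_section() {
    base_url=$1
    include_dirs=$2
    ${WGET:-wget} --recursive --page-requisites --no-directories \
        --directory-prefix=output \
        --include-directories="$include_dirs" \
        "$base_url"
}
```

For the page in the question this would be called as fetch_section 'http://indiabix.com/civil-engineering/hydraulics/' '/civil-engineering/hydraulics,/_files/images'. Setting WGET=echo first gives a dry run that only prints the arguments.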
Upvotes: 0
Reputation: 11096
Quite simple:
The tiny icons ("View Answer" etc.) are part of a CSS definition for the anchor (background-image). As of now, wget will not parse the external CSS and pick images from there.
With -A png wget will even stop at the first file (.html) since it doesn't match.
I've succeeded in downloading everything with
lwp-rget --hier --nospace http://indiabix.com/civil-engineering/hydraulics/
The LWP Perl packages from CPAN need to be installed; on openSUSE you can search for them with: zypper se libwww
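If installing LWP is not an option, the icons referenced through background-image can in principle be collected by hand: pull the url(...) entries out of the stylesheet and feed them to wget. The following is a minimal sketch of the extraction step only. The function name extract_css_pngs is made up for illustration, and it assumes simple url(...) values without nested parentheses.

```shell
# Print every .png referenced via url(...) in CSS read from stdin.
# Assumption: url() values contain no ")" and are optionally quoted.
extract_css_pngs() {
    grep -oE 'url\([^)]*\)' |
        # Strip the leading url( and the trailing ), plus optional quotes.
        sed -E "s/^url\([[:space:]]*[\"']?//; s/[\"']?[[:space:]]*\)$//" |
        # Keep only PNG references.
        grep -E '\.png$'
}
```

A stylesheet saved locally (say style.css, a placeholder name) could then be processed with extract_css_pngs < style.css. The paths it prints are relative to the stylesheet or the site root, so they still have to be turned into absolute URLs before being passed to wget.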
Upvotes: 1