Reputation: 21556
I've got a file I need to retrieve, then I need to go through that file and download all the images listed. The format is xml, but I don't want to use an xml parser.
When I use
sudo wget --restrict-file-names=windows -nH -nd -r -i -P images \
  -A jpeg,jpg,gif,png https://url.com/api/ojgnvhy75hGvcf36dnJO0947bsh62gbs?_=1361842359357
I get the xml file downloaded, but I need the images which are referenced in that file.
What am I doing wrong here?
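(For anyone hitting the same symptom: `wget -r` only follows links it finds while parsing HTML and CSS, so it never looks inside an XML response. One workaround is to pull the image URLs out of the XML yourself and feed them back to `wget`. A minimal sketch; the `feed.xml` file here is a stand-in for the real API response, and the URLs are placeholders:)

```shell
# Sample XML standing in for the downloaded API response:
cat > feed.xml <<'EOF'
<gallery>
  <image>https://example.com/a.png</image>
  <image>https://example.com/b.jpg</image>
</gallery>
EOF

# wget -r will not parse this, so extract anything that looks like
# an image URL with grep and save the list to a file:
grep -oE 'https?://[^<"]+\.(jpeg|jpg|gif|png)' feed.xml > links.txt
cat links.txt

# Then hand the list to wget:
#   wget -nd -P images -i links.txt
```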
Upvotes: 0
Views: 511
Reputation: 21556
I ended up with the following code: fetch the XML file and save it as text, extract the links from the text file with sed and write them into another file, then run wget on that file to download the images.
#!/bin/dash
# Download the XML and save it as plain text
wget -O xml.txt 'https://url_to_download_from'
# Extract the image URLs; the w flag writes the matches to links.txt
sed -n "/image>/s/^ .\([^>]*\)<\/image>.*/\1/gpw links.txt" xml.txt
# Download every URL listed in links.txt
wget -N -P images -A png -i links.txt
Sadly, this results in a bunch of files that are not images, even though I'm requesting only images.
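(A likely cause of the non-image files: wget's `-A` accept list is enforced during recursive (`-r`) retrieval, but URLs supplied explicitly, on the command line or via `-i`, are downloaded as given. Filtering the link list before handing it to wget avoids the cleanup step. A sketch, assuming `links.txt` holds one URL per line as written by the sed above; the URLs here are placeholders:)

```shell
# Placeholder link list, one URL per line:
printf '%s\n' 'https://example.com/a.PNG' \
              'https://example.com/page.html' \
              'https://example.com/b.png' > links.txt

# Keep only URLs ending in .png (case-insensitive), since -A does
# not filter URLs that are supplied via -i:
grep -iE '\.png$' links.txt > png-links.txt
cat png-links.txt

# Then download only the filtered list:
#   wget -N -P images -i png-links.txt
```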
After this script has completed, I run the following commands to clean up the folder.
cd images
shopt -s extglob nocaseglob
rm !(*.png)
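(Note that `shopt` and the `!(*.png)` extglob pattern are bash features, so this cleanup has to run in bash even though the script above uses dash. A `find`-based sketch that works from any POSIX-ish shell, using an `images/` directory populated here with placeholder files:)

```shell
# Placeholder directory mimicking the download folder:
mkdir -p images
touch images/keep.png images/junk.html images/other.PNG

# Portable equivalent of rm !(*.png) with nocaseglob: delete every
# regular file directly under images/ whose name does not end in
# .png, matched case-insensitively.
find images -maxdepth 1 -type f ! -iname '*.png' -delete
ls images
```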
Upvotes: 0