Reputation: 143
I am trying to obtain clinical images of psoriasis patients from these two websites for research purposes:
http://www.dermis.net/dermisroot/en/31346/diagnose.htm
http://dermatlas.med.jhmi.edu/derm/
For the first site, I tried just saving the page with Firefox, but that only saved the thumbnails, not the full-sized images. I was able to access the full-sized images using a Firefox add-on called "downloadthemall", but it saved each image as part of a new HTML page, and I do not know of any way to extract just the images.
I also tried logging into one of my university's Linux machines and using wget to mirror the websites, but I was not able to get it to work and am still unsure why.
Consequently, I am wondering whether it would be easy to write a short script (or use whatever method is easiest) to (a) obtain the full-sized images linked from the first website, and (b) obtain all full-sized images on the second site with "psoriasis" in the filename.
I have been programming for a couple of years, but have zero experience with web development and would appreciate any advice on how to go about doing this.
Upvotes: 1
Views: 5474
Reputation: 9149
Why not use wget to recursively download images from the domain? Here is an example:
wget -r -P /save/location -A jpeg,jpg,bmp,gif,png http://www.domain.com
Here is the manual: http://www.gnu.org/software/wget/manual/wget.html
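If wget's recursion doesn't pick up the full-sized pictures (the add-on result suggests they sit behind intermediate HTML pages rather than on the listing itself), a short Python script can follow each link one level deep and save whatever images it finds there. This is only a rough sketch under that assumption - the function names, output folders, and the optional keyword filter are placeholders, not anything specific to those two sites:

import os
import urllib.request
from urllib.parse import urljoin
from html.parser import HTMLParser

class LinkAndImageCollector(HTMLParser):
    # Collects the href of every <a> tag and the src of every <img> tag.
    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

def parse(url):
    # Fetch a page and return its outgoing links and image URLs, made absolute.
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    collector = LinkAndImageCollector()
    collector.feed(html)
    return ([urljoin(url, h) for h in collector.links],
            [urljoin(url, s) for s in collector.images])

def download(url, out_dir):
    # Save one image under its original filename (query string stripped).
    os.makedirs(out_dir, exist_ok=True)
    name = url.rsplit("/", 1)[-1].split("?")[0] or "unnamed"
    urllib.request.urlretrieve(url, os.path.join(out_dir, name))

def scrape(index_url, out_dir, keyword=None):
    # Gather images from the index page and from every page it links to,
    # one level deep. If keyword is given, keep only filenames containing it.
    links, images = parse(index_url)
    for page in links:
        try:
            _, more = parse(page)
            images.extend(more)
        except Exception as exc:          # not every link is an HTML page
            print("skipping", page, "-", exc)
    for img in sorted(set(images)):
        if keyword and keyword.lower() not in img.lower():
            continue
        try:
            print("downloading", img)
            download(img, out_dir)
        except Exception as exc:
            print("failed", img, "-", exc)

# Example calls (untested against these sites); thumbnails will come along
# too, so you may still need to filter by file size or URL afterwards.
# scrape("http://www.dermis.net/dermisroot/en/31346/diagnose.htm", "dermis_images")
# scrape("http://dermatlas.med.jhmi.edu/derm/", "dermatlas_images", keyword="psoriasis")

Pass the keyword filter only for the second site, and keep the request rate modest if there are many pages.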
Upvotes: 2
Reputation: 1641
Try HTTrack Website Copier - it will download all of the images on a website. You can also try HTML Parser (http://htmlparser.sourceforge.net/); it can capture a site along with its resources if you use org.htmlparser.parserapplications.SiteCapturer.
Upvotes: 1