revy

Reputation: 4707

Selenium: how to extract all images from a website (including ones from javascript and css)

I need to extract all images from a website using Selenium. This should include images of any format (png, jpg, svg, etc.) referenced from HTML, CSS, and JavaScript. Simply collecting the <img> elements is therefore not sufficient (for example, any image loaded from a CSS style would be missed):

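from selenium.webdriver.common.by import By  # Selenium 4 locator style; find_elements_by_* was removed
images = driver.find_elements(By.TAG_NAME, 'img')  # not sufficient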

Is there a smarter approach than downloading and parsing every CSS and JavaScript file the site requires and using regex to look for image files?

Ideally, there would be a way to simply list the resources downloaded after the page loads, similar to the Network tab in Chrome DevTools.

Any ideas?

Upvotes: 1

Views: 1004

Answers (1)

Alif Jahan

Reputation: 795

This answer is adapted from "How to access Network panel on google chrome developer tools with selenium?". I only updated it slightly.

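# Ask the browser's Resource Timing API for every resource the page has loaded so far
resources = driver.execute_script("return window.performance.getEntriesByType('resource');")
for resource in resources:
    # 'img' matches <img> elements; also check e.g. 'css' for resources requested from stylesheets
    if resource['initiatorType'] == 'img':
        print(resource['name'])  # the resolved URL of the resource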
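
Since the question also asks about images coming from CSS and JavaScript, here is a rough sketch of one way to extend the filter above. It keeps an entry if it was initiated by an <img> element or if its URL path ends in a common image extension. The helper name image_urls and the IMAGE_EXTENSIONS list are my own assumptions, not part of Selenium or the browser API, and the extension list is certainly not exhaustive. Note that only resources already fetched will show up, so lazy-loaded images that have not been requested yet will be missed.

```python
from urllib.parse import urlparse

# Assumed, non-exhaustive list of image extensions; adjust to your needs.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp', '.ico', '.bmp')


def image_urls(entries):
    """Return unique URLs from Resource Timing entries that look like images.

    `entries` is the list of dicts that execute_script returns for
    window.performance.getEntriesByType('resource').
    """
    urls = []
    for entry in entries:
        url = entry.get('name', '')
        # Compare only the path so query strings like ?v=2 do not hide the extension.
        path = urlparse(url).path.lower()
        from_img_tag = entry.get('initiatorType') == 'img'
        if (from_img_tag or path.endswith(IMAGE_EXTENSIONS)) and url not in urls:
            urls.append(url)
    return urls


# Usage with a live driver (assumes the page has finished loading):
# entries = driver.execute_script("return window.performance.getEntriesByType('resource');")
# for url in image_urls(entries):
#     print(url)
```

Matching on the extension rather than only on initiatorType is what picks up images requested from stylesheets or scripts, at the cost of missing images served from URLs without a recognizable extension unless they came from an <img> element.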

Upvotes: 3
