Reputation: 4707
I need to extract all images from a website using Selenium. This should include images of any extension (png, jpg, svg, etc.) referenced from HTML, CSS and JavaScript. A simple extraction of all the <img> elements is therefore not sufficient (e.g. any image loaded from a CSS style will be missed):
images = driver.find_elements_by_tag_name('img') # not sufficient
Is there anything smarter than downloading and parsing every CSS and JavaScript file the website requires and using regex to look for image files?
It would be ideal if there were a way to just look at the resources downloaded after the page loads, something similar to the Network tab in Chrome DevTools.
Any idea?
Upvotes: 1
Views: 1004
Reputation: 795
This answer is adapted from "How to access Network panel on google chrome developer tools with selenium?", with a few small updates.
resources = driver.execute_script("return window.performance.getEntriesByType('resource');")
for resource in resources:
    if resource['initiatorType'] == 'img':  # check for other types if needed
        print(resource['name'])  # this is the original link of the file
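Since the question also asks about images loaded from CSS and JavaScript, filtering on `initiatorType == 'img'` alone may miss some of them. Below is a minimal sketch of one way to widen the filter: keep an entry if it was initiated by an `<img>` tag or if its URL path ends in a known image extension. The helper name `filter_image_urls` and the `IMAGE_EXTENSIONS` tuple are my own assumptions, not part of Selenium or the original answer, so adjust the extension list to your needs.

```python
from urllib.parse import urlparse

# Assumed list of image extensions; extend it as needed.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp', '.ico', '.bmp')


def filter_image_urls(resources):
    """Return unique URLs of resource entries that look like images.

    `resources` is a list of dicts shaped like the entries returned by
    window.performance.getEntriesByType('resource').
    """
    urls = []
    seen = set()
    for resource in resources:
        url = resource.get('name', '')
        # Compare only the path, so query strings like ?v=1 are ignored.
        path = urlparse(url).path.lower()
        is_img_tag = resource.get('initiatorType') == 'img'
        has_image_ext = path.endswith(IMAGE_EXTENSIONS)
        if (is_img_tag or has_image_ext) and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Usage with a live session (assumes `driver` is an existing Selenium WebDriver):
# resources = driver.execute_script(
#     "return window.performance.getEntriesByType('resource');")
# for url in filter_image_urls(resources):
#     print(url)
```

Note that this only sees resources the browser has actually requested by the time the script runs, so lazily loaded images may need a scroll or a wait first.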
Upvotes: 3