Selecting href of link with image inside using xpath

Question

I'm using Scrapy to write a scraper that finds links with images inside them and grabs each link's href. The page I'm scraping is populated with image thumbnails, and clicking a thumbnail links to a full-size version of the image. I'd like to grab the full-size images.

The html looks somewhat like this:

And I want to grab "example.com/full_size_image.jpg".

My current method of doing so is

img_urls = scrapy.Selector(response).xpath('//a/img/..').xpath("@href").extract()

But I'd like to reduce that to a single XPath expression, as I plan to allow the user to enter their own XPath expression string.

alecxe · Accepted Answer

You can check whether an element has a particular child element by using a predicate:

response.xpath('//a[img]/@href').extract()

Note that I'm using the response.xpath() shortcut and providing a single XPath expression.
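
To see what the [img] predicate does without running a spider, here is a minimal sketch that uses only Python's standard library rather than Scrapy. The markup, the thumbnail file name, and the helper name hrefs_of_image_links are made up for illustration. ElementTree supports the a[img] predicate but cannot select attributes inside the path, so the href is read in Python instead of with /@href. Like the original //a/img/.. expression, a[img] only matches links whose img is a direct child; in full XPath 1.0, //a[.//img]/@href would also match images nested deeper inside the link.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the thumbnail name and the second link are invented.
SAMPLE = (
    "<div>"
    '<a href="example.com/full_size_image.jpg"><img src="thumbnail.jpg" /></a>'
    '<a href="example.com/about.html">About</a>'
    "</div>"
)


def hrefs_of_image_links(markup):
    """Return the href of every <a> element that has an <img> child."""
    root = ET.fromstring(markup)
    # a[img] keeps only <a> elements with at least one direct <img> child.
    # ElementTree has no attribute step, so href is read from each element.
    return [a.get("href") for a in root.findall(".//a[img]")]


print(hrefs_of_image_links(SAMPLE))
```

Only the first link is returned, because the second one contains text rather than an image.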
