Issues fetching href links from the Amazon website: XPath finds many more href links than expected

Question

I'm trying to collect the URL of each video listed on the Amazon page below.

https://www.amazon.com/video-Prime/s?ie=UTF8&page=1&rh=n%3A2858778011%2Ck%3Avideo

I'm using scrapy shell to test my code interactively. I started it like this:

scrapy shell 'https://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Dinstant-video&field-keywords=video&rh=n%3A2858778011%2Ck%3Avideo'

The response status is 200. Then, in scrapy shell, I tried to extract all the video URLs with this XPath selector:

response.xpath("//ul[contains(@id, 's-results-list-atf')]/li//a/@href").extract()

I got far more href links than expected. Judging by the page's HTML, that does not make sense to me: there are ten videos on the page and only one href link per video. I can't work out why this happens. I'd appreciate any help. Thanks in advance.

Andersson · Accepted Answer

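Your expression selects every a element anywhere inside each li, so it presumably also picks up links other than the title link. Try the XPath below, which matches only a elements that have an h2 child: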

//ul[@id="s-results-list-atf"]//a[h2]/@href
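The difference between the two expressions can be sketched on a small example. The markup below is hypothetical and heavily simplified; it is not Amazon's real HTML, and the assumption that each result has extra image and review links is only for illustration. The sketch uses Python's standard xml.etree.ElementTree, whose limited XPath support has no /@href step, so the attribute is read with .get("href") instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (NOT Amazon's actual HTML):
# each result has an image link, a title link wrapping an <h2>, and a reviews link.
SAMPLE = """
<html><body>
<ul id="s-results-list-atf">
  <li>
    <a href="/video-1"><img src="1.jpg" /></a>
    <a href="/video-1"><h2>Video 1</h2></a>
    <a href="/video-1/reviews">Reviews</a>
  </li>
  <li>
    <a href="/video-2"><img src="2.jpg" /></a>
    <a href="/video-2"><h2>Video 2</h2></a>
    <a href="/video-2/reviews">Reviews</a>
  </li>
</ul>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)

# Broad selection: every <a> anywhere under each <li>, as in the question.
all_links = [
    a.get("href")
    for a in root.findall(".//ul[@id='s-results-list-atf']/li//a")
]

# Narrow selection: only <a> elements that have an <h2> child, as in the answer.
title_links = [
    a.get("href")
    for a in root.findall(".//ul[@id='s-results-list-atf']//a[h2]")
]

print(len(all_links), "links with the broad expression")
print(title_links)
```

In scrapy shell the equivalent call would be response.xpath('//ul[@id="s-results-list-atf"]//a[h2]/@href').extract(), since Scrapy's selectors support the /@href step directly.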
