Scrapy selectors return all on page instead of relative

Question

I am using Scrapy to crawl a website that contains a list of items. However, when I loop over the items, a relative XPath returns every matching element on the entire page rather than only those inside the current item. I was using Scrapy 0.24, and upgrading to the latest version (1.0) did not fix the issue.

I have also tried running this inside a virtualenv to rule out conflicts with other libraries on my system, with no success.

for sel in response.xpath('//ul[@class="items"]//div[@class="item"]'):
    item = CrawledItem()
    item['id'] = sel.xpath('.//input[@name="id"]/@value').extract()

I have tried debugging with scrapy parse and noticed that the first item's list of ids contains every matching id on the page, and each subsequent item's list gets shorter, until the last item matches only a single id. I was expecting a single id per item; instead I'm getting a response similar to the one below.

[
    {
        'id': [1,2,3,4,5,6,7,8,9,10]
    },
    {
        'id': [1,2,3,4,5,6,7,8,9]
    },
    [..] // omitted
    {
        'id': [10]
    }
]

I have also tried CSS selectors, with no success. My understanding was that the .// prefix makes an XPath relative to the current selector. How can I make sure that I'm ONLY selecting relative to the current selector?
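To illustrate what .// is expected to do, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selectors (an assumption made so the example is self-contained; it is not Scrapy's API). Both markup samples and the helper ids_per_item are hypothetical and do not come from the crawled site. The second sample shows one possible way to get the shrinking lists described above: if the item divs end up nested inside each other (for example, after a parser repairs unclosed tags), a relative query from each div still sees every id below it. Whether that is what happens on the real page is not confirmed.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: sibling items, each holding exactly one id input.
FLAT = """
<ul class="items">
  <li><div class="item"><input name="id" value="1"/></div></li>
  <li><div class="item"><input name="id" value="2"/></div></li>
  <li><div class="item"><input name="id" value="3"/></div></li>
</ul>
"""

# Hypothetical markup: each item div is nested inside the previous one.
NESTED = """
<ul class="items">
  <div class="item"><input name="id" value="1"/>
    <div class="item"><input name="id" value="2"/>
      <div class="item"><input name="id" value="3"/></div>
    </div>
  </div>
</ul>
"""


def ids_per_item(markup):
    """Return, for each item div, the ids found relative to that div."""
    root = ET.fromstring(markup)
    results = []
    # Find every item div below the root, in document order.
    for item in root.findall(".//div[@class='item']"):
        # The leading "." keeps the search inside the current item.
        ids = [inp.get("value") for inp in item.findall(".//input[@name='id']")]
        results.append(ids)
    return results


flat_ids = ids_per_item(FLAT)
nested_ids = ids_per_item(NESTED)
print(flat_ids)    # [['1'], ['2'], ['3']]
print(nested_ids)  # [['1', '2', '3'], ['2', '3'], ['3']]
```

With the flat markup each item yields one id, so the relative query behaves as expected. With the nested markup the query is still relative, but each item contains the later ones, so the lists shrink one id at a time. Inspecting the parsed tree (for instance in scrapy shell) would show whether the item elements are siblings or nested.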
