Reputation:
I would like to crawl a set of web pages using Scrapy. However, when I export the scraped items to a JSON file, some of the fields come back empty.
Here is my code:
import scrapy

class LLPubs(scrapy.Spider):
    name = "linlinks"
    start_urls = [
        'http://www.linnaeuslink.org/records/record/1',
        'http://www.linnaeuslink.org/records/record/2',
    ]

    def parse(self, response):
        for container in response.css('div.item'):
            yield {
                'text': container.css('div.field.soulsbyNo .value span::text').extract(),
                'uniformtitle': container.css('div.field.uniformTitle .value span::text').extract(),
                'title': container.css('div.field.title .value span::text').extract(),
                'opac': container.css('div.field.localControlNo .value span::text').extract(),
                'url': container.css('div#digitalLinks li a').extract(),
                'partner': container.css('div.logoContainer img:first-child').xpath('@src').extract(),
            }
And an example of my output:
{
    "text": ["Soulsby no. 46(1)"],
    "uniformtitle": ["Systema naturae"],
    "title": ["Caroli Linn\u00e6i ... Systema natur\u00e6\nin quo natur\u00e6 regna tria, secundum classes, ordines, genera, species, systematice proponuntur."],
    "opac": ["002178079"],
    "url": [],
    "partner": []
},
I am hoping I am doing something silly and easy to fix! Both of the selectors I am using for "url" and "partner" were working when I tested them from here:
scrapy shell 'http://www.linnaeuslink.org/records/record/1'
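Inside the shell, the checks looked something like this (output elided, but both lists were non-empty):

>>> response.css('div#digitalLinks li a').extract()
[...]  # a non-empty list of <a> elements
>>> response.css('div.logoContainer img:first-child').xpath('@src').extract()
[...]  # a non-empty list of image paths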
So, I just don't know what I am missing.
Oh, and I am exporting to JSON using this command for now:
scrapy crawl linlinks -o quotes.json
Thanks for your help!
Upvotes: 0
Views: 640
Reputation: 1548
The problem seems to be that those two selectors are not "findable" inside any div.item. You probably validated them in the shell against the whole page, without first narrowing the scope with response.css('div.item'). To replicate what you used in the shell, just replace container.css with response.css for the url and partner keys.
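In other words, the spider would look something like this (a sketch assuming, as the empty output suggests, that div#digitalLinks and the logo container sit outside div.item; only the last two selectors change):

import scrapy

class LLPubs(scrapy.Spider):
    name = "linlinks"
    start_urls = [
        'http://www.linnaeuslink.org/records/record/1',
        'http://www.linnaeuslink.org/records/record/2',
    ]

    def parse(self, response):
        for container in response.css('div.item'):
            yield {
                'text': container.css('div.field.soulsbyNo .value span::text').extract(),
                'uniformtitle': container.css('div.field.uniformTitle .value span::text').extract(),
                'title': container.css('div.field.title .value span::text').extract(),
                'opac': container.css('div.field.localControlNo .value span::text').extract(),
                # These two elements apparently live outside div.item, so
                # search the whole page, exactly as the shell session did:
                'url': response.css('div#digitalLinks li a').extract(),
                'partner': response.css('div.logoContainer img:first-child').xpath('@src').extract(),
            }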
Upvotes: 1