Shubham B.

Reputation: 59

Not able to store data scraped with Scrapy in JSON or CSV format

I want to store the data from a list on a web page. When I run the commands

response.css('title::text').extract_first()
response.css("article div#section-2 li::text").extract()

individually in the Scrapy shell, each one shows the expected output. Below is my code, which does not store the data in JSON or CSV format:

import scrapy

class QuotesSpider(scrapy.Spider):
    name = "medical"

    start_urls = ['https://medlineplus.gov/ency/article/000178.html/']


    def parse(self, response):
        yield
        {
            'topic': response.css('title::text').extract_first(),
            'symptoms': response.css("article div#section-2 li::text").extract()
        }

I have tried to run this code using

scrapy crawl medical -o medical.json

Upvotes: 1

Views: 251

Answers (1)

alecxe

Reputation: 473863

You need to fix your URL: it is https://medlineplus.gov/ency/article/000178.htm, not https://medlineplus.gov/ency/article/000178.html/.

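Also, and more importantly, the yield in your parse() stands alone on its own line, so it yields None and the dict on the following lines is never emitted. Scrapy accepts a plain dict, so moving the opening brace onto the same line as yield is enough. Alternatively, you can define an Item class and yield/return it from the parse() callback of your spider: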

import scrapy


class MyItem(scrapy.Item):
    topic = scrapy.Field()
    symptoms = scrapy.Field()


class QuotesSpider(scrapy.Spider):
    name = "medical"

    allowed_domains = ['medlineplus.gov']
    start_urls = ['https://medlineplus.gov/ency/article/000178.htm']

    def parse(self, response):
        item = MyItem()

        item["topic"] = response.css('title::text').extract_first()
        item["symptoms"] = response.css("article div#section-2 li::text").extract()

        yield item
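
For reference, here is a minimal sketch, without Scrapy, of what the question's parse() does when yield sits on a line by itself. The function names parse_broken and parse_fixed and the sample value are hypothetical and only illustrate the Python behaviour; they are not part of the original spider.

```python
def parse_broken():
    # A bare `yield` is a complete statement, so this yields None.
    yield
    # The dict below is a separate expression statement:
    # it is built and then discarded, never yielded.
    {
        'topic': 'example topic',  # hypothetical sample value
    }


def parse_fixed():
    # The opening brace is on the same line as `yield`,
    # so the dict itself is the yielded value.
    yield {
        'topic': 'example topic',  # hypothetical sample value
    }


print(list(parse_broken()))  # [None]
print(list(parse_fixed()))   # [{'topic': 'example topic'}]
```

In the spider, the broken form hands nothing to the feed exporter, which would explain why the output file had no data even though the selectors work in the shell.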

Upvotes: 1
