python django scrapy return item to controller

Question

I need to do some short real-time scraping and return the resulted data in my Django REST controller.

trying to make use scrapy:

import scrapy
from scrapy.selector import Selector

from . models import Product

class MysiteSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'https://www.something.com/browse?q=dfd',
    ]
    allowed_domains = ['something.com']

    def parse(self, response):
        items_list = Selector(response).xpath('//li[@itemprop="itemListElement"]')

        for value in items_list:
            item = Product()
            item['picture_url'] = value.xpath('img/@src').extract()[0]
            item['title'] = value.xpath('h2').text()
            item['price'] = value.xpath('p[contains(@class, "ad-price")]').text()
            yield item

items model

import scrapy


class Product(scrapy.Item):
    name = scrapy.Field()
    price = scrapy.Field()
    picture_url = scrapy.Field()
    published_date = scrapy.Field(serializer=str)

according to Scrapy architecture, items will be returned to the Item Pipeline (https://doc.scrapy.org/en/1.2/topics/item-pipeline.html) which is used to store the data to DB/save to files and so on.

However i'm stuck with the question - how can i return the scraped items list through the Django REST APIview?

Expected usage example:

from rest_framework.views import APIView
from rest_framework.response import Response

from .service.mysite_spider import MysiteSpider

    class AggregatorView(APIView):
        mysite_spider = MysiteSpider()

        def get(self, request, *args, **kwargs):

            self.mysite_spider.parse()

            return Response('good')

python django scrapy return item to controller

Answers (1)

Related Questions