Reputation: 653
I am making an API which returns a JsonResponse containing text scraped with Scrapy. When I run the Scrapy script on its own it works perfectly, but when I try to integrate it with Django I get no output.
What I want is simply to return the response to the request (which in my case is a Postman POST request).
Here is the code I am trying:
    from django.http import HttpResponse, JsonResponse
    from django.views.decorators.csrf import csrf_exempt
    import scrapy
    from scrapy.crawler import CrawlerProcess

    @csrf_exempt
    def some_view(request, username):
        process = CrawlerProcess({
            'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
            'LOG_ENABLED': 'false'
        })
        process_test = process.crawl(QuotesSpider)
        process.start()
        return JsonResponse({'return': process_test})

    class QuotesSpider(scrapy.Spider):
        name = "quotes"

        def start_requests(self):
            urls = [
                'http://quotes.toscrape.com/random',
            ]
            for url in urls:
                yield scrapy.Request(url=url, callback=self.parse)

        def parse(self, response):
            return response.css('.text::text').extract_first()
I am very new to Python and Django. Any kind of help would be much appreciated.
Upvotes: 1
Views: 2228
Reputation: 9224
In your code, process_test is the Deferred returned by process.crawl(), not the output from the crawling.
You need additional configuration to make your spider store its output "somewhere". See this SO Q&A about writing a custom pipeline.
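As a hedged sketch of one common pattern (an alternative to a full pipeline): collect scraped items in memory via Scrapy's item_scraped signal, then build the response from that list once the crawl finishes. The names run_spider and collect_item below are illustrative, not part of Scrapy's API:

    import scrapy
    from scrapy import signals
    from scrapy.crawler import CrawlerProcess

    class QuotesSpider(scrapy.Spider):
        name = "quotes"
        start_urls = ['http://quotes.toscrape.com/random']

        def parse(self, response):
            # Yield a dict item rather than returning a bare string,
            # so the item_scraped signal fires with structured data.
            yield {'text': response.css('.text::text').extract_first()}

    def run_spider():
        """Run the spider and return the scraped items as a list."""
        results = []

        def collect_item(item, response, spider):
            results.append(item)

        process = CrawlerProcess({'LOG_ENABLED': False})
        crawler = process.create_crawler(QuotesSpider)
        crawler.signals.connect(collect_item, signal=signals.item_scraped)
        process.crawl(crawler)
        process.start()  # blocks until the crawl finishes
        return results

Be aware that process.start() can only be called once per Python process (Twisted's reactor is not restartable), so this pattern suits a one-shot script but will fail on the second request in a long-running Django server.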
If you just want to synchronously retrieve and parse a single page, you may be better off using requests to retrieve the page, and parsel to parse it.
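A minimal sketch of that synchronous approach, assuming requests and parsel are installed (pip install requests parsel); the function name get_random_quote is illustrative:

    import requests
    from parsel import Selector

    def get_random_quote():
        # Fetch the page synchronously, then parse it with the same
        # CSS selector the spider used.
        response = requests.get('http://quotes.toscrape.com/random')
        selector = Selector(text=response.text)
        return selector.css('.text::text').get()

Since this returns a plain string, the Django view can wrap it directly: return JsonResponse({'quote': get_random_quote()}).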
Upvotes: 0