Reputation: 81
python
I am using Scrapy to scrape data from a website, where I want to get each graphics card's title, price, and whether it is in stock. The problem is that my code loops over the products twice, so instead of 10 products I get 20.
import scrapy


class ThespiderSpider(scrapy.Spider):
    name = 'Thespider'
    start_urls = ['https://www.czone.com.pk/graphic-cards-pakistan-ppt.154.aspx?page=2']

    def parse(self, response):
        data = {}
        cards = response.css('div.row')
        for card in cards:
            for c in card.css('div.product'):
                data['Title'] = c.css('h4 a::text').getall()
                data['Price'] = c.css('div.price span::text').getall()
                data['Stock'] = c.css('div.product-stock span.product-data::text').getall()
                yield data
Upvotes: 0
Views: 101
Reputation: 2335
You're using a nested for loop where a single loop is enough.
Each card can be selected directly with the CSS selector response.css('div.product'):
def parse(self, response):
    data = {}
    cards = response.css('div.product')
    for card in cards:
        data['Title'] = card.css('h4 a::text').getall()
        data['Price'] = card.css('div.price span::text').getall()
        data['Stock'] = card.css('div.product-stock span.product-data::text').getall()
        yield data
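One possible explanation for the exact doubling is that the page nests one div.row inside another, so every product is reachable from two matching rows. That is an assumption and has not been checked against the real site. The sketch below reproduces the effect on made-up markup, using the standard library's ElementTree in place of Scrapy so it runs without a network connection or extra packages. The HTML snippet and the helper names nested_titles and flat_titles are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (not the real page): an outer div.row wrapping
# an inner div.row that holds the products.
HTML = """
<div class="row">
  <div class="row">
    <div class="product"><h4><a>Card A</a></h4></div>
    <div class="product"><h4><a>Card B</a></h4></div>
  </div>
</div>
"""

root = ET.fromstring(f"<body>{HTML}</body>")


def nested_titles(root):
    """Mimic the question's loop: for every row, visit every product inside it."""
    titles = []
    for row in root.findall(".//div[@class='row']"):
        # Both the outer and the inner row contain the same products,
        # so each product is collected once per enclosing row.
        for product in row.findall(".//div[@class='product']"):
            titles.append(product.findtext("h4/a"))
    return titles


def flat_titles(root):
    """Mimic the answer's loop: select the products directly, once each."""
    return [p.findtext("h4/a") for p in root.findall(".//div[@class='product']")]


print(len(nested_titles(root)), len(flat_titles(root)))
```

With two enclosing rows, the nested version collects every title twice, while the flat version collects each title once. In a real spider it may also be worth creating a fresh dict inside the loop instead of reusing one shared data dict for every item.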
Also, consider using get() instead of getall(). getall() returns a list; you'll probably want a string, which is what get() gives you.
Upvotes: 1