Reputation: 135
I'm trying to learn how to use Scrapy and Python, but I'm far from an expert. I always end up with an empty file after crawling this page (a Cdiscount product page), and I don't understand why.
Here is my code:
import scrapy
from cdiscount_test.items import CdiscountTestItem

f = open('items.csv', 'w').close()

class CdiscountsellersspiderSpider(scrapy.Spider):
    name = 'CDiscountSellersSpider'
    allowed_domains = ['cdiscount.com']
    start_urls = ['http://www.cdiscount.com/mpv-8732-SATENCO.html']

    def parse(self, response):
        items = CdiscountTestItem()
        name = response.xpath('//div[@class="shtName"]/div[@class="shtOver"]/h1[@itemprop="name"]/text()').extract()
        country = response.xpath('//div[@class="shtName"]/span[@class="shTopCExp"]/text()').extract()
        items['name_seller'] = ''.join(name).strip()
        items['country_seller'] = ''.join(country).strip()
        pass
And this is the output I get in the cmd window:
2017-06-20 18:01:50 [scrapy.core.engine] INFO: Spider opened
2017-06-20 18:01:50 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-06-20 18:01:50 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.cdiscount.com/robots.txt> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.cdiscount.com/mpv-8732-SATENCO.html> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] INFO: Closing spider (finished)
Can someone help me, please?
Thanks a lot!!!
Upvotes: 1
Views: 12751
Reputation: 3364
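One likely cause of this kind of issue is that the website's content is generated dynamically, typically by JavaScript running in the browser. You can check this by opening the page and choosing View Page Source: if the data you want is missing from the raw HTML, Scrapy never receives it. In such cases, you might have to use Splash along with Scrapy.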
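The View Page Source check can also be done in code. The following is a minimal sketch using only the standard library: it tests whether a CSS class used in the XPath (here shtName, taken from the question) appears anywhere in the raw HTML. The helper names ClassFinder and has_class_in_static_html, as well as the two sample pages, are made up for illustration.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any tag in the document carries a given CSS class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "class" and value and self.wanted_class in value.split():
                self.found = True


def has_class_in_static_html(html, wanted_class):
    """Return True if wanted_class appears in the raw (unrendered) HTML."""
    finder = ClassFinder(wanted_class)
    finder.feed(html)
    return finder.found


# Hypothetical sample pages, for illustration only.
static_page = '<div class="shtName"><h1 itemprop="name">SATENCO</h1></div>'
dynamic_page = '<div id="app"></div><script src="bundle.js"></script>'

print(has_class_in_static_html(static_page, "shtName"))   # True
print(has_class_in_static_html(dynamic_page, "shtName"))  # False
```

For a real page, you would pass in response.text from the spider, or the HTML saved from the browser's View Page Source. A False result suggests the element is built client-side, which is the case where Splash may help.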
Upvotes: 3
Reputation: 10210
The main problem here is that you don't pass the item from the parse method back to the Scrapy engine. The last statement in parse is pass, so the item is simply discarded. Instead, you need to hand the item from the spider to the Scrapy engine for further processing using yield item (in your code the variable is named items, so that would be yield items).
Upvotes: 1