Reputation: 1
I am trying to run this spider and save the output as a CSV file, but the CSV file is empty. Is there something wrong with the code? Please help. Thanks in advance.
from scrapy.spider import Spider
from scrapy.selector import Selector
from amazon.items import AmazonItem

class AmazonSpider(Spider):
    name = "amazon"
    allowed_domains = ["amazon.com"]
    start_urls = [
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780316324106",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780307959478",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780345549334"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@class="fstRow prod celwidget"]')
        items = []
        for site in sites:
            item = AmazonItem()
            item['url'] = response.url
            item['price'] = site.xpath('//ul[@class="rsltL"]/li[5]/a/span/text()')
            if item['price']:
                item['price'] = item['price'].extract()[0]
            else:
                item['price'] = "NA"
                items.append(item)
        return items
If the price is not found, I would like to save "NA" in its place.
When I try the code below, it works fine:
from scrapy.spider import Spider
from scrapy.selector import Selector
from amazon.items import AmazonItem

class AmazonSpider(Spider):
    name = "amazon"
    allowed_domains = ["amazon.com"]
    start_urls = [
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780316324106",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780307959478",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780345549334"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@class="fstRow prod celwidget"]')
        items = []
        for site in sites:
            item = AmazonItem()
            item['url'] = response.url
            item['price'] = site.xpath('//ul[@class="rsltL"]/li[5]/a/span/text()')
            items.append(item)
        return items
What is wrong with this part? Or did I forget something?
if item['price']:
    item['price'] = item['price'].extract()[0]
else:
    item['price'] = "NA"
I am new to this. Could you please help me? Thank you very much.
Upvotes: 0
Views: 275
Reputation: 166
You don't need the items list; just use a yield statement:
def parse(self, response):
    sel = Selector(response)
    for site in sel.xpath('//div[@class="fstRow prod celwidget"]'):
        item = AmazonItem()
        item['url'] = response.url
        price = site.xpath('//ul[@class="rsltL"]/li[5]/a/span/text()')
        if price:
            item['price'] = price.extract()[0]
        else:
            item['price'] = "NA"
        yield item
Save to data.csv file:
scrapy crawl amazon -o data.csv -t csv
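To see the yield pattern with the "NA" fallback in isolation, here is a minimal sketch in plain Python without Scrapy. Each inner list stands in for what price.extract() might return for one result. The names first_or_default and parse_rows and the example.com URL are made up for illustration and are not part of Scrapy.

```python
def first_or_default(values, default="NA"):
    """Return the first extracted value, or `default` when nothing matched."""
    return values[0] if values else default


def parse_rows(url, rows):
    """Yield one item per row instead of collecting them in a list.

    `rows` is a stand-in for the extracted price lists (an assumption
    for this sketch); an empty list means the XPath matched nothing.
    """
    for extracted in rows:
        yield {"url": url, "price": first_or_default(extracted)}


# One row with a price, one row where nothing was extracted.
items = list(parse_rows("http://example.com/s", [["$12.99"], []]))
print(items)
```

Because the yield sits at the loop level rather than inside either branch, every row produces an item whether or not a price was found.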
Upvotes: 0
Reputation: 7889
It looks like items.append(item) is indented too far in your first code sample. That makes it part of the else block of your price check, so an item is added to the items list only when no price was found. With the indentation corrected:
from scrapy.spider import Spider
from scrapy.selector import Selector
from amazon.items import AmazonItem

class AmazonSpider(Spider):
    name = "amazon"
    allowed_domains = ["amazon.com"]
    start_urls = [
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780316324106",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780307959478",
        "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9780345549334"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@class="fstRow prod celwidget"]')
        items = []
        for site in sites:
            item = AmazonItem()
            item['url'] = response.url
            item['price'] = site.xpath('//ul[@class="rsltL"]/li[5]/a/span/text()')
            if item['price']:
                item['price'] = item['price'].extract()[0]
            else:
                item['price'] = "NA"
            items.append(item)
        return items
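The effect of that one indentation level can be reproduced without Scrapy. The sketch below is illustrative only: collect_buggy and collect_fixed are made-up names, and plain strings (or None for a missing price) stand in for the selector results.

```python
def collect_buggy(prices):
    """append() is nested inside the else branch, as described above."""
    items = []
    for price in prices:
        item = {}
        if price:
            item["price"] = price
        else:
            item["price"] = "NA"
            items.append(item)  # too deep: runs only when the price is missing
    return items


def collect_fixed(prices):
    """append() is at loop level, so every item is kept."""
    items = []
    for price in prices:
        item = {}
        if price:
            item["price"] = price
        else:
            item["price"] = "NA"
        items.append(item)  # runs on every iteration
    return items


prices = ["$12.99", None, "$8.50"]
buggy = collect_buggy(prices)
fixed = collect_fixed(prices)
print(len(buggy), len(fixed))
```

If every result has a price, the buggy version returns an empty list, which would explain an empty CSV file.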
Upvotes: 1