user2144217

Reputation: 1

IndentationError: unexpected indent on def parse_item(self, response) in a Scrapy spider

I am using Scrapy to get some information from all pages of a website. Here is my dmoz_spider.py file. When I run it, I get an IndentationError. Please help me out.

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from scrapy.item import Item, Field
import string
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
class EypItem(Item):
    title = Field()
    link = Field()
    price = Field()
    review = Field()
class eypSpider(CrawlSpider):
    name = "dmoz"
    allowed_domains =["http://www.walgreens.com"]
    start_urls =["http://www.walgreens.com/search/results.jsp?Ntt=allergy%20medicine"] 
rules = (Rule(SgmlLinkExtractor(allow=('/search/results\.jsp', )), callback='parse_item', follow= True),)
    def parse_item(self, response):
    self.log('Hi, this is an item page! %s' % response.url)
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//div[@id="productGrid"]')
        items = []
        for site in sites:
            itemE = EypItem()
            itemE["title"] = site.select('//*[@class="image-container"]/a/img/@alt').extract()
            itemE["link"] = site.select('//*[@class="image-container"]/a/img/@src').extract()
            itemE["price"] = site.select('//*[@class="pricing"]/div/p/text()').extract()
            itemE["review"] = site.select('//*[@class="reviewSnippet"]/div/div/span/text()').extract()
            items.append(itemE)
        return items

Upvotes: 0

Views: 1591

Answers (1)

Talvalin

Reputation: 7889

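Aside from the indentation error, your allowed_domains is specified incorrectly. It should contain bare domain names, not URLs: Scrapy compares these entries against the hostname of each request, so an entry that includes a scheme never matches and followed links get filtered as offsite. Remove the "http://" prefix: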

allowed_domains = ["www.walgreens.com"]
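
As for the indentation error itself: in the posted file, rules starts at column 0, which closes the class body, so the indented def parse_item on the next line is reported as an unexpected indent. The self.log line also has to sit one level inside the method. Below is a minimal sketch of the corrected layout, assuming the rest of the spider stays as posted. The BROKEN and FIXED strings and the indentation_problem helper are illustrative names only (not part of Scrapy); the helper uses the built-in compile() to check syntax without importing or running anything, and the method body is shortened to keep the example small.

```python
# Layout as posted (trimmed): `rules` at column 0 ends the class body, so the
# indented `def` after it is an "unexpected indent".
BROKEN = """class eypSpider(CrawlSpider):
    name = "dmoz"
rules = ()  # Rule(...) omitted for brevity
    def parse_item(self, response):
    self.log('Hi, this is an item page! %s' % response.url)
        return []
"""

# Corrected layout: `rules` and `def` share the class-body indent, and every
# line of the method body is indented one level further than `def`.
FIXED = r"""class eypSpider(CrawlSpider):
    name = "dmoz"
    allowed_domains = ["www.walgreens.com"]
    start_urls = ["http://www.walgreens.com/search/results.jsp?Ntt=allergy%20medicine"]
    rules = (
        Rule(SgmlLinkExtractor(allow=(r'/search/results\.jsp',)),
             callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        self.log('Hi, this is an item page! %s' % response.url)
        return []  # item extraction from the question goes here, unchanged
"""


def indentation_problem(source):
    """Return the IndentationError message for `source`, or None if it parses."""
    try:
        # Syntax check only: names such as CrawlSpider are never looked up.
        compile(source, "<spider>", "exec")
    except IndentationError as exc:
        return exc.msg
    return None


if __name__ == "__main__":
    print("posted layout:", indentation_problem(BROKEN))
    print("fixed layout: ", indentation_problem(FIXED))
```

Running python -m py_compile dmoz_spider.py performs the same syntax check on the real file. Mixed tabs and spaces can trigger the same kind of error, so it is worth making sure the editor indents with spaces only.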

Upvotes: 1
