ans2human

Reputation: 2357

Local variable 'links' referenced before assignment

The Scrapy parsing method I have implemented gives me the error:

UnboundLocalError: local variable 'links' referenced before assignment

Being fairly new to Scrapy, I have very little idea what I'm doing wrong here. I want to leave room for new websites that I will add in the future, but how do I apply the XPath specific to each website so that it scrapes that site's inner links?

The XPaths are correct and work in the Scrapy shell.

# -*- coding: utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from rvcalinkscrapper.items import RvcalinkscrapperItem
from scrapy.loader import ItemLoader
from scrapy.loader.processors import TakeFirst, MapCompose
from w3lib.html import remove_tags

class MultifileoutSpider(scrapy.Spider):
    name = 'multifileout'
    allowed_domains = []
    start_urls = []
    read_urls = open('../urls.txt', 'r')
    for url in read_urls.readlines():
        url = url.strip()
        allowed_domains = allowed_domains + [url[4:]]
        start_urls = start_urls + ['http://' + url]
    read_urls.close()

    def parse(self, response):

        item_loader = ItemLoader(item=RvcalinkscrapperItem(), response=response)
        item_loader.default_input_processor = MapCompose(remove_tags)
        item_loader.default_output_processor = TakeFirst()

        shop = response.xpath("shop")
        if shop == "shop0":
            links = '//li[@class="mobile-nav__item"]/a/@href'
        elif shop == "shop1":
            links = '//ul[@class="level2 unstyled"]/li/a/@href'

        item_loader.add_xpath("links", links)

        item_loader.add_value("shop", shop)

        item_loader.add_value("url", response.url)

        return item_loader.load_item()

Upvotes: 0

Views: 355

Answers (1)

BruceWayne

Reputation: 23285

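If shop is neither "shop0" nor "shop1", neither branch runs and links is never assigned, so the later item_loader.add_xpath("links", links) call raises the UnboundLocalError. (Note that response.xpath("shop") returns a SelectorList rather than a string, so as written neither comparison can be true.) You could just do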

...
links = '//li[@class="mobile-nav__item"]/a/@href'
shop = response.xpath("shop")
if shop == "shop1":
    links = '//ul[@class="level2 unstyled"]/li/a/@href'
...

This assumes there's a "default" path you want to use (I used shop0's). Obviously you can switch this up, or add an elif to it.
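
To leave room for more sites, as the question asks, one option is a lookup table instead of a growing if/elif chain. This is only a minimal sketch, assuming each site can be identified by its hostname. The hostnames, the names LINK_XPATHS, DEFAULT_LINK_XPATH and link_xpath_for, and the "//a/@href" fallback are all made up for illustration; only the two XPaths come from the question.

```python
from urllib.parse import urlparse

# Hypothetical mapping from site hostname to the XPath of its inner links.
# The hostnames are placeholders; the XPaths are the two from the question.
LINK_XPATHS = {
    "shop0.example.com": '//li[@class="mobile-nav__item"]/a/@href',
    "shop1.example.com": '//ul[@class="level2 unstyled"]/li/a/@href',
}

# Assumed fallback for sites that have no entry yet.
DEFAULT_LINK_XPATH = "//a/@href"


def link_xpath_for(url):
    """Return the XPath registered for the URL's hostname, or the default."""
    host = urlparse(url).hostname or ""
    return LINK_XPATHS.get(host, DEFAULT_LINK_XPATH)
```

Inside parse, links = link_xpath_for(response.url) would then always be assigned before item_loader.add_xpath("links", links) runs, and supporting a new site would mean adding one dictionary entry.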

Upvotes: 1
