Display name

Reputation: 177

How to add start_url as an item?

I am new to Python and Scrapy. I want item['Source_Website'] to hold the URL of the page being crawled. How can I achieve this?

I tried item['Source_Website'] = selector.url and item['Source_Website'] = start_urls, but neither worked.

from scrapy.selector import Selector
from scrapy.spider import BaseSpider
from shikari.items import ShikariItem

class Radiate(BaseSpider):
  name = "sss"
  download_delay = 3
  concurrent_requests = 1
  allowed_domains = ["website.com"]
  start_urls = ['http://www.website.com/1',
                'http://www.website.com/2']

  def parse(self, response):
    sel = Selector(response)
    item = ShikariItem()
    item['Heading'] = str(sel.xpath('//h1/text()').extract())
    item['Source_Website'] =   # <-- what should be assigned here?
    return item

Upvotes: 1

Views: 32

Answers (1)

gtlambert

Reputation: 11961

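Use response.url. Inside parse, response is the page Scrapy has just downloaded, so response.url is the URL that page came from: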

item['Source_Website'] = response.url

Upvotes: 1
