Reputation: 177
I am new to Python and Scrapy. I want item['Source_Website'] to be the URL of the page I am crawling. How can I achieve this?
I tried item['Source_Website'] = selector.ulr and item['Source_Website'] = start_urls, but no luck.
from scrapy.selector import Selector
from scrapy.spider import BaseSpider
from shikari.items import ShikariItem

class Radiate(BaseSpider):
    name = "sss"
    download_delay = 3
    concurrent_requests = 1
    allowed_domains = ["website.com"]
    start_urls = ['http://www.website.com/1',
                  'http://www.website.com/2']

    def parse(self, response):
        sel = Selector(response)
        item = ShikariItem()
        item['Heading'] = str(sel.xpath('//h1/text()').extract())
        item['Source_Website'] =
        return item
Upvotes: 1
Views: 32
Reputation: 11961
Use response.url as follows:
    item['Source_Website'] = response.url
Upvotes: 1