Reputation: 31
I am hoping you can help me with a scraping script. I have confirmed in Chrome that the XPath is correct.
I am using an XPath selector in the script:
import scrapy


class SmSpider(scrapy.Spider):
    name = 'sm'

    def start_requests(self):
        urls = []
        for i in range(0, 10):
            urls.append('http://www.example.com/sm.php?a=view&recid=' + str(i))
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        companyname = response.selector.xpath('//table[3]/tbody/tr[1]/td[2]').get()
        print(companyname)
But when I try to output the scraped company name, I get None. I am not sure why this is the case. Could it be because of the .php URL? I would appreciate any workaround.
Upvotes: 1
Views: 717
Reputation: 1150
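The range in your code starts from zero, which is fine if that is intentional. Next, you can use response.xpath('//table[3]/tbody/tr[1]/td[2]').extract(), which returns a list of all matches (an empty list if nothing matches) rather than a single value or None.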
import scrapy


class SmSpider(scrapy.Spider):
    name = 'sm'

    def start_requests(self):
        urls = []
        for i in range(1, 11):
            urls.append('http://www.example.com/sm.php?a=view&recid=' + str(i))
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        companyname = response.xpath('//table[3]/tbody/tr[1]/td[2]').extract()
        print(companyname)
If this does not work, please provide the URL of the page you wish to scrape so I can give a better answer.
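
One other thing that may be worth checking, although it cannot be confirmed without the real page: Chrome adds tbody elements to tables in its DOM even when the HTML sent by the server does not contain them, so an XPath copied from Chrome's inspector can fail to match in Scrapy, which works on the raw response. The sketch below reproduces the effect with the standard library's xml.etree.ElementTree standing in for Scrapy's selector. The HTML snippet and the "Acme Ltd" value are made up for illustration and are not taken from the page in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw HTML as a server might send it: note there are no <tbody> tags.
RAW_HTML = """<html><body>
<table><tr><td>menu</td></tr></table>
<table><tr><td>banner</td></tr></table>
<table>
  <tr><td>Company</td><td>Acme Ltd</td></tr>
</table>
</body></html>"""

root = ET.fromstring(RAW_HTML)

# XPath as copied from Chrome, including the tbody step the browser inserted.
with_tbody = root.find(".//table[3]/tbody/tr[1]/td[2]")

# The same path with the tbody step removed.
without_tbody = root.find(".//table[3]/tr[1]/td[2]")

print(with_tbody)          # None: the raw markup has no tbody element
print(without_tbody.text)  # Acme Ltd
```

If this turns out to be the cause, a path such as response.xpath('//table[3]//tr[1]/td[2]/text()').get() in the spider should match whether or not the raw HTML contains tbody. Opening the URL with scrapy shell and trying the XPath there shows what Scrapy actually receives.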
Upvotes: 1