Reputation: 12890
I am trying to generate a CSV file with Scrapy. It works, but not as expected. I have an HTML table with multiple rows, and I want one CSV row per table row. However, the following code collapses all the HTML rows into a single CSV row.
import scrapy

class DemoSpider(scrapy.Spider):
    name = "DemoSpider"

    def start_requests(self):
        urls = []
        for page in range(1, 2):
            url = "https://directory.easternuc.com/publicDirectory?page=%s" % page
            urls.append(url)
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        item = TutorialItem()
        item['name'] = response.selector.xpath("//tr/td/h4/text()").getall()
        item['phone'] = response.selector.xpath("//tr/td[2]/text()").getall()
        item['mobile'] = response.selector.xpath("//tr/td[3]/text()").getall()
        item['email'] = response.selector.xpath("//tr/td[4]/text()").getall()
        yield item
If I change the getall() method to get(), only the first row from the website ends up in the CSV.
Note: as a workaround, I can count the total rows on the page and then iterate over them. However, it seems this code worked in an older version of Scrapy.
Upvotes: 0
Views: 160
Reputation: 21201
You will have to iterate over each tr one by one and yield each record separately:
def parse(self, response):
    for TR in response.xpath("//table/tr"):
        item = TutorialItem()
        item['name'] = TR.xpath("./td/h4/text()").get()
        item['phone'] = TR.xpath("./td[2]/text()").get()
        item['mobile'] = TR.xpath("./td[3]/text()").get()
        item['email'] = TR.xpath("./td[4]/text()").get()
        yield item
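To see why one item per tr matters, here is a minimal standalone sketch that uses only the Python standard library. It is a stand-in, not Scrapy itself: xml.etree.ElementTree plays the role of the selector, csv.DictWriter plays the role of the CSV export, and SAMPLE_HTML, collapsed_item, per_row_items and to_csv are hypothetical names made up for this illustration. The sample table is an assumption about the page's shape, inferred from the XPaths in the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample markup, shaped like the table the question's XPaths imply.
SAMPLE_HTML = """
<table>
  <tr><td><h4>Alice</h4></td><td>111</td><td>222</td><td>a@example.com</td></tr>
  <tr><td><h4>Bob</h4></td><td>333</td><td>444</td><td>b@example.com</td></tr>
</table>
"""

FIELDS = ["name", "phone", "mobile", "email"]


def collapsed_item(table):
    """Build ONE item whose fields hold whole columns (the question's approach)."""
    return {
        "name": [h4.text for h4 in table.findall("./tr/td/h4")],
        "phone": [td.text for td in table.findall("./tr/td[2]")],
        "mobile": [td.text for td in table.findall("./tr/td[3]")],
        "email": [td.text for td in table.findall("./tr/td[4]")],
    }


def per_row_items(table):
    """Yield one item per <tr>, using paths relative to that row (the answer's approach)."""
    for tr in table.findall("./tr"):
        yield {
            "name": tr.findtext("./td/h4"),
            "phone": tr.findtext("./td[2]"),
            "mobile": tr.findtext("./td[3]"),
            "email": tr.findtext("./td[4]"),
        }


def to_csv(items):
    """Write items as CSV text; list values are comma-joined into a single cell."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(
            {k: ",".join(v) if isinstance(v, list) else v for k, v in item.items()}
        )
    return buf.getvalue()


table = ET.fromstring(SAMPLE_HTML)
collapsed_csv = to_csv([collapsed_item(table)])  # header + a single data row
per_row_csv = to_csv(per_row_items(table))       # header + one data row per <tr>

print(collapsed_csv)
print(per_row_csv)
```

The point of the sketch is that the number of CSV rows follows the number of yielded items, not the number of matched nodes: a single item holding four lists becomes one row with four crowded cells. One caveat on the selector itself: "//table/tr" only matches rows that are direct children of table, so if the page's source wraps its rows in a tbody element, "//table//tr" may be the safer choice.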
Upvotes: 1