Gaurang Shah

Reputation: 12890

Python Scrapy grabs all the rows into one single CSV row

I am trying to generate a CSV file with Scrapy. It works, but not as expected: I have an HTML table with multiple rows, and I want the same rows in the CSV. However, the following code collapses all the HTML rows into a single CSV row.

Code:

import scrapy

# TutorialItem is the project's own Item class; its import is not shown here.


class DemoSpider(scrapy.Spider):
    name = "DemoSpider"

    def start_requests(self):
        urls = []
        for page in range(1, 2):
            url = "https://directory.easternuc.com/publicDirectory?page=%s" %page
            urls.append(url)

        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):

        item = TutorialItem()
        item['name'] = response.selector.xpath("//tr/td/h4/text()").getall()
        item['phone'] = response.selector.xpath("//tr/td[2]/text()").getall()
        item['mobile'] = response.selector.xpath("//tr/td[3]/text()").getall()
        item['email'] = response.selector.xpath("//tr/td[4]/text()").getall()
        yield item

If I change getall() to get(), only the first row from the website ends up in the CSV.

Note: as a workaround, I can find the total number of rows on the page and then iterate over them. However, it seems this worked in older versions of Scrapy.

Upvotes: 0

Views: 160

Answers (1)

Umair Ayub

Reputation: 21201

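You will have to iterate over each tr one by one and yield each record separately. With page-wide XPaths, getall() returns every match on the page as a list, so a single item ends up holding all the rows and is exported as one CSV row.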

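def parse(self, response):
    # Select each table row, then query its cells relative to that row ("./").
    for row in response.xpath("//table/tr"):
        item = TutorialItem()
        # get() returns the first match, or None if the row lacks that cell.
        item['name'] = row.xpath("./td/h4/text()").get()
        item['phone'] = row.xpath("./td[2]/text()").get()
        item['mobile'] = row.xpath("./td[3]/text()").get()
        item['email'] = row.xpath("./td[4]/text()").get()
        yield item  # one item per <tr> becomes one CSV row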
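
To see the difference between page-wide and per-row queries without running a crawl, here is a minimal offline sketch. It deliberately swaps Scrapy's selectors for the standard library's xml.etree.ElementTree (which supports a small XPath subset), and the sample markup, the names SAMPLE_HTML, page_level, row_level and to_csv are all hypothetical, made up for illustration. The real page's structure may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the directory table (not the real page).
SAMPLE_HTML = """
<table>
  <tr><td><h4>Alice</h4></td><td>111</td><td>222</td><td>alice@example.com</td></tr>
  <tr><td><h4>Bob</h4></td><td>333</td><td>444</td><td>bob@example.com</td></tr>
</table>
"""

FIELDS = ["name", "phone", "mobile", "email"]


def page_level(root):
    """Page-wide queries: one record whose fields are lists (like getall())."""
    return {
        "name": [e.text for e in root.findall(".//tr/td/h4")],
        "phone": [e.text for e in root.findall(".//tr/td[2]")],
        "mobile": [e.text for e in root.findall(".//tr/td[3]")],
        "email": [e.text for e in root.findall(".//tr/td[4]")],
    }


def row_level(root):
    """Per-row queries: one record per <tr>, each field a single value."""
    records = []
    for row in root.findall(".//tr"):
        records.append({
            "name": row.findtext("./td/h4"),
            "phone": row.findtext("./td[2]"),
            "mobile": row.findtext("./td[3]"),
            "email": row.findtext("./td[4]"),
        })
    return records


def to_csv(records):
    """Write the records to an in-memory CSV, one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


root = ET.fromstring(SAMPLE_HTML)
single = page_level(root)   # one record holding lists for the whole page
rows = row_level(root)      # one record per table row
print(to_csv(rows))
```

The page-level function returns a single record, which is why the exported file has only one data row. The row-level function returns one record per tr, which matches the loop in the answer.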

Upvotes: 1
