nevster

Reputation: 379

Scrapy csv output is on a single line per item

I am trying to scrape the names and addresses of restaurants from the http://www.just-eat.co.uk/belfast-takeaway webpage. So far, my CSV output has all the names on one line and all the addresses on one line. I am trying to get one line per name and one line per address.

Below is my spider:

import scrapy

from justeat.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["just-eat.co.uk"]
    start_urls = ["http://www.just-eat.co.uk/belfast-takeaway",]

    def parse(self, response):
        for sel in response.xpath('//*[@id="searchResults"]'):
            item = DmozItem()
            item['name'] = sel.xpath('//*[@itemprop="name"]').extract()
            item['address'] = sel.xpath('//*[@class="address"]').extract()
            yield item

and below is my item:

import scrapy

class DmozItem(scrapy.Item):
    name = scrapy.Field()
    address = scrapy.Field()

I then use

scrapy crawl dmoz -o items.csv

to run my code.

Can anyone put me on the right path with my coding?

Upvotes: 0

Views: 990

Answers (1)

BoreBoar

Reputation: 2739

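Here you go :) extract() returns a list of every match, so your spider puts the whole list of names and the whole list of addresses into a single item, and the CSV export writes that item as one row. Extract the text of both lists, pair them up with zip, and yield one item per pair: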

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import scrapy
from justeat.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["just-eat.co.uk"]
    start_urls = ["http://www.just-eat.co.uk/belfast-takeaway", ]

    def parse(self, response):
        for sel in response.xpath('//*[@id="searchResults"]'):
            # A leading "//" searches the whole document even when called on
            # sel; ".//" keeps the query inside the searchResults element.
            names = sel.xpath('.//*[@itemprop="name"]/text()').extract()
            names = [name.strip() for name in names]
            addresses = sel.xpath('.//*[@class="address"]/text()').extract()
            addresses = [address.strip() for address in addresses]
            # Pair the i-th name with the i-th address and yield one item
            # per pair, so the CSV export writes one row per pair.
            result = zip(names, addresses)
            for name, address in result:
                item = DmozItem()
                item['name'] = name
                item['address'] = address
                yield item
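
If it helps to see the pairing step on its own, here is a minimal sketch that runs without Scrapy. The restaurant names and addresses are made-up sample data standing in for the two extract() results, and rows_from_lists and to_csv are hypothetical helpers written for this illustration, not part of Scrapy. It only mimics the strip-and-zip logic from the spider and writes the result with Python's standard csv module.

```python
import csv
import io


def rows_from_lists(names, addresses):
    """Pair the i-th name with the i-th address, one dict per pair."""
    names = [name.strip() for name in names]
    addresses = [address.strip() for address in addresses]
    # zip stops at the shorter list, so an unmatched name or address is dropped.
    return [{"name": n, "address": a} for n, a in zip(names, addresses)]


def to_csv(rows):
    """Render the rows as CSV text: a header line, then one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "address"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data (assumption): stands in for the scraped lists.
sample_names = ["  Pizza Place ", "Curry House\n"]
sample_addresses = [" 1 Example Street", "2 Sample Road "]

rows = rows_from_lists(sample_names, sample_addresses)
print(to_csv(rows))
# name,address
# Pizza Place,1 Example Street
# Curry House,2 Sample Road
```

One thing to watch for: because zip stops at the shorter list, a restaurant with a missing address would shift or drop rows, so it is worth checking that both lists have the same length on the real page.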

Upvotes: 1
