The Dan

Reputation: 1690

Scrapy + Python: CSV file not exported in the correct order

I'm creating a CSV file with my spider, but the data comes out in a weird order:

My code:

import scrapy
from ..items import TutorialItem

class GoodmanSpider(scrapy.Spider):
    name = "goodmans"
    start_urls = ['http://www.goodmans.net/d/1706/brands.htm']

    def parse(self, response):
        items = TutorialItem()
        all_data = response.css('.SubDepartments')
        for data in all_data:
            category = data.css('.SubDepartments a::text').extract()
            category_url = data.css('.SubDepartments a::attr(href)').extract()
            items['category'] = category
            items['category_url'] = category_url
            yield items

My items.py file: [image]

The output I get: [image]

The output I want, more or less: [image]

Upvotes: 0

Views: 109

Answers (2)

The Dan

Reputation: 1690

This is the corrected code, based on Michael's answer. It works perfectly.

import scrapy

class GoodmanSpider(scrapy.Spider):
    name = "goodmans"
    start_urls = ['http://www.goodmans.net/d/1706/brands.htm']

    def parse(self, response):
        all_data = response.css('.SubDepartments')
        for data in all_data:
            # Each call returns a list with one entry per matching link.
            category = data.css('.SubDepartments a::text').extract()
            category_url = data.css('.SubDepartments a::attr(href)').extract()
            # Pair each name with its URL and yield one item per pair,
            # so that every pair becomes its own CSV row.
            for cat, url in zip(category, category_url):
                item = dict(category=cat, category_url=url)
                yield item

Upvotes: 0

Michael Savchenko

Reputation: 1445

You have stacked all your items into a single one. Each item should be a dict with a single value for each key, but yours holds a whole list for each key.

Try something like:

for cat, url in zip(category, category_url):
    item = dict(category=cat, category_url=url)
    yield item
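
To see the difference outside of Scrapy, here is a minimal sketch that writes both shapes of item to CSV text. It uses the standard-library csv module as a stand-in for the feed export (Scrapy's own exporter formats list values differently, but the number of rows behaves the same way). The helper name to_csv and the sample brand names and URLs are made up for illustration.

```python
import csv
import io


def to_csv(rows, fieldnames=("category", "category_url")):
    """Write dict rows to CSV text (hypothetical helper, not part of Scrapy)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data standing in for the two .extract() results.
categories = ["Brand A", "Brand B", "Brand C"]
urls = ["/d/1/a.htm", "/d/2/b.htm", "/d/3/c.htm"]

# Stacked: a single item whose values are whole lists, so only one data row.
stacked = [{"category": categories, "category_url": urls}]

# Paired: one item per (category, url) pair, so one data row per category.
paired = [dict(category=cat, category_url=url)
          for cat, url in zip(categories, urls)]

stacked_csv = to_csv(stacked)
paired_csv = to_csv(paired)

print(stacked_csv)
print(paired_csv)
```

Note that zip stops at the shorter list, so if a link had no text the names and URLs could fall out of step; iterating over each link and reading both values from it would avoid that.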

Upvotes: 1
