Reputation: 26648
I am working with Scrapy and writing the data fetched from web pages into CSV files.
My pipeline code:
import csv

def __init__(self):
    self.file_name = csv.writer(open('example.csv', 'wb'))
    self.file_name.writerow(['Title', 'Release Date', 'Director'])

def process_item(self, item, spider):
    self.file_name.writerow([item['Title'].encode('utf-8'),
                             item['Release Date'].encode('utf-8'),
                             item['Director'].encode('utf-8'),
                             ])
    return item
My output in the CSV file is:
Title,Release Date,Director
And Now For Something Completely Different,1971,Ian MacNaughton
Monty Python And The Holy Grail,1975,Terry Gilliam and Terry Jones
Monty Python's Life Of Brian,1979,Terry Jones
.....
But is it possible to write the title and its values in one aligned column, the release date and its values in the next column, and the director and its values in the next (since a CSV is just comma-separated values), so the file looks like the format below?
Title,                                       Release Date,  Director
And Now For Something Completely Different,  1971,          Ian MacNaughton
Monty Python And The Holy Grail,             1975,          Terry Gilliam and Terry Jones
Monty Python's Life Of Brian,                1979,          Terry Jones
Any help would be appreciated. Thanks in advance.
Upvotes: 0
Views: 1620
Reputation: 10923
Update -- code refactored in order to:
- use a generator function, as suggested by @madjar, and
- fit more closely to the code snippet provided by the OP.
I am trying an alternative using texttable. It produces output identical to that in the question. This output may be written to a CSV file, although the records would need massaging for the appropriate CSV dialect, and I cannot find a way to keep using csv.writer and still get the padded spaces in each field.
Title,                                       Release Date,  Director
And Now For Something Completely Different,  1971,          Ian MacNaughton
Monty Python And The Holy Grail,             1975,          Terry Gilliam and Terry Jones
Monty Python's Life Of Brian,                1979,          Terry Jones
Here is a sketch of the code you would need to produce the result above:
from texttable import Texttable

# ----------------------------------------------------------------
# Imagine data to be generated by Scrapy, for each record:
# a dictionary of three items. The first set of functions
# generates the data for use in the texttable function.

def process_item(item):
    # This massages each record in preparation for writing to csv
    item['Title'] = item['Title'].encode('utf-8') + ','
    item['Release Date'] = item['Release Date'].encode('utf-8') + ','
    item['Director'] = item['Director'].encode('utf-8')
    return item

def initialise_dataset():
    data = [{'Title': 'Title',
             'Release Date': 'Release Date',
             'Director': 'Director'
             },  # first item holds the table header
            {'Title': 'And Now For Something Completely Different',
             'Release Date': '1971',
             'Director': 'Ian MacNaughton'
             },
            {'Title': 'Monty Python And The Holy Grail',
             'Release Date': '1975',
             'Director': 'Terry Gilliam and Terry Jones'
             },
            {'Title': "Monty Python's Life Of Brian",
             'Release Date': '1979',
             'Director': 'Terry Jones'
             }
            ]
    data = [process_item(item) for item in data]
    return data

def records(data):
    for item in data:
        yield [item['Title'], item['Release Date'], item['Director']]

# this ends the data simulation part
# --------------------------------------------------------

def create_table(data):
    # Create the table
    table = Texttable(max_width=0)
    table.set_deco(Texttable.HEADER)
    table.set_cols_align(["l", "c", "c"])
    table.add_rows(records(data))
    # split, remove the underlining below the header
    # and pull together again. Many ways of cleaning this...
    tt = table.draw().split('\n')
    del tt[1]  # remove the line under the header
    tt = '\n'.join(tt)
    return tt

if __name__ == '__main__':
    data = initialise_dataset()
    table = create_table(data)
    print table
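As a usage note: because each drawn line already carries its own commas and padding, the table string can be written to the file with a plain write() call; csv.writer would re-quote the padded fields. A minimal sketch, with the table string written out by hand and io.StringIO standing in for open('example.csv', 'w'):

```python
import io

# Hypothetical stand-in for the string returned by create_table() above.
table_text = ("Title,                         Release Date,  Director\n"
              "Monty Python's Life Of Brian,  1979,          Terry Jones")

# A plain file-like write preserves the padding exactly as drawn.
out = io.StringIO()
out.write(table_text + '\n')
```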
Upvotes: 1
Reputation: 12951
TSV (tab-separated values) might get you what you want, but it often turns ugly when the lines have very different lengths.
You can easily write a bit of code to produce such a table. The trick is that you need to have all the rows before outputting, in order to compute the width of each column.
You can find lots of snippets for that on the internet; here is one I have used before.
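A minimal sketch of that idea (my own illustration, not the linked snippet): collect all the rows first, take each column's maximum width, then pad every field with str.ljust:

```python
# Toy dataset matching the question; in real use these rows would be
# collected from the Scrapy items before any output is written.
rows = [
    ['Title', 'Release Date', 'Director'],
    ['And Now For Something Completely Different', '1971', 'Ian MacNaughton'],
    ['Monty Python And The Holy Grail', '1975', 'Terry Gilliam and Terry Jones'],
    ["Monty Python's Life Of Brian", '1979', 'Terry Jones'],
]

# The width of each column is the length of the longest value it contains.
widths = [max(len(row[col]) for row in rows) for col in range(len(rows[0]))]

# Pad every field to its column width; rstrip drops trailing blanks.
lines = [', '.join(field.ljust(width) for field, width in zip(row, widths)).rstrip()
         for row in rows]
```

Joining `lines` with newlines gives a table whose commas line up in every row.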
Upvotes: 1