jcuwaz

Reputation: 187

Unicode encoding error in csv python

I keep getting this error when writing parsed content to a CSV file in Python 2.7: UnicodeEncodeError: 'ascii' codec can't encode characters in position 570-579: ordinal not in range(128)

After some research I found the examples in the docs as well as a similar question on SO, "Read and Write CSV files including unicode with Python 2.7", but I couldn't get mine to work with the following code:

    data = {
        'scrapeUrl': url,
        'model': final_model_num,
        'title': final_name, 
        'description': final_description, 
        'price': str(final_price), 
        'image': final_first_image, 
        'additional_image': final_images,
        'quantity': '1', 
        'subtract': '1', 
        'minimum': '1', 
        'status': '1', 
        'shipping': '1' 
    } 
    with open("local/file1.csv", "w") as f:
        writer=csv.writer(f, delimiter=",")
        writer.writerows([data.keys()])
        for row in zip(*data.values()):
            row=[s.encode('utf-8') for s in row]
            writer.writerows([row])

My version seems to write only the first character of each variable to each row. As a bit of troubleshooting I tried removing the zip(*...) unpacking, but then all of the data was written correctly into one column of the CSV rather than one row.

Upvotes: 1

Views: 7067

Answers (1)

metatoaster

Reputation: 18898

What you have is a single set of key-value pairs, so there is only one row of values. Calling zip on those values decomposes them into individual character 'rows':

    >>> zip(*['abc', 'def', 'ghi'])
    [('a', 'd', 'g'), ('b', 'e', 'h'), ('c', 'f', 'i')]

Furthermore, the shortest value in your example has a len of 1, which would then explain why you got a single row with just the first character of all the values as your output.
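To see this with values closer to yours, here is a quick sketch. The strings are made up and only stand in for `data.values()`:

```python
# Hypothetical values standing in for data.values(); the last one has length 1.
values = ['http://example.com/item', 'AB-123', '1']

# zip(*values) pairs up the characters of each string and stops at the
# shortest one, so only a single "row" of first characters comes out.
rows = list(zip(*values))  # list() so this also runs on Python 3
print(rows)  # [('h', 'A', '1')]
```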

What you want is something like:

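    import csv

    # 'wb' because Python 2's csv module expects a binary-mode file
    with open("local/file1.csv", "wb") as f:
        writer = csv.writer(f, delimiter=",")
        writer.writerow(data.keys())  # header row
        # a single data row, each unicode value encoded to a UTF-8 byte string
        writer.writerow([s.encode('utf8') for s in data.values()])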

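Alternatively, use codecs.open with an encoding so you don't have to manually encode unicode into str. Be aware that the Python 2 csv module does not support unicode input, so non-ASCII unicode values can still raise UnicodeEncodeError inside writerow; if that happens, stick with the explicit encode above.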

    import codecs
    import csv

    with codecs.open("local/file1.csv", "w", encoding='utf8') as f:
        writer = csv.writer(f, delimiter=",")
        writer.writerow(data.keys())
        writer.writerow(data.values())
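If the values are a mix of unicode, byte strings and numbers, it may help to normalise every cell before it reaches the writer. A rough sketch, not a definitive implementation: `to_cell` and `write_row` are names I made up, and the Python 3 branch is an assumption that goes beyond the 2.7 setup in the question.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def to_cell(value):
    """Hypothetical helper: return something the running csv module can write."""
    if PY2 and isinstance(value, unicode):  # noqa: F821 (name exists on Python 2 only)
        # Python 2's csv module wants byte strings, so encode by hand.
        return value.encode("utf-8")
    # Byte strings on Python 2, and everything on Python 3, go through str().
    return str(value)


def write_row(path, data):
    """Hypothetical helper: write the dict as one header row and one data row."""
    keys = list(data)
    if PY2:
        f = open(path, "wb")  # binary mode for the Python 2 csv module
    else:
        # Python 3: text mode, and the file object does the encoding.
        f = io.open(path, "w", encoding="utf-8", newline="")
    with f:
        writer = csv.writer(f, delimiter=",")
        writer.writerow([to_cell(k) for k in keys])
        writer.writerow([to_cell(data[k]) for k in keys])
```

Called as `write_row("local/file1.csv", data)`, this writes the keys as the header and the values as a single row, which is the layout the question was after.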

Upvotes: 3
