zulfi123786
zulfi123786

Reputation: 175

Error in writing unicode to file python

This is my first program in python and need some help in writing utf-8 data to file.

The intention is to read data from excel file and write comma seperated data to text file and below is the code I am running which is giving the below pasted error.

import xlrd
import csv
import codecs


wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
sh = wb.sheet_by_index(1)

file_output = codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8')

for rownum in xrange(sh.nrows):
    file_output.write(sh.row_values(rownum))
file_output.close()

and here is the error

Traceback (most recent call last):
  File "TestXls2Csv.py", line 20, in <module>
    file_output.write(sh.row_values(rownum))
  File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 686, in write
    return self.writer.write(data)
  File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
TypeError: coercing to Unicode: need string or buffer, list found

Any help is highly appreciated.

Thanks Zulfi

Tried the below

            row_values = [str(val) for val in sh.row_values(rownum)]
            file_output.write(",".join(row_values) + "\n")

It seems to work fine for one sheet of the excel but is giving the below error for the other sheet

Traceback (most recent call last): File "TestXls2Csv.py", line 12, in file_output.write(",".join(sh.row_values(rownum)) + "\n") TypeError: sequence item 8: expected string or Unicode, float foundI had initially tried using csv.writer but there is a \xa0 character in one of the cells which was causing a lot of trouble hence installed codecs and battling to get it to work.

Below is info on the excel document if that gives any insight

=== File: CustList.xls ===
Open took 3.03 seconds

BIFF version: 8; datemode: 0 codepage: 1200 (encoding: utf_16_le); countries: (1, 1) Last saved by: u'Rajesh, Vatha' Number of data sheets: 2 Use mmap: 1; Formatting: 0; On demand: 0 Ragged rows: 0 Load time: 0.01 seconds (stage 1) 1.86 seconds (stage 2)

sheet 0: name = u'MEMBER'; nrows = 29966; ncols = 11

sheet 1: name = u'PHYSICANS'; nrows = 1619; ncols = 19

command took 0.20 secondsPlease suggest.

Thanks Zulfi

Upvotes: 0

Views: 915

Answers (2)

miindlek
miindlek

Reputation: 3563

If you want the values to be comma-separated in your output file, you could simply change your write command to join the list of values to a comma-separated string.

But at first you have to convert every value in the list to a string, because row_values() is returning a list oft string and floating values.

...
row_values = [str(val) for val in sh.row_values(rownum)]
file_output.write(",".join(row_values) + "\n")
...

Upvotes: 1

Chan
Chan

Reputation: 11

dciriello was right, because file_output.write should take string as its arguments, but sh.row_values(rownum) return a list, that's the main reason.

here is what to do, if you want to copy a file from xls to csv.

import xlrd
import csv
import codecs

wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')

table = wb.sheet_by_index(1)
nrows = table.nrows

with codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8') as file_output:
    spamwriter = csv.writer(file_output)
    for i in range(nrows):
        spamwriter.writerow(table.row_values(i))

Upvotes: 1

Related Questions