Reputation: 175
This is my first program in Python and I need some help writing UTF-8 data to a file.
The intention is to read data from an Excel file and write comma-separated data to a text file. Below is the code I am running, which gives the error pasted below.
import xlrd
import csv
import codecs
wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
sh = wb.sheet_by_index(1)
file_output = codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8')
for rownum in xrange(sh.nrows):
    file_output.write(sh.row_values(rownum))
file_output.close()
And here is the error:
Traceback (most recent call last):
  File "TestXls2Csv.py", line 20, in <module>
    file_output.write(sh.row_values(rownum))
  File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 686, in write
    return self.writer.write(data)
  File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
TypeError: coercing to Unicode: need string or buffer, list found
Any help is highly appreciated.
Thanks Zulfi
I tried the following:
row_values = [str(val) for val in sh.row_values(rownum)]
file_output.write(",".join(row_values) + "\n")
It seems to work fine for one sheet of the Excel file but gives the error below for the other sheet:
Traceback (most recent call last):
  File "TestXls2Csv.py", line 12, in <module>
    file_output.write(",".join(sh.row_values(rownum)) + "\n")
TypeError: sequence item 8: expected string or Unicode, float found
I had initially tried using csv.writer, but there is a \xa0 character in one of the cells which was causing a lot of trouble, so I switched to codecs and have been battling to get it to work.
Below is info on the Excel document, in case it gives any insight:
=== File: CustList.xls ===
Open took 3.03 seconds
BIFF version: 8; datemode: 0
codepage: 1200 (encoding: utf_16_le); countries: (1, 1)
Last saved by: u'Rajesh, Vatha'
Number of data sheets: 2
Use mmap: 1; Formatting: 0; On demand: 0
Ragged rows: 0
Load time: 0.01 seconds (stage 1) 1.86 seconds (stage 2)
sheet 0: name = u'MEMBER'; nrows = 29966; ncols = 11
sheet 1: name = u'PHYSICANS'; nrows = 1619; ncols = 19
command took 0.20 seconds
Please suggest.
Thanks Zulfi
Upvotes: 0
Views: 915
Reputation: 3563
If you want the values to be comma-separated in your output file, you can simply change your write call to join the list of values into a comma-separated string.
But first you have to convert every value in the list to a string, because row_values() returns a list of string and floating-point values.
...
row_values = [str(val) for val in sh.row_values(rownum)]
file_output.write(",".join(row_values) + "\n")
...
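One caveat with str(): on Python 2 (the traceback in the question shows python2.6), calling str() on a unicode cell that contains a non-ASCII character such as \xa0 raises UnicodeEncodeError. Below is a minimal sketch of a conversion that avoids this, assuming the cells are text or numbers. The names cell_to_text, format_row and sample_row are hypothetical and used only for illustration; they are not part of xlrd.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


def cell_to_text(val):
    """Return a cell value as text without forcing it through ASCII."""
    if isinstance(val, text_type):
        return val
    if isinstance(val, bytes):
        # Assumption: any byte strings are UTF-8 encoded
        return val.decode("utf-8")
    return text_type(val)  # floats, ints, etc.


def format_row(values):
    """Join the cells of one row into a comma-separated text line."""
    return u",".join(cell_to_text(v) for v in values)


# Hypothetical row standing in for sh.row_values(rownum)
sample_row = [u"Smith\xa0John", 42.0, u""]
line = format_row(sample_row)
```

Each formatted line could then be passed to file_output.write(line + u"\n"). Note that a plain join does not quote cells that themselves contain commas; the csv module takes care of that.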
Upvotes: 1
Reputation: 11
dciriello was right: file_output.write expects a string as its argument, but sh.row_values(rownum) returns a list. That is the main cause of the error.
Here is what to do if you want to copy the data from an .xls file to a .csv file:
import xlrd
import csv
import codecs
wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
table = wb.sheet_by_index(1)
nrows = table.nrows
with codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8') as file_output:
    spamwriter = csv.writer(file_output)
    for i in range(nrows):
        spamwriter.writerow(table.row_values(i))
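A caveat, offered as a hedged sketch rather than a definitive fix: Python 2's csv module does not support Unicode input, so a cell containing \xa0 may still fail with the code above. One common workaround is to encode text cells to UTF-8 and open the file in binary mode on Python 2, while Python 3 can write text directly. The helper names (open_csv, prepare, write_rows), the sample rows and the temporary file are all assumptions made for illustration.

```python
import csv
import io
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2
text_type = type(u"")  # unicode on Python 2, str on Python 3


def open_csv(path):
    """Open path for csv.writer in the mode each Python version expects."""
    if PY2:
        return open(path, "wb")  # Python 2 csv writes byte strings
    return io.open(path, "w", encoding="utf-8", newline="")


def prepare(val):
    """Encode text cells to UTF-8 bytes on Python 2; leave others untouched."""
    if PY2 and isinstance(val, text_type):
        return val.encode("utf-8")
    return val


def write_rows(path, rows):
    """Write an iterable of rows (lists of cell values) as UTF-8 CSV."""
    with open_csv(path) as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow([prepare(v) for v in row])


# Hypothetical rows standing in for table.row_values(i)
sample_rows = [[u"Doe, Jane", 1.0, u"a\xa0b"]]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
write_rows(path, sample_rows)

# Read the file back as UTF-8 text to check what was written
with io.open(path, "r", encoding="utf-8", newline="") as f:
    content = f.read()
os.remove(path)
```

With xlrd, the rows argument could be something like (table.row_values(i) for i in range(table.nrows)). The cell containing a comma comes out quoted, which a manual ",".join would not do.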
Upvotes: 1