Reputation: 43
I'm trying to read a bunch of mixed data (strings & numbers) from a csv & write everything to an excel file. unfortunately, everything is written as text since in csv text is the only format. how can i write the numerical data in the correct format? this is the code i have so far...
import xlwt, csv
wb = xlwt.Workbook()
ws = wb.add_sheet('testSheet')
sourceCSV = csv.reader(open('sourceCSVfile.csv', 'rb'), delimiter=",")
for rowi, row in enumerate(sourceCSV):
for coli, value in enumerate(row):
ws.write(rowi, coli, value)
wb.save('TEST.xls')
Upvotes: 4
Views: 3967
Reputation: 82992
Somehow, depending on your data, you need to be able to determine what data-type is in each field.
If your data has the same data-type in each column, you can do something like this:
# 5 columns: text, integer, float, float, date in YYYY-MM-DD format
import datetime
def date_conv(s):
return datetime.datetime.strptime(s, "%Y-%m-%d")
converters = (str.strip, int, float, float, date_conv)
# use int if you want to check that it is an int.
# xlwt will convert the int to float anyway.
...
for rowi, row in enumerate(sourceCSV):
for coli, value in enumerate(row):
ws.write(rowi, coli, converters[coli](value))
Other possibilities:
(1) the suck-it-and-see approach:
def float_if_possible(strg):
try:
return float(strg)
except ValueError:
return strg
...
ws.write(rowi, coli, float_if_possible(value))
(2) the analytical approach:
You need to write carefully nitpicky regexes to analyse your text, and you need to apply them in the appropriate order.
For a float, consider this:
float_const_pattern = r"""
[+-]? # optional sign
(?:
(?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
|
(?: \d+ \. ) # 1. 12. 123. etc
|
(?: \d+ ) # 1 12 123 etc
)
# followed by optional exponent part
(?: [Ee] [+-]? \d+ ) ?
# followed by end of string
\Z # don't use $
"""
along with the flag re.VERBOSE
. Note in particular the "end of string" check. If you don't do that, given input 123qwerty
, the regex will match 123
and the float("123qwerty")
call will raise an exception.
Upvotes: 5