Brobot
Brobot

Reputation: 43

writing mixed data from a .csv file to an .xls with xlwt - how to format numbers?

I'm trying to read a bunch of mixed data (strings & numbers) from a csv & write everything to an excel file. unfortunately, everything is written as text since in csv text is the only format. how can i write the numerical data in the correct format? this is the code i have so far...

import xlwt, csv

wb = xlwt.Workbook()
ws = wb.add_sheet('testSheet')

sourceCSV = csv.reader(open('sourceCSVfile.csv', 'rb'), delimiter=",")

for rowi, row in enumerate(sourceCSV):
  for coli, value in enumerate(row):
      ws.write(rowi, coli, value)
wb.save('TEST.xls') 

Upvotes: 4

Views: 3967

Answers (1)

John Machin
John Machin

Reputation: 82992

Somehow, depending on your data, you need to be able to determine what data-type is in each field.

If your data has the same data-type in each column, you can do something like this:

# 5 columns: text, integer, float, float, date in YYYY-MM-DD format
import datetime
def date_conv(s):
    return datetime.datetime.strptime(s, "%Y-%m-%d")
converters = (str.strip, int, float, float, date_conv)
# use int if you want to check that it is an int.
# xlwt will convert the int to float anyway.
...
for rowi, row in enumerate(sourceCSV):
  for coli, value in enumerate(row):
      ws.write(rowi, coli, converters[coli](value))

Other possibilities:

(1) the suck-it-and-see approach:

def float_if_possible(strg):
    try:
        return float(strg)
    except ValueError:
        return strg
...
ws.write(rowi, coli, float_if_possible(value))

(2) the analytical approach:

You need to write carefully nitpicky regexes to analyse your text, and you need to apply them in the appropriate order.

For a float, consider this:

float_const_pattern = r"""
    [+-]? # optional sign
    (?:
        (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
        |
        (?: \d+ \. ) # 1. 12. 123. etc
        |
        (?: \d+ ) # 1 12 123 etc
    )
    # followed by optional exponent part
    (?: [Ee] [+-]? \d+ ) ?
    # followed by end of string
    \Z # don't use $
    """

along with the flag re.VERBOSE. Note in particular the "end of string" check. If you don't do that, given input 123qwerty, the regex will match 123 and the float("123qwerty") call will raise an exception.

Upvotes: 5

Related Questions