Geroge
Geroge

Reputation: 561

Conversion text to number in python

input is a text table - some columns are numbers and some text. I have python script to convert this text file to xlsx. After open in Excel or Libre all fields are text. Is there any option to convert some columns to number?

cat script.py:

import csv
import sys
from xlsxwriter.workbook import Workbook

# Add some command-line logic to read the file names.
tsv_file = sys.argv[1]
xlsx_file = sys.argv[2]

# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)
worksheet = workbook.add_worksheet()

# Create a TSV file reader.
tsv_reader = csv.reader(open(tsv_file, 'rb'), delimiter='\t')

# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
    worksheet.write_row(row, 0, data)

# Close the XLSX file.
workbook.close()

run script:

python script.py in.txt out.xlsx

I would like to change script to convert some columns to text and some to numbers and keep xlsx format. Any idea how to do that?

Upvotes: 1

Views: 11422

Answers (2)

jmcnamara
jmcnamara

Reputation: 41644

You can use the strings_to_numbers XlsxWriter constructor option. From the docs:

strings_to_numbers: Enable the worksheet.write() method to convert strings to numbers, where possible, using float() in order to avoid an Excel warning about “Numbers Stored as Text”. The default is False. To enable this option use:

workbook = xlsxwriter.Workbook(filename, {'strings_to_numbers': True})

Example:

import xlsxwriter

workbook = xlsxwriter.Workbook('test.xlsx', {'strings_to_numbers': True})
worksheet = workbook.add_worksheet()

worksheet.write(0, 0, 'Hello')
worksheet.write(1, 0, '123') # Write this string as a number.

workbook.close()

Output:

enter image description here

Upvotes: 2

Laurent LAPORTE
Laurent LAPORTE

Reputation: 22992

According to the documentation, you need to add a numeric format, and use it when you write cells which need numeric format.

# Add a number format for cells with money.
money = workbook.add_format({'num_format': '$#,##0'})

...

worksheet.write(row_idx, col_idx, your_value, money)

See Tutorial 2: Adding formatting to the XLSX File

Upvotes: 1

Related Questions