Reputation: 17467
How can I convert a CSV file with a : delimiter to XLS (Excel sheet) using the openpyxl module?
Upvotes: 12
Views: 63186
Reputation: 66
Building on John's suggestion, I modified my script to strip the double quotes from every raw value. This way I could verify that all raw data (strings and numbers) ended up in their respective cells. Finally, I convert the values from row 20 onward to float, because in my file all numeric data starts at row 20 and everything above it is text. The line that removes the quotes is:
cell_value = cell.replace('"', '')
Below is my script:
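import csv
from openpyxl import Workbook

# filepath1_csv / filepath1_xlsx: input and output paths, defined elsewhere.
wb = Workbook()
ws = wb.active

# Copy every CSV cell into the worksheet as text.
with open(filepath1_csv, newline='') as f:
    reader = csv.reader(f)
    for row_index, row in enumerate(reader):
        for column_index, cell in enumerate(row):
            # Remove any double quotes left in the value.
            cell_value = cell.replace('"', '')
            # openpyxl rows and columns are 1-based.
            ws.cell(row=row_index + 1, column=column_index + 1).value = cell_value

# Numeric data starts at row 20; convert columns A-E to float.
for row in ws.iter_rows(min_row=20, min_col=1, max_col=5, max_row=ws.max_row):
    for cell in row:
        if cell.value is None:
            break  # no more data in this row
        cell.value = float(cell.value)

wb.save(filename=filepath1_xlsx)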
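The script above relies on numeric data starting at row 20, and float() raises ValueError on anything that is not a number (including an empty string). If your file does not have such a clean split, one alternative is to attempt the conversion per cell and fall back to the text. This is only a sketch; to_number is a hypothetical helper name, not part of openpyxl or the csv module.

```python
def to_number(text):
    """Return text as a float when it parses as one, otherwise as cleaned text.

    Hypothetical helper: strips double quotes and surrounding whitespace first.
    """
    cleaned = text.replace('"', '').strip()
    try:
        return float(cleaned)
    except ValueError:
        # Not numeric (or empty): keep the cleaned string.
        return cleaned


# Example values as they might come out of csv.reader.
converted = [to_number(v) for v in ['3.5', '"abc"', '', ' 42 ']]
```

Inside the copy loop you would then write ws.cell(row=..., column=...).value = to_number(cell) and drop the second pass. Be aware that float() also accepts strings such as "nan" and "inf", so check for those if they can appear as text in your data.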
Upvotes: 1
Reputation: 13699
import csv
from openpyxl import Workbook
# In current openpyxl releases get_column_letter lives in openpyxl.utils.
from openpyxl.utils import get_column_letter

# Register a dialect that splits fields on ':'.
csv.register_dialect('colons', delimiter=':')

wb = Workbook()
dest_filename = r"C:\Users\Asus\Desktop\herp.xlsx"
ws = wb.worksheets[0]
ws.title = "A Snazzy Title"

with open(r'C:\Users\Asus\Desktop\herp.csv', newline='') as f:
    reader = csv.reader(f, dialect='colons')
    for row_index, row in enumerate(reader):
        for column_index, cell in enumerate(row):
            column_letter = get_column_letter(column_index + 1)
            # Address the cell by coordinate, e.g. 'A1'.
            ws['%s%s' % (column_letter, row_index + 1)] = cell

wb.save(filename=dest_filename)
Upvotes: 18
Reputation: 301
Here is Adam's solution, extended to strip out characters that openpyxl considers illegal and would otherwise raise an exception for:
import re
from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE
...
# ws.append(row) - replace with the line below, which cleans every
# value and still appends the whole row as a single worksheet row
ws.append([ILLEGAL_CHARACTERS_RE.sub('', i) for i in row])
ILLEGAL_CHARACTERS_RE is a compiled regular expression containing the characters openpyxl deems "illegal". The code is simply substituting those characters with an empty string.
Source: Bitbucket openpyxl issue #873 - Remove illegal characters instead of throwing an exception
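To illustrate what the substitution does to a row, here is a small sketch that runs without openpyxl. CONTROL_CHARS_RE and clean_row are hypothetical names, and the pattern is my assumption of roughly what ILLEGAL_CHARACTERS_RE covers (ASCII control characters other than tab, newline and carriage return); check the regex shipped with your installed openpyxl version before relying on it.

```python
import re

# Assumption: a stand-in for openpyxl's ILLEGAL_CHARACTERS_RE. It matches
# ASCII control characters except tab (\x09), newline (\x0a) and
# carriage return (\x0d).
CONTROL_CHARS_RE = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def clean_row(row):
    """Return a new list with control characters removed from each string."""
    return [CONTROL_CHARS_RE.sub('', value) for value in row]


# The vertical tab (\x0b) is dropped; the ordinary tab is kept.
cleaned = clean_row(['ok', 'bad\x0bvalue', 'tab\tkept'])
```

With openpyxl installed you would use ILLEGAL_CHARACTERS_RE itself in place of the stand-in and pass the cleaned list to ws.append.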
Upvotes: 2
Reputation: 2149
A much simpler, minimalist solution:
import csv
import openpyxl

wb = openpyxl.Workbook()
ws = wb.active

with open('file.csv') as f:
    reader = csv.reader(f, delimiter=':')
    for row in reader:
        ws.append(row)

wb.save('file.xlsx')
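If you want to see what each row looks like before it reaches ws.append, the following sketch parses a small in-memory sample with the same delimiter=':' argument. The sample text is made up for illustration. It shows that a quoted field may itself contain a colon, and that every value comes back as a string, so numbers will be stored as text unless you convert them.

```python
import csv
import io

# Made-up sample standing in for the contents of file.csv.
sample = 'name:path:size\n"report":"C:\\data\\report.txt":120\n'

# csv.reader accepts any iterable of lines, so StringIO works like a file.
rows = list(csv.reader(io.StringIO(sample), delimiter=':'))

for row in rows:
    print(row)
```

The colon inside the quoted path does not split the field, which is why csv.reader is safer here than calling line.split(':') yourself.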
Upvotes: 67