Reputation: 123
I've tried a few methods, including pandas
:
df = pd.read_excel('file.xlsx')
df.to_csv('file.csv')
But every time I convert my xlsx
file over to csv
format, I lose all data within columns that include a formula. I have a formula that concatenates values from two other cells + '@domain' to create user emails, but this entire column returns blank in the csv
product.
The formula is basically this:
=CONCATENATE(B2,".",E2,"@domain")
The conversion is part of a larger code workflow, but it won't work if this column is left blank. The only thing I've tried that worked was this API, but I'd rather not pay a subscription if this can be done locally on the machine.
Any ideas? I'll try whatever you throw at me - bear in mind I'm new to this, but I will do my best!
Upvotes: 1
Views: 2024
Reputation: 119
You can try to open the excel file with the openpyxl library in the data-only
mode. This will prevent the raw formulas - they are going to be calculated just the way you see them in excel itself.
import openpyxl
wb = openpyxl.load_workbook(filename, data_only=True)
Watch out when youre working with you original file and save it with the openpyxl-lib in the data-only
-mode all your formulas will be lost. I had this once and it was horrible. So i recommend using a copy of your file to work with.
Since you have your xlsx-file with values only you can now use the internal csv library to generate a proper csv-file (idea from this post: How to save an Excel worksheet as CSV):
import csv
sheet = wb.active # was .get_active_sheet()
with open('test.csv', 'w', newline="") as f:
c = csv.writer(f)
for r in sheet.iter_rows(): # generator; was sh.rows
c.writerow([cell.value for cell in r])
Upvotes: 1