Yohst
Yohst

Reputation: 1902

repairing Excel file

I have about 1500 excel spreadsheet files downloaded from movescount.com - these contain heart rate data taken over a few years. I want to process these files with a bit of Python + openpyxl. The problem is that opening the excel file with openpyxl fails. Specifically it complains that a styles.xml definition is missing. Also Apple’s Numbers cannot open the file. When I open the file with MS Excel (which works) and simply save the file right away, I can open and process it with python and openpyxl just fine.

Clearly, MS Excel is repairing some format issue. But I would like to avoid having to open and save the file with MS Excel - particularly because that means I need to do this on an MS Windows box. What is the best way to repair these files on the fly on my OS X machine?

Upvotes: 2

Views: 1987

Answers (1)

Steven Rumbalski
Steven Rumbalski

Reputation: 45542

An Excel xlsx file is a zipped archive of files. So you can write the missing file to the xlsx archive provided you have one good file to start with.

import zipfile

with zipfile.ZipFile('/path/to/goodfile.xlsx', 'r') as zgood:
    styles_xml = zgood.read('xl/styles.xml')

with zipfile.ZipFile('/path/to/badfile.xlsx', 'a') as zbad:
    zbad.writestr('xl/styles.xml', styles_xml)

Upvotes: 1

Related Questions