Slumdog
Slumdog

Reputation: 470

Which Python Library to use to open an existing large xlsx?

I have used openpyxl with Workbook(write_only=True) to create large Excel xlsx files. In this mode I cannot format the Excel headers. I then save the xlsx and open it again with openpyxl load_workbook(my_book) and have the ability to format the cells. I then save the file. If the file isn't too large it saves but otherwise gives a memory error. So Openfile allows me to create and save the worksheet but not necessarily re-open and save the same worksheet. In this example I just load_workbook then save without changing the xlsx to show the error

from openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.styles import Font
from openpyxl.styles import PatternFill
wb = Workbook(write_only=True)
# then append a lot of rows
logging.info('Save unformatted xlsx')
wb.save(my_book)
workbook = load_workbook(my_book)
# the worksheet always loads ok at this point, even when  700,000 rows
workbook.save(my_book)

## Immediately after load_workbook I do workbook.save(my_book). 
When the rowcount is around 8600 there is no problem. When 350,000 there is a memory error

File "src\lxml\serializer.pxi", line 1268, in lxml.etree._IncrementalFileWriter._handle_error
File "src\lxml\etree.pyx", line 316, in lxml.etree._ExceptionContext._raise_if_stored
File "src\lxml\serializer.pxi", line 650, in lxml.etree._FilelikeWriter.write
MemoryError


logging.info('Saved unformatted xlsx immediately after opening again')
workbook = load_workbook(my_book)
# If no error I do some formatting and all is well and can save ok

Python 3.4.3 openpyxl (2.5.1) lxml (4.2.1)

There are a lot of solutions for older versions but I cannot see any for openpyxl (2.5.1). Does anyone have an answer for openpyxl, or can recommend what to use to open an existing large xlsx and format cells?

Upvotes: 0

Views: 1004

Answers (1)

Slumdog
Slumdog

Reputation: 470

It wasn't of topic, it was an error even if just a memory error. Anyway in the end I just exported to CSV rather than XSLX as Excel still opens it nicely, not worth the bother.

Upvotes: 0

Related Questions