Padhu
Padhu

Reputation: 41

Reading excel file containing DataValidation with Openpyxl takes more time

While loading an MS Excel Workbook with Data Validation in read/write mode using openpyxl, the time consumed is much higher than loading a workbook without DV. The time consumed grows exponentially as the DV's are more and sometime i get MemoryError too.

Here is a sample that explains the time consumption in seconds:

  1. Without Data validation - sample_nodv.xlsx

    python -c "from time import time; from openpyxl.reader.excel import load_workbook; st = time(); load_workbook('sample_nodv.xlsx'); print(time()-st)" Output: 0.004000425338745117

  2. With Data validation - sample_dv.xlsx

    python -c "from time import time; from openpyxl.reader.excel import load_workbook; st = time(); load_workbook('sample_dv.xlsx'); print(time()-st)" Output: 4.132413148880005

I would like to use openpyxl and load the xlsx file in RW mode and at the same time reduce the time consumed. I don't want to use win32com as i want the solution to be independent of OS.

Note:

Upvotes: 1

Views: 980

Answers (1)

Charlie Clark
Charlie Clark

Reputation: 19527

This is a known issue for validators that are applied to entire columns. See https://bitbucket.org/openpyxl/openpyxl/issues/837 for further information. openpyxl version 2.5b1 contains a fix for this.

Upvotes: 1

Related Questions