Frank Schweizer
Frank Schweizer

Reputation: 31

Decompress .xls file in python

I'm searching for a method to decompress/unzip .xls files in python. By opening Excel files with 7-Zip you can see the directories I'd like to extract.

I already tried renaming the Excel to ".zip" and then extracting it

myExcelFile = zipfile.ZipFile("myExcel.zip") 
myExcelFile.extractall()

but it throws

zipfile.BadZipFile: File is not a zip file

.xls in 7-Zip

Upvotes: 1

Views: 3082

Answers (1)

Frank Schweizer
Frank Schweizer

Reputation: 31

.xls files use the BIFF format. .xlsx files use Office Open XML, which is a zipped XML format. BIFF is not a zipped format; files using that format are not recognized by zip libraries. – shmee

a conversion to .xlsx is the solution

import win32com.client as win32
fname = "full+path+to+xls_file"
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)

wb.SaveAs(fname+"x", FileFormat = 51)    #FileFormat = 51 is for .xlsx extension
wb.Close()                               #FileFormat = 56 is for .xls extension
excel.Application.Quit()

Upvotes: 2

Related Questions