Reputation: 31
I'm searching for a method to decompress/unzip .xls files in python. By opening Excel files with 7-Zip you can see the directories I'd like to extract.
I already tried renaming the Excel to ".zip" and then extracting it
myExcelFile = zipfile.ZipFile("myExcel.zip")
myExcelFile.extractall()
but it throws
zipfile.BadZipFile: File is not a zip file
Upvotes: 1
Views: 3082
Reputation: 31
.xls files use the BIFF format. .xlsx files use Office Open XML, which is a zipped XML format. BIFF is not a zipped format; files using that format are not recognized by zip libraries. – shmee
a conversion to .xlsx is the solution
import win32com.client as win32
fname = "full+path+to+xls_file"
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)
wb.SaveAs(fname+"x", FileFormat = 51) #FileFormat = 51 is for .xlsx extension
wb.Close() #FileFormat = 56 is for .xls extension
excel.Application.Quit()
Upvotes: 2