Vlad

Reputation: 113

Python zipfile: Bad magic number for central directory

OK, so basically I'm trying to write a simple, quick script in Python to search the XML inside *.fla (Flash) files. All I'm doing is opening the project's *.fla files via zipfile.ZipFile, going through all the files in each zip archive, and searching for a specific term by regex (dirty and simple). This is not the ideal solution for my problem, but it will work for now. I'm using CS6, and I know that *.fla files from CS5 and above are basically zip archives with XML (and other files) inside; I have successfully extracted those files via 7zip on Windows. But for some reason, on half the files from my project, zipfile.ZipFile throws the exception 'Bad magic number for central directory' on creation. The call stack looks like this:

  File "fla_search.py", line 92, in try_search_zip
    with zipfile.ZipFile(fla_path, compression=compression) as zip_view:
  File "C:\bwn_programs\python\lib\zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "C:\bwn_programs\python\lib\zipfile.py", line 1352, in _RealGetContents
    raise BadZipFile("Bad magic number for central directory")

I have also checked the header magic number of the faulty file (just in case), and it does seem to correspond to a real zip archive:

[Images: hex dump of the start of the faulty file]

(And yes, all contents of the file open successfully in 7zip.)

So, what could be the problem?

Upvotes: 0

Views: 5496

Answers (1)

pmqs

Reputation: 3725

Your hex dump shows the start of the file, and the first 4 bytes are indeed a valid local file header signature. The problem is that the Python code is complaining about the central directory header, which is near the end of the file.
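To look at the end of the file rather than the start, something along these lines may help. It is only a rough sketch: `inspect_central_directory` and `_peek` are names I made up for illustration, it ignores ZIP64 archives, and it only reports what it finds rather than repairing anything. It locates the end-of-central-directory record (signature `PK\x05\x06`), reads the central directory size and offset recorded there, and shows which 4 bytes sit where the first central directory header (`PK\x01\x02`) is expected.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory record signature
CDIR_SIG = b"PK\x01\x02"   # central directory file header signature
EOCD_SIZE = 22             # fixed part of the EOCD record
MAX_COMMENT = 0xFFFF       # the trailing archive comment is at most 65535 bytes


def _peek(data, pos):
    """Return the 4 bytes at pos, or b"" if pos is out of range."""
    if pos < 0 or pos + 4 > len(data):
        return b""
    return data[pos:pos + 4]


def inspect_central_directory(data):
    """Report where the central directory is claimed to be and what is there.

    Hypothetical helper for diagnosis only; ZIP64 is not handled.
    """
    # The EOCD record sits at the end, followed only by an optional comment.
    search_from = max(0, len(data) - (EOCD_SIZE + MAX_COMMENT))
    eocd_pos = data.rfind(EOCD_SIG, search_from)
    if eocd_pos < 0 or eocd_pos + EOCD_SIZE > len(data):
        return {"eocd_found": False}

    # Layout: signature, 4 x uint16, 2 x uint32, uint16 (little-endian).
    (_sig, _disk, _cd_disk, _n_here, entries,
     cd_size, cd_offset, _comment_len) = struct.unpack(
        "<4s4H2LH", data[eocd_pos:eocd_pos + EOCD_SIZE])

    return {
        "eocd_found": True,
        "eocd_pos": eocd_pos,
        "entries": entries,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        # Bytes at the offset recorded in the EOCD record.
        "sig_at_recorded_offset": _peek(data, cd_offset),
        # Bytes where the directory would start if it directly precedes the EOCD.
        "sig_before_eocd": _peek(data, eocd_pos - cd_size),
    }


# Example use (the path is a placeholder):
# with open("yourfile.fla", "rb") as f:
#     print(inspect_central_directory(f.read()))
```

If neither of the two reported signatures equals `CDIR_SIG`, the size or offset stored in the end record probably does not match where the directory really is. The same error can also come from a later directory entry, which this sketch does not check.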

Some programs that use zip as their container format do non-standard things. That means the files are not true zip files anymore. The likes of 7zip and unzip have the smarts to work around some, but not all, of these.

If you have unzip available on your Windows setup, try running unzip -t yourfile.fla to test the fla file. That may give more clues about how the file has been constructed and where it goes wrong.
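If unzip is not to hand, a loose Python-side stand-in is to try opening each file and let `zipfile` run its own CRC check via `ZipFile.testzip()`. This is a minimal sketch, not a replacement for unzip -t: `check_archive` is a hypothetical helper name, and for the files that already fail it will only repeat the same `BadZipFile` message. It can still be handy for sorting a project into files that open cleanly and files that do not.

```python
import zipfile


def check_archive(path):
    """Return a short status string for one archive (hypothetical helper)."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the name of the
            # first one with a bad CRC or header, or None if all are fine.
            first_bad = zf.testzip()
    except zipfile.BadZipFile as exc:
        return f"cannot open: {exc}"
    if first_bad is not None:
        return f"corrupt member: {first_bad}"
    return "ok"


# Example use over a project folder (the folder name is a placeholder):
# from pathlib import Path
# for fla in Path("project").rglob("*.fla"):
#     print(fla, "->", check_archive(fla))
```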

Are there any public fla files available that have this problem? That would make it easier for us to help root cause the issue.

Upvotes: 1
