Salvatore

Reputation: 12074

Remove auto-generated __MACOSX folder from inside a zip file in Python

I have zip files uploaded by clients through a web server that sometimes contain pesky __MACOSX directories inside that gum things up. How can I remove these?

I thought of using ZipFile, but this answer says that isn't possible and gives this suggestion:

Read out the rest of the archive and write it to a new zip file.

How can I do this with ZipFile? Another Python based alternative like shutil or something similar would also be fine.

Upvotes: 2

Views: 7342

Answers (2)

Life is complex

Reputation: 15629

The examples below check whether a zip file contains __MACOSX entries. If it does, a new zip archive is created and every entry that is not under __MACOSX/ is written to it. The same approach can be extended to exclude .DS_Store files.

Example One

from zipfile import ZipFile

# Copy every entry except those under __MACOSX/ into a new archive.
# The with statement closes both archives, even if an error occurs.
with ZipFile('original.zip', 'r') as original_zip, ZipFile('new_archive.zip', 'w') as new_zip:
    for item in original_zip.infolist():
        if not item.filename.startswith('__MACOSX/'):
            new_zip.writestr(item, original_zip.read(item.filename))

Example Two

from zipfile import ZipFile

def check_archive_for_bad_filename(file):
    # Return True if any entry lives under the __MACOSX/ directory, else False.
    with ZipFile(file, 'r') as zip_file:
        return any(name.startswith('__MACOSX/') for name in zip_file.namelist())

def remove_bad_filename_from_archive(original_file, temporary_file):
    # Open the new archive once and copy over every entry except the __MACOSX/ ones.
    # Mode 'w' overwrites temporary_file if it already exists.
    with ZipFile(original_file, 'r') as zip_file, ZipFile(temporary_file, 'w') as new_zip:
        for item in zip_file.infolist():
            if not item.filename.startswith('__MACOSX/'):
                new_zip.writestr(item, zip_file.read(item.filename))



archive_filename = 'old.zip'
temp_filename = 'new.zip'

results = check_archive_for_bad_filename(archive_filename)
if results:
   print('Removing MACOSX file from archive.')
   remove_bad_filename_from_archive(archive_filename, temp_filename)
else:
   print('No MACOSX file in archive.')
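
As for the .DS_Store extension mentioned above, one way it might look is sketched below. The helper names is_macos_junk and copy_without_macos_junk are made up for this illustration, and the sketch assumes .DS_Store files can appear at any depth inside the archive.

```python
import posixpath
from zipfile import ZipFile


def is_macos_junk(name):
    # True for entries under __MACOSX/ and for .DS_Store files at any depth.
    # Zip entry names use forward slashes, hence posixpath rather than os.path.
    return name.startswith('__MACOSX/') or posixpath.basename(name) == '.DS_Store'


def copy_without_macos_junk(source_path, target_path):
    # Hypothetical helper: copy every non-junk entry into a new archive.
    with ZipFile(source_path, 'r') as source, ZipFile(target_path, 'w') as target:
        for item in source.infolist():
            if not is_macos_junk(item.filename):
                # Passing the ZipInfo object keeps the entry's original metadata.
                target.writestr(item, source.read(item.filename))
```

Calling copy_without_macos_junk('old.zip', 'new.zip') would then write the cleaned archive to new.zip and leave old.zip untouched.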

Upvotes: 4

Wereii

Reputation: 341

The idea would be to use ZipFile to extract the contents into some defined folder, remove the __MACOSX directory (shutil.rmtree works here; os.rmdir only removes empty directories), and then compress the folder again.
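
A rough sketch of that extract, delete, and re-compress flow, using only the standard library. The function name rezip_without_macosx is invented for this example, and it assumes the target path ends in .zip.

```python
import os
import shutil
import tempfile
from zipfile import ZipFile


def rezip_without_macosx(source_path, target_path):
    # Hypothetical helper; target_path is assumed to end in '.zip'.
    with tempfile.TemporaryDirectory() as work_dir:
        with ZipFile(source_path, 'r') as source:
            source.extractall(work_dir)
        # ignore_errors avoids a failure when the archive had no __MACOSX folder.
        shutil.rmtree(os.path.join(work_dir, '__MACOSX'), ignore_errors=True)
        # make_archive appends '.zip' itself, so pass the name without it.
        base_name = os.path.splitext(os.path.abspath(target_path))[0]
        return shutil.make_archive(base_name, 'zip', root_dir=work_dir)
```

Compared with copying entries straight from one archive to another, this writes everything to disk first, so it is likely slower for large uploads.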

If the zip command is available on your OS, you might be able to skip the re-compressing step, since zip -d can delete entries from an existing archive. You could also run this command from Python using os.system or the subprocess module.
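
Driving the zip command from Python could look roughly like this. It assumes the Info-ZIP zip tool and its -d (delete) option; the helper names are made up for the example.

```python
import shutil
import subprocess


def build_zip_delete_command(archive_path):
    # 'zip -d' deletes matching entries from the archive in place.
    # The pattern is its own list item, so no shell quoting is involved.
    return ['zip', '-d', archive_path, '__MACOSX/*']


def delete_macosx_in_place(archive_path):
    # Hypothetical helper; returns False when the zip command is not installed.
    if shutil.which('zip') is None:
        return False
    result = subprocess.run(build_zip_delete_command(archive_path))
    # A non-zero exit status means zip reported a problem,
    # which typically includes finding nothing to delete.
    return result.returncode == 0
```

Unlike the ZipFile approaches, this modifies the uploaded file directly, so keep a copy if the original is still needed.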

Upvotes: 0
