JKnecht

Reputation: 241

Is it possible to continue when you hit a BadZipFile?

This zip file contains 130,000 images.

Is it possible to continue when you run into a BadZipFile?

I mean, ignore the bad image and move on to the next one.

import zipfile
with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
    zip_ref.extractall("/content/train/")

The error:

BadZipFile: Bad CRC-32 for file 'calvary-andrea-mantegna.jpg'

I want something like this to work.

import zipfile
from zipfile import BadZipfile
try:
    with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
      zip_ref.extractall("/content/train/")
except BadZipfile:
    continue

But I know I can't use continue in a try-except that isn't inside a loop.

Is there a way to solve this?

Upvotes: 0

Views: 854

Answers (1)

tdelaney

Reputation: 77337

You can extract the files one by one and skip the members that fail. This may not work, depending on what corrupted the file. For instance, if a block of data was lost somewhere in the middle of the archive, every extraction after that point will fail.

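import zipfile

try:
    with zipfile.ZipFile("/content/train.zip", 'r') as zip_ref:
        # Extract members one at a time so a single bad entry does not abort the run.
        for info in zip_ref.infolist():
            try:
                zip_ref.extract(info, path="/content/train/")
            except zipfile.BadZipFile as e:
                # Report the bad member and move on to the next one.
                print(f"{e} - offset {info.header_offset}")
except zipfile.BadZipFile as e:
    # Raised when the archive itself (e.g. its central directory) is unreadable.
    print(f"could not read zipfile: {e}")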
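If you also want a list of what was skipped, the same loop can be wrapped in a small helper. This is only a sketch: the name extract_skipping_bad is made up for illustration. It assumes ordinary relative member names, so that the extracted path is simply the destination directory plus the member name. It also catches zlib.error, on the assumption that corrupted deflate data can surface that way rather than as BadZipFile. Because extract() may already have written part of a bad member to disk before the CRC check fails, the helper deletes that leftover file.

```python
import zipfile
import zlib
from pathlib import Path


def extract_skipping_bad(zip_path, dest_dir):
    """Extract every readable member; return the names of those that failed.

    Hypothetical helper, not part of the zipfile module.
    """
    failed = []
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        for info in zip_ref.infolist():
            try:
                zip_ref.extract(info, path=dest_dir)
            except (zipfile.BadZipFile, zlib.error):
                failed.append(info.filename)
                # The CRC is only checked once the data has been read, so a
                # partial or corrupt output file may already exist; remove it.
                # Assumes a plain relative member name (no sanitising needed).
                leftover = Path(dest_dir) / info.filename
                if leftover.is_file():
                    leftover.unlink()
    return failed
```

Calling failed = extract_skipping_bad("/content/train.zip", "/content/train/") then gives you the names of the images that could not be extracted, which you can log or re-download separately.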

Upvotes: 1
