Exasis

Reputation: 51

Zip-file not recognized when downloaded through FTP

I am trying to create a script which will download a ZIP-file and extract it.

I am using Python 2.7 on Windows Server 2016.

I created a download script looking like this:

from ftplib import FTP

ftp = FTP()
ftp.connect("***")
ftp.login("***", "***")
ftp.cwd(ftppath)
ftp.retrbinary("RETR " + filename, open(tempfile, 'wb').write)
ftp.quit()

And a zip extraction script:

import zipfile

zip_ref = zipfile.ZipFile(tempfile, 'r')
zip_ref.extractall(localpath)
zip_ref.close()

These work independently. That is, if I run the extraction script on my test ZIP file, it extracts the file, and if I run the FTP script against my server, it downloads the file.

However, if I run the two together, i.e. download the file from my FTP server and then extract it in the same run, I get an error: "file is not a Zip file".

Does anyone know why this happens? I have checked the following:

EDIT

I have been reading about IO bytes and the like, but have had no luck implementing it.

Upvotes: 1

Views: 907

Answers (1)

Jean-François Fabre

Reputation: 140188

Probably because of this bad-practice one-liner:

ftp.retrbinary("RETR " + filename ,open(tempfile, 'wb').write)

open(tempfile, 'wb').write gives no guarantee as to when the file is closed. You don't store the handle returned by open anywhere, so you cannot decide when to close the file (and so ensure that everything has been written to disk).

So the last part of the file may still be sitting in a buffer, not yet written to disk, when you try to open the file in read mode. That is why chaining download + unzip in one run can trigger the bug, whereas two separate executions leave time for the file to be flushed and closed (at the latest when the first script exits).
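To make the buffering point concrete, here is a minimal sketch that compares a file's on-disk size before and after its handle is closed. The helper name sizes_before_and_after_close and the 10-byte payload are made up for illustration, and tempfile here is the standard-library module, not the path variable from the question. On CPython a small write like this normally stays in the buffer, so the first size is expected to be smaller than the second.

```python
import os
import tempfile


def sizes_before_and_after_close(payload=b"x" * 10):
    """Return the on-disk size of a file before and after its handle is closed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # only the path is needed; it is reopened with open() below
    try:
        f = open(path, "wb")
        f.write(payload)                # small write: held in the write buffer
        before = os.path.getsize(path)  # size on disk while the handle is open
        f.close()                       # closing flushes the buffer to disk
        after = os.path.getsize(path)   # size on disk once the handle is closed
    finally:
        os.remove(path)
    return before, after


before, after = sizes_before_and_after_close()
print(before, after)
```

A reader that opens the file between the write and the close sees only what has reached the disk so far, which for a ZIP file can mean the central directory at the end is missing.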

Better use a context manager like this:

with open(tempfile, 'wb') as f:
    ftp.retrbinary("RETR " + filename, f.write)

so the file is flushed and closed when the with block exits (of course, perform the file read operations, such as the unzip, outside this block).
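Putting the download and the extraction together, a sketch of the whole flow might look like the following. The function download_and_extract, its parameter names, and the FakeFTP class are hypothetical: FakeFTP only mimics the retrbinary method of ftplib.FTP so the example can run without a server, and in real use you would pass a connected and logged-in ftplib.FTP object instead.

```python
import io
import os
import shutil
import tempfile
import zipfile


def download_and_extract(ftp, filename, temp_path, local_path):
    """Download `filename` through an ftplib.FTP-like object, then unzip it."""
    with open(temp_path, "wb") as f:
        ftp.retrbinary("RETR " + filename, f.write)
    # The with block has exited, so the file is flushed and closed here.
    with zipfile.ZipFile(temp_path, "r") as zip_ref:
        zip_ref.extractall(local_path)
        return zip_ref.namelist()


class FakeFTP(object):
    """Hypothetical stand-in for ftplib.FTP (assumption, for the demo only)."""

    def __init__(self, payload):
        self.payload = payload

    def retrbinary(self, cmd, callback, blocksize=8192):
        # Feed the payload to the callback in chunks, as ftplib would.
        for i in range(0, len(self.payload), blocksize):
            callback(self.payload[i:i + blocksize])


def demo():
    # Build a small ZIP archive in memory to act as the remote file.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello")
    workdir = tempfile.mkdtemp()
    try:
        names = download_and_extract(
            FakeFTP(buf.getvalue()), "test.zip",
            os.path.join(workdir, "test.zip"), os.path.join(workdir, "out"))
        with open(os.path.join(workdir, "out", "hello.txt")) as fh:
            return names, fh.read()
    finally:
        shutil.rmtree(workdir)
```

The key design choice is that the ZipFile is opened only after the with block that wrote the file has ended, so the archive is complete on disk before it is read.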

Upvotes: 2
