Reputation: 51
I am trying to create a script which will download a ZIP-file and extract it.
I am using Python 2.7 on Windows Server 2016.
I created a download script looking like this:
from ftplib import FTP

ftp = FTP()
ftp.connect("***")
ftp.login("***","***")
ftp.cwd(ftppath)
ftp.retrbinary("RETR " + filename ,open(tempfile, 'wb').write)
ftp.quit()
And a zip extraction script:
import zipfile

zip_ref = zipfile.ZipFile(tempfile, 'r')
zip_ref.extractall(localpath)
zip_ref.close()
These work independently. That is, if I run the extraction script on my test ZIP file, it extracts the file, and if I run the FTP script from my server, it downloads the file.
However, if I run the scripts together, meaning I download the file from my FTP server and then extract it, I get an error: "file is not a Zip file".
Does anyone know why this happens? I have checked the following:
EDIT
I have been reading about IO bytes and the like, but without any luck implementing it.
Upvotes: 1
Views: 907
Reputation: 140188
Probably because of this bad-practice one-liner:
ftp.retrbinary("RETR " + filename ,open(tempfile, 'wb').write)
open(tempfile, 'wb').write doesn't give any guarantee as to when the file is closed. You don't store the handle returned by open anywhere, so you cannot decide when to close the file (and ensure a full write to disk).
So the last part of the file may not have been written to disk yet when you try to open it in read mode. Chaining download + unzip in a single run can trigger the bug, whereas two separate executions leave time for the file to be flushed and closed.
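To see the buffering effect in isolation, here is a small sketch with no FTP involved. The helper name size_before_and_after_close is made up for this illustration. It writes a few bytes to a file and compares the size reported on disk before and after the handle is closed. With default buffering in CPython, a small write normally stays in memory until the file is flushed or closed:

```python
import os


def size_before_and_after_close(path, data):
    """Hypothetical helper: write `data` and report the on-disk size
    before and after the file handle is closed."""
    f = open(path, 'wb')
    f.write(data)
    # The bytes are usually still in the write buffer at this point.
    before = os.path.getsize(path)
    f.close()
    # Closing flushes the buffer, so the full content is now on disk.
    after = os.path.getsize(path)
    return before, after
```

A ZIP file keeps its central directory at the end of the archive, so a missing tail is enough for zipfile to reject the file as not being a ZIP at all.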
Better use a context manager like this:
with open(tempfile, 'wb') as f:
    ftp.retrbinary("RETR " + filename ,f.write)
so the file is flushed & closed when exiting the with block (of course, perform the file read operations outside this block).
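Putting both steps together, a minimal sketch could look like the following. The function name fetch_and_extract and its parameters are invented for this example, and it assumes ftp is an FTP object that is already connected, logged in and in the right directory:

```python
import zipfile


def fetch_and_extract(ftp, filename, temp_path, local_path):
    """Sketch: download `filename` via `ftp`, then unzip it into `local_path`."""
    # The with block guarantees the file is flushed and closed
    # before anything tries to read it back.
    with open(temp_path, 'wb') as f:
        ftp.retrbinary("RETR " + filename, f.write)

    # Only here, outside the with block, is the archive complete on disk.
    with zipfile.ZipFile(temp_path, 'r') as zip_ref:
        zip_ref.extractall(local_path)
        return zip_ref.namelist()
```

With the names from the question, the call would be fetch_and_extract(ftp, filename, tempfile, localpath), followed by ftp.quit().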
Upvotes: 2