Reputation: 4950
I have been playing around with Python's ftplib and am starting to think that it is slow compared to running a script file in the DOS shell. I run sessions where I download thousands of data files (I have over 8 million right now). My observation is that the download process seems to take five to ten times as long in Python as it does with the ftp commands in the DOS shell.
Since I am not asking anyone to fix my code, I have not included any. I am more interested in whether my observation is valid or whether I need to tinker more with the arguments.
Upvotes: 3
Views: 9985
Reputation: 31
A bigger blocksize is not always optimal. For example, uploading the same 167 MB file over a wired network to the same FTP server, I got the following times in seconds for various blocksizes:
Blocksize   Time
   102400     40
    51200     30
    25600     28
    32768     30
    24576     31
    19200     34
    16384     61
    12800    144
In this configuration the optimum was around 32768 (4x8192).
But if I used wireless instead, I got these times:
Blocksize   Time
   204800     78
   102400     76
    51200     79
    25600     76
    32768     89
    24576     86
    19200     75
    16384    166
    12800    178
  default    223
In this case there were several optimum blocksize values, all different from 32768.
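Since the optimum varies by link, one option is to time a few transfers and keep the fastest setting. A minimal sketch (the `best_blocksize` helper is hypothetical; the sample dict reuses the wired-network timings from the table above):

```python
def best_blocksize(timings):
    # timings maps blocksize (bytes) -> measured transfer time (seconds);
    # return the blocksize whose transfer was fastest.
    return min(timings, key=timings.get)

# Sample measurements, as in the wired-network table above.
wired = {102400: 40, 51200: 30, 25600: 28, 32768: 30,
         24576: 31, 19200: 34, 16384: 61, 12800: 144}
print(best_blocksize(wired))  # -> 25600
```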
Upvotes: 3
Reputation: 597
Define the blocksize argument along with storbinary on the FTP connection; you can get a 1.5-3.0x faster transfer than with FileZilla :)
from ftplib import FTP

USER = "Your_user_id"
PASS = "Your_password"
PORT = 21
SERVER = 'ftp.billionuploads.com'  # use your FTP server name here

ftp = FTP()
ftp.connect(SERVER, PORT)
ftp.login(USER, PASS)
try:
    f = open(r'C:\Python27\1.jpg', 'rb')
    try:
        # store the file using a 100 KB blocksize
        ftp.storbinary('STOR 1.jpg', f, 102400)
    finally:
        f.close()
    ftp.quit()
    print "File transferred"
except Exception:
    print "Error in file transfer"
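The same blocksize argument exists on the download side: retrbinary's third parameter is the blocksize. A hedged sketch (the helper name, server, and paths are placeholders, not from the answer above):

```python
from ftplib import FTP

def download(ftp, remote_name, local_path, blocksize=102400):
    # retrbinary invokes the callback once per received block, so a
    # larger blocksize means fewer callback/write calls per file.
    with open(local_path, 'wb') as f:
        ftp.retrbinary('RETR ' + remote_name, f.write, blocksize)

# Usage (placeholder credentials):
# ftp = FTP('ftp.example.com')
# ftp.login('user', 'password')
# download(ftp, '1.jpg', r'C:\Python27\1.jpg')
```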
Upvotes: 4
Reputation: 403
Skip ftplib and execute ftp via the MS-DOS shell:
os.system('FTP -v -i -s:C:\\ndfd\\wgrib2\\ftpscript.txt')
inside ftpscript.txt
open example.com
username
password
!:--- FTP commands below here ---
lcd c:\MyLocalDirectory
cd public_html/MyRemoteDirectory
binary
mput "*.*"
disconnect
bye
Upvotes: 2
Reputation: 414
import ftplib
import time
ftp = ftplib.FTP("localhost", "mph")
t0 = time.time()
with open('big.gz.sav', 'wb') as f:
    ftp.retrbinary('RETR ' + '/Temp/big.gz', f.write)
t1 = time.time()
ftp.close()
ftp = ftplib.FTP("localhost", "mph")
t2 = time.time()
ftp.retrbinary('RETR ' + '/Temp/big.gz', lambda x: x)
t3 = time.time()
print "saving file: %f to %f: %f delta" % (t0, t1, t1 - t0)
print "not saving file: %f to %f: %f delta" % (t2, t3, t3 - t2)
So, maybe not 10x. But my runs of this test that save the file all take over 160 s on a laptop with a 1.8 GHz Core i7 and 8 GB of RAM (which should be overkill) running Windows 7. A native client does it in 100 s. Without the file save I'm just under 70 s.
I came to this question because I've seen slow performance with ftplib on a Mac (I'll rerun this test once I have access to that machine again). While going async with the writes might be a good idea in this case, on a real network I suspect the gain would be far smaller.
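Short of going async, one cheap variant is to enlarge the userspace write buffer so the many small retrbinary callbacks hit the OS less often. A sketch (the helper name and the 1 MiB buffer size are assumptions to tune, not measured values):

```python
def open_buffered(path, bufsize=1 << 20):
    # A 1 MiB userspace buffer coalesces the per-block f.write calls
    # issued by retrbinary into far fewer actual OS writes.
    return open(path, 'wb', buffering=bufsize)

# with open_buffered('big.gz.sav') as f:
#     ftp.retrbinary('RETR /Temp/big.gz', f.write)
```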
Upvotes: 2
Reputation: 223112
The speed problem is probably in your code; ftplib is not ten times slower.
Upvotes: 4
Reputation: 328770
ftplib is implemented in Python, whereas your "DOS script" actually calls a compiled command. Executing that command is probably faster than interpreting Python code. If ftplib is too slow for you, I suggest calling the DOS command from Python using the subprocess module.
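For example, the os.system call from the earlier answer could go through subprocess instead. A sketch (the script path is a placeholder, and ftp.exe only exists on Windows):

```python
import subprocess

def ftp_command(script_path):
    # -i turns off interactive prompting for mget/mput;
    # -s: makes ftp.exe read its commands from the script file.
    return ['ftp', '-v', '-i', '-s:' + script_path]

# Windows only; uncomment to actually run ftp.exe:
# subprocess.call(ftp_command(r'C:\ndfd\wgrib2\ftpscript.txt'))
```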
Upvotes: 3
Reputation: 597311
ftplib may not be the cleanest Python API, but I don't think it is so bad that it runs ten times slower than a DOS shell script.
Unless you provide some code to compare, e.g. your shell script and your Python snippet to batch-download 5000 files, I can't see how we can help you.
Upvotes: 3