Reputation: 768
I'm using Python 2.7's ftplib to send files to an FTP server; under the hood it uses socket.sendall. The function of interest is below:
    def storbinary(self, cmd, fp, blocksize=8192, callback=None, rest=None):
        """Store a file in binary mode.  A new port is created for you.

        Args:
          cmd: A STOR command.
          fp: A file-like object with a read(num_bytes) method.
          blocksize: The maximum data size to read from fp and send over
                     the connection at once.  [default: 8192]
          callback: An optional single parameter callable that is called
                    on each block of data after it is sent.  [default: None]
          rest: Passed to transfercmd().  [default: None]

        Returns:
          The response code.
        """
        self.voidcmd('TYPE I')
        conn = self.transfercmd(cmd, rest)
        while 1:
            buf = fp.read(blocksize)
            if not buf: break
            conn.sendall(buf)
            if callback: callback(buf)
        conn.close()
        return self.voidresp()
I am trying to choose the optimal block size, or at least understand the things affecting it. The code is currently running on a local gigabit network, with a 0.2ms ping time to the FTP server (yes, 0.2ms, not 0.2s), on Ubuntu with kernel 3.2. I have a decent understanding of TCP window scaling and send/receive/congestion windows. I am sending 2GB files across this network and have found, in practice, that the transfer speed increases with block size, up to 533Mb/s with a 256KB block size. For reference, a block size of 64KB gives around 330Mb/s.
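As a rough reference point (a calculation, not a measurement), the bandwidth-delay product for the link described above can be worked out directly. The sketch below assumes "gigabit" means exactly 10**9 bits per second and treats the 0.2ms ping time as the round-trip time; both are simplifications, and the variable names are only illustrative.

```python
# Figures taken from the question; treated as exact for the sake of the estimate.
link_bits_per_sec = 1e9   # assumption: "gigabit" == 10**9 bits per second
rtt_seconds = 0.2e-3      # 0.2 ms ping time, used as the round-trip time

# Bandwidth-delay product: how much data must be in flight to keep the link full.
bdp_bits = link_bits_per_sec * rtt_seconds
bdp_bytes = bdp_bits / 8

print("bandwidth-delay product: %.0f bytes (%.1f KiB)"
      % (bdp_bytes, bdp_bytes / 1024.0))
```

This comes to about 25,000 bytes, so the amount of in-flight data needed to fill the link is smaller than either the 64KB or the 256KB block size.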
I'm not complaining about those speeds by any means, but I want to understand why a 256KB block size is optimal. Everything I have found so far indicates that ~64KB is the largest chunk size needed. I have timed the subcomponents of the storbinary function to confirm that the time spent sending the file actually decreases as the chunk size increases up to 256KB (as opposed to only the time spent reading the file).
My code to transfer these 2GB files will eventually be run on many networks (though with the same OS, kernel, and Python version). I am worried about 256KB being suboptimal on other networks, and I am curious as to why a 256KB block size gives the fastest transfer speed. Any insight would be greatly appreciated.
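One way to separate per-call overhead from network effects is to repeat the same send loop over a loopback TCP connection and sweep the block size. The following is only a sketch under stated assumptions: `time_sendall` and `_drain` are hypothetical helper names, the receiver simply discards data, and loopback numbers say nothing about a real LAN. They only show how the cost of the `sendall` loop itself changes with block size on one machine.

```python
import socket
import threading
import time


def _drain(server_sock, received):
    """Accept one connection, discard its data, and record the byte count."""
    conn, _ = server_sock.accept()
    total = 0
    try:
        while True:
            data = conn.recv(1 << 20)
            if not data:
                break
            total += len(data)
    finally:
        conn.close()
    received.append(total)


def time_sendall(total_bytes, blocksize):
    """Send total_bytes over loopback in blocksize chunks.

    Returns (seconds spent in the send loop, bytes seen by the receiver).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    received = []
    receiver = threading.Thread(target=_drain, args=(server, received))
    receiver.daemon = True
    receiver.start()

    client = socket.create_connection(server.getsockname())
    payload = b"\0" * blocksize
    sent = 0
    start = time.time()
    while sent < total_bytes:
        chunk = payload[:min(blocksize, total_bytes - sent)]
        client.sendall(chunk)
        sent += len(chunk)
    elapsed = time.time() - start

    client.close()      # signals end-of-stream to the receiver
    receiver.join()
    server.close()
    return elapsed, received[0]


if __name__ == "__main__":
    total = 32 * 1024 * 1024  # small test size; the real files are 2GB
    for kib in (8, 64, 256, 1024):
        seconds, _ = time_sendall(total, kib * 1024)
        print("%5d KiB blocks: %.3f s" % (kib, seconds))
```

Note that the timer stops when the last `sendall` returns, which is when the data has been handed to the kernel, not when the receiver has read it; the same is true of timing `sendall` inside `storbinary`.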
Edit: For those wondering how I timed the socket.sendall call on its own, here is the modified version of the function I used. Going from 64KB chunks to 256KB chunks brought the total read time from ~19s to ~14s and the total send time from ~18s to ~10s.
    import time  # needed at module level for the timing calls below

    def storbinary(self, cmd, fp, blocksize=8192, callback=None, rest=None):
        """Store a file in binary mode.  A new port is created for you.

        Args:
          cmd: A STOR command.
          fp: A file-like object with a read(num_bytes) method.
          blocksize: The maximum data size to read from fp and send over
                     the connection at once.  [default: 8192]
          callback: An optional single parameter callable that is called
                    on each block of data after it is sent.  [default: None]
          rest: Passed to transfercmd().  [default: None]

        Returns:
          The response code.
        """
        self.voidcmd('TYPE I')
        conn = self.transfercmd(cmd, rest)
        totalTime = 0          # cumulative time spent in fp.read()
        totalSendTime = 0      # cumulative time spent in conn.sendall()
        totalCallbackTime = 0  # cumulative time spent in the callback
        while 1:
            # Time the file read.
            startTime = time.time()
            buf = fp.read(blocksize)
            endTime = time.time()
            if not buf: break
            totalTime += (endTime - startTime)
            # Time the socket send.
            startTime = time.time()
            conn.sendall(buf)
            endTime = time.time()
            totalSendTime += (endTime - startTime)
            # Time the optional callback.
            startTime = time.time()
            if callback: callback(buf)
            endTime = time.time()
            totalCallbackTime += (endTime - startTime)
        # Report the totals once the whole file has been sent.
        print 'Total read time was %s'%str(totalTime)
        print 'Total send time was %s'%str(totalSendTime)
        print 'Total callback time was %s'%str(totalCallbackTime)
        conn.close()
        return self.voidresp()
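To put the totals quoted above on a per-call basis, they can be divided by the number of blocks. This is a small arithmetic sketch, assuming "2GB" means 2 GiB and taking the rounded ~19s/~18s and ~14s/~10s figures at face value; `blocks_for` is a hypothetical helper name.

```python
FILE_BYTES = 2 * 1024 ** 3  # assumption: "2GB" read as 2 GiB

# blocksize -> (total read seconds, total send seconds), rounded figures from above
timings = {
    64 * 1024: (19.0, 18.0),
    256 * 1024: (14.0, 10.0),
}


def blocks_for(blocksize):
    """Number of full blocks, i.e. loop iterations, needed for the file."""
    return FILE_BYTES // blocksize


for blocksize in sorted(timings):
    read_s, send_s = timings[blocksize]
    blocks = blocks_for(blocksize)
    print("%4d KiB: %6d blocks, %.3f ms per read, %.3f ms per sendall" % (
        blocksize // 1024, blocks,
        1e3 * read_s / blocks, 1e3 * send_s / blocks))
```

The 64KB run makes 32768 passes through the loop and the 256KB run makes 8192, so each `sendall` call takes longer with the larger block even though the total send time is lower.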
Upvotes: 3
Views: 2953
Reputation: 56
The bits in the ftp are based on datagrams, so they are sent in packets of a particular size through a fixed path. To send all the data, you either need to determine the size of the complete file up front and expect the same amount on the FTP end, or, better, add an ending delimiter at the end of the file. Then, when you loop over the file's contents on the FTP end and find the ending delimiter, you stop expecting more data for that file from the same client. Keep the nominal size of the data transferred in a single send at about 1024, which is the preferred size for various reasons (a Google search will turn them up easily).
Upvotes: 1