user10571712

Socket Programming Python: How to make sure entire message is received?

I am using Python 3.x and the socket module. The server runs on an IPv4 address and uses TCP. I read some tutorials on how to send and receive data. For the server or client to make sure the entire message was sent, you can simply check whether the amount of data sent equals the size of the message:

def mysend(self, msg):
    totalsent = 0
    while totalsent < MSGLEN:
        sent = self.sock.send(msg[totalsent:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        totalsent = totalsent + sent

Source: https://docs.python.org/3/howto/sockets.html#socket-howto

And for the client to make sure the entire response has been received, this tutorial recommends adding the size of the response at the beginning of the response.

My questions:

  1. How can I make sure I receive the first part of the message, which indicates the size of the message (assuming my message contains 1000 characters, I would need four characters to indicate the size)?
  2. Why can't I just add a specified symbol like '<' at the beginning of the message and '>' at the end, so I know where it starts and ends?

Edit:

  3. When I use sock.recv(1024) and my messages are only 500 to 1000 characters long, doesn't that guarantee I receive all of them?

Upvotes: 6

Views: 8192

Answers (2)

Barmar

Reputation: 781300

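For sending, you mostly need that loop if you've put the socket in non-blocking mode. If the socket is in blocking mode (the default), sock.send() will normally not return until it has sent the entire message or gets an error, although the Python documentation still leaves it to the application to check the returned byte count.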

However, for receiving there's no equivalent, because TCP doesn't include message boundaries in the protocol. sock.recv() returns as soon as any data is available.

  1. Call sock.recv() in a loop until you get everything you need. Just as your sending routine sends a shorter slice on each iteration, you can reduce the recv() argument by the amount you've read so far. It can look like this:

def myrecv(self, size):
    buffer = b''  # recv() returns bytes in Python 3
    while size > 0:
        msg = self.sock.recv(size)
        if not msg:
            # An empty result means the peer closed the connection
            raise RuntimeError("socket connection broken")
        buffer += msg
        size -= len(msg)
    return buffer

If you put a 4-byte length before each message (here, four ASCII digits), you can do something like:

# inside another method of the same class
msgsize = int(self.myrecv(4))
message = self.myrecv(msgsize)

  2. You could do that, but it makes things more complicated. You need to either read one character at a time, checking for the delimiters, or implement a buffer that holds data you've read but haven't yet returned to the caller because it lies past the end of the current message. Also, if the data can contain the delimiters, you need a way to escape them.

  3. No, recv(1024) can return as soon as it gets any data, which may be less than the size of the message that was sent. If it were guaranteed to return 1024 bytes, it would hang when the sender sent only 500, because it would be waiting for the remaining 524.
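
To tie points 1 and 3 together, here is a minimal sketch of length-prefixed framing. It assumes a 4-byte binary length header in network byte order (packed with struct) rather than four ASCII digits, and the names recv_exactly, send_msg and recv_msg are made up for illustration. socket.socketpair() only stands in for a real client/server connection.

```python
import socket
import struct

# Assumed wire format: 4-byte unsigned length (network byte order), then the payload.
HEADER = struct.Struct("!I")

def recv_exactly(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)

def send_msg(sock, payload):
    """Send the length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock):
    """Read the header first, then exactly as many bytes as it announces."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)

if __name__ == "__main__":
    a, b = socket.socketpair()
    send_msg(a, b"x" * 1000)
    send_msg(a, b"second")
    first = recv_msg(b)
    second = recv_msg(b)
    print(len(first), second)
    a.close()
    b.close()
```

Note that the header itself goes through the same receive loop as the body, so even the 4 length bytes are never assumed to arrive in a single recv() call.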

Upvotes: 2

President James K. Polk

Reputation: 41973

First of all, to send all the bytes you don't need a loop, because Python sockets provide a simple method for this: socket.sendall().
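
As a rough illustration of sendall(), here is a small sketch. The helper names send_message, read_until_eof and demo are hypothetical, and socket.socketpair() plus a writer thread only stand in for a real client and server.

```python
import socket
import threading

def send_message(sock, msg):
    # sendall() keeps sending until every byte has been handed to the OS,
    # or raises an exception; it returns None on success.
    sock.sendall(msg)

def read_until_eof(sock):
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(65536)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)

def demo(payload):
    """Send `payload` from a writer thread and return what the reader got."""
    left, right = socket.socketpair()

    def writer():
        send_message(left, payload)
        left.close()  # closing signals EOF to the reader

    thread = threading.Thread(target=writer)
    thread.start()
    data = read_until_eof(right)
    thread.join()
    right.close()
    return data

if __name__ == "__main__":
    payload = b"a" * 1_000_000
    print(demo(payload) == payload)
```

The payload is deliberately larger than a typical socket buffer, so the receiver sees it in many pieces even though the sender made a single sendall() call.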

Now to your questions:

  1. Yes, even to receive just 4 bytes you should have a receive loop that calls recv() on the socket until 4 bytes are read.

  2. You can, if you can guarantee that such characters will not appear in the message itself. However, you'd still need to search every character that you read in for the magic delimiter, so it seems inferior to simply prefixing the message body with a length.

  3. A call to recv(n) is only guaranteed to return at most n bytes, not exactly n bytes.
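
For comparison with point 2, this is roughly what the delimiter approach needs: a buffer that keeps whatever was read past the end of the current message. It is only a sketch; the class name DelimitedReader is made up, it assumes a single trailing delimiter (a newline by default) instead of the '<' and '>' pair from the question, and it assumes the delimiter never appears inside a message.

```python
import socket

class DelimitedReader:
    """Hypothetical helper: splits a byte stream into delimiter-terminated messages."""

    def __init__(self, sock, delimiter=b"\n"):
        self.sock = sock
        self.delimiter = delimiter
        self.buffer = b""  # bytes received but not yet returned to the caller

    def read_message(self):
        while True:
            # Look for a complete message in what we already have.
            head, sep, tail = self.buffer.partition(self.delimiter)
            if sep:
                self.buffer = tail  # keep the leftover for the next call
                return head
            data = self.sock.recv(4096)
            if not data:
                raise ConnectionError("socket closed before a delimiter was seen")
            self.buffer += data

if __name__ == "__main__":
    a, b = socket.socketpair()
    a.sendall(b"first\nsec")  # the second message is split across two sends
    a.sendall(b"ond\n")
    reader = DelimitedReader(b)
    first = reader.read_message()
    second = reader.read_message()
    print(first, second)
    a.close()
    b.close()
```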

Here are three different recvall() methods to compare:

def recvall(sock, size):
    received_chunks = []
    buf_size = 4096
    remaining = size
    while remaining > 0:
        received = sock.recv(min(remaining, buf_size))
        if not received:
            raise Exception('unexpected EOF')
        received_chunks.append(received)
        remaining -= len(received)
    return b''.join(received_chunks)

and the much shorter

import socket

def recvall2(sock, size):
    return sock.recv(size, socket.MSG_WAITALL)

and finally another version that is a little shorter than the first but lacks a couple of features (it does not check for EOF and does not cap the size of each recv() call):

def recvall3(sock, size):
    result = b''
    remaining = size
    while remaining > 0:
        data = sock.recv(remaining)
        result += data
        remaining -= len(data)
    return result

The second one is nice and short, but it relies on the flag socket.MSG_WAITALL, which I do not believe is guaranteed to exist on every platform. The first and third should work everywhere. I haven't benchmarked any of them to compare and contrast.
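
If you want the flag where it exists without depending on it, one possible compromise is sketched below. The name recvall_portable is made up. It assumes that falling back to flags=0 is acceptable when the constant is missing, and it keeps the loop because even with MSG_WAITALL a recv() call can return fewer bytes, for example when the peer closes the connection.

```python
import socket

def recvall_portable(sock, size):
    """Hypothetical helper: use MSG_WAITALL when available, but still loop."""
    # Fall back to no flags if this platform's socket module lacks the constant.
    flags = getattr(socket, "MSG_WAITALL", 0)
    chunks = []
    remaining = size
    while remaining > 0:
        data = sock.recv(remaining, flags)
        if not data:
            raise ConnectionError("unexpected EOF")
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)

if __name__ == "__main__":
    a, b = socket.socketpair()
    a.sendall(b"abcdefgh")
    head = recvall_portable(b, 3)
    rest = recvall_portable(b, 5)
    print(head, rest)
    a.close()
    b.close()
```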

Upvotes: 5
