user3339161

Reputation: 1

Parsing packets from TCP stream

I often write simple Python TCP servers that respond to a request after parsing a length-prefixed packet. Assuming the socket has been set up, the receive loop usually looks like this:

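import struct

def tcp_server_loop():
    msg = b''      # bytes received but not yet parsed
    msg_len = 0    # total length of the current message; 0 means header not parsed yet
    while True:
        data = sock.recv(4096)
        if not data:               # empty read: the peer closed the connection
            break
        msg += data
        while True:                # one recv() may complete several messages
            if msg_len == 0 and len(msg) >= 4:
                msg_len, = struct.unpack_from("!I", msg)
            if msg_len == 0 or len(msg) < msg_len:
                break              # header or body still incomplete
            protocol.parse_packet(msg[:msg_len])
            msg = msg[msg_len:]
            msg_len = 0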

This works and has served me well many times, but I've always been irked by the string appending in msg += sock.recv(4096). For small packets this isn't too bad, since allocating new storage for small strings is cheap. But for large packets (MBs), each append can copy the entire accumulated buffer, so a lot of copying goes on behind the scenes in Python's string implementation.

In C, or a similar language, the obvious data structure is a ring buffer sized to the largest packet you expect, but I haven't found a comparable Python implementation. Can someone improve on my code above? How do you implement these kinds of servers?

Upvotes: 0

Views: 3002

Answers (1)

cklin

Reputation: 910

A quick suggestion first: you may wish to rename packet_size to msg_len for clarity. What you are trying to parse out from the TCP stream is an application-level protocol message, not a TCP segment (aka TCP packet).

But to address your question: a more efficient approach is to allocate a second, fixed-size bytearray buffer of length msg_len once you have received the message header, and to store the data you subsequently read in that buffer.
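As a rough sketch of that idea (not a drop-in replacement): the helper names recv_exactly and read_message and the max_len cap are made up for illustration, and I'm assuming, as in your loop, that the 4-byte length counts the header itself. socket.recv_into writes straight into the preallocated bytearray through a memoryview, so no intermediate strings are built:

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix, as in the question

def recv_exactly(sock, view):
    """Fill the writable memoryview `view` from `sock`.

    Returns False if the peer closed before sending any byte,
    raises ConnectionError if it closed part-way through.
    """
    pos = 0
    while pos < len(view):
        n = sock.recv_into(view[pos:])  # slicing a memoryview does not copy
        if n == 0:
            if pos == 0:
                return False
            raise ConnectionError("peer closed the connection mid-message")
        pos += n
    return True

def read_message(sock, max_len=16 * 1024 * 1024):
    """Return one whole message (header included) as a bytearray, or None on EOF.

    Assumption: the length field is the total message length, header included.
    max_len is a hypothetical sanity cap, not part of the original protocol.
    """
    header = bytearray(HEADER.size)
    if not recv_exactly(sock, memoryview(header)):
        return None
    (msg_len,) = HEADER.unpack(header)
    if not HEADER.size <= msg_len <= max_len:
        raise ValueError("implausible message length: %d" % msg_len)
    buf = bytearray(msg_len)        # allocated once per message
    buf[:HEADER.size] = header
    if not recv_exactly(sock, memoryview(buf)[HEADER.size:]):
        raise ConnectionError("peer closed the connection mid-message")
    return buf
```

The server loop then reduces to calling read_message(sock) until it returns None and passing each result to protocol.parse_packet. Because each message is read to its exact length, nothing is left over to slice off and carry into the next iteration.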

Upvotes: 1
