Reputation: 148910
Context:
It is common that a binary protocol defines frames of a given size. The struct
module is good at parsing that, provided everything has been received in a single buffer.
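To make the context concrete, here is a minimal sketch of parsing a fixed-size header with the struct module once the whole header is in one buffer. The frame layout (a 2-byte message type followed by a 4-byte payload length, network byte order) and the names HEADER and parse_header are assumptions made up for illustration, not part of any real protocol.

```python
import struct

# Hypothetical frame layout: 2-byte message type, 4-byte payload length,
# both unsigned and in network (big-endian) byte order.
HEADER = struct.Struct("!HI")

def parse_header(buf):
    """Unpack a complete header; struct.error is raised if buf has the wrong size."""
    return HEADER.unpack(buf)

# Build one frame in memory and parse it back.
frame = HEADER.pack(7, 3) + b"abc"
msg_type, length = parse_header(frame[:HEADER.size])
payload = frame[HEADER.size:HEADER.size + length]
```

The unpack call only works when exactly HEADER.size bytes are available, which is why a reliable "read exactly n bytes" helper is needed on a stream socket.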
Problem:
TCP sockets are streams. A read from a socket cannot return more bytes than requested, but it can return fewer. So this code is not reliable:
    def readnbytes(sock, n):
        return sock.recv(n)  # can return less than n bytes
The naive workaround:
    def readnbytes(sock, n):
        buff = b''
        while n > 0:
            b = sock.recv(n)
            buff += b
            if len(b) == 0:
                # peer closed the connection or shut down its write side (SHUT_WR)
                raise EOFError
            n -= len(b)
        return buff
may not be efficient, because if we ask for a large number of bytes and the data is very fragmented, we will repeatedly re-allocate a new bytes buffer.
Question:
How is it possible to reliably receive exactly n bytes from a stream socket with no risk of re-allocation?
References:
Those other questions are related, and do give hints, but none give a simple and clear answer:
Upvotes: 8
Views: 6340
Reputation: 27201
A minor addition to @Serge's answer: this version raises an IncompleteReadError (which is a subclass of EOFError). The exception has a partial attribute containing the partially read data.
    import socket
    from asyncio import IncompleteReadError

    def readexactly(sock: socket.socket, num_bytes: int) -> bytes:
        buf = bytearray(num_bytes)
        pos = 0
        while pos < num_bytes:
            n = sock.recv_into(memoryview(buf)[pos:])
            if n == 0:
                raise IncompleteReadError(bytes(buf[:pos]), num_bytes)
            pos += n
        return bytes(buf)
Usage:
    try:
        print(readexactly(sock, 26))
    except IncompleteReadError as e:
        print(f"Only read {len(e.partial)} out of {e.expected} bytes. :(")
        print(e.partial)
Example output upon only reading 5 bytes b"ABCDE":
    Only read 5 out of 26 bytes. :(
    b'ABCDE'
Upvotes: 3
Reputation: 148910
The solution is to use recv_into and a memoryview. Python lets you pre-allocate a mutable bytearray that can be passed to recv_into. You cannot receive data into a slice of the bytearray, because the slice would be a copy, but a memoryview lets you receive multiple fragments into the same bytearray:
    def readnbyte(sock, n):
        buff = bytearray(n)
        pos = 0
        while pos < n:
            cr = sock.recv_into(memoryview(buff)[pos:])
            if cr == 0:
                raise EOFError
            pos += cr
        return buff
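As a quick way to check this behaviour without a real network peer, the following sketch feeds deliberately fragmented data through socket.socketpair(). The helper names read_exactly and send_fragments, the chunk contents, and the sleep interval are assumptions chosen for illustration; the loop itself is the same recv_into approach as above, repeated so the example runs on its own.

```python
import socket
import threading
import time

def read_exactly(sock, n):
    """recv_into loop: fill one pre-allocated bytearray through a memoryview."""
    buf = bytearray(n)
    view = memoryview(buf)
    pos = 0
    while pos < n:
        received = sock.recv_into(view[pos:])
        if received == 0:
            raise EOFError
        pos += received
    return buf

def send_fragments(sock, chunks):
    """Hypothetical sender: writes each chunk separately with a short pause."""
    for chunk in chunks:
        sock.sendall(chunk)
        time.sleep(0.05)  # makes it likely the reader sees separate fragments

left, right = socket.socketpair()
with left, right:
    sender = threading.Thread(
        target=send_fragments, args=(left, [b"123", b"45", b"6789"])
    )
    sender.start()
    first = read_exactly(right, 5)   # spans the first two fragments
    second = read_exactly(right, 4)  # the remaining fragment
    sender.join()
```

However the fragments happen to arrive, first should hold the first five bytes and second the last four, since each call keeps reading until its buffer is full.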
Upvotes: 7
Reputation: 177674
You can use socket.makefile() to wrap the socket in a file-like object. Reads then return exactly the amount requested, unless the peer closes the connection first, in which case the read returns whatever remains. Here's an example:
server.py
    from socket import *

    sock = socket()
    sock.bind(('',5000))
    sock.listen(1)
    with sock:
        client,addr = sock.accept()
        with client, client.makefile() as clientfile:
            while True:
                data = clientfile.read(5)
                if not data:
                    break
                print(data)
client.py
    from socket import *
    import time

    sock = socket()
    sock.connect(('localhost',5000))
    with sock:
        sock.sendall(b'123')
        time.sleep(.5)
        sock.sendall(b'451234')
        time.sleep(.5)
        sock.sendall(b'51234')
Server Output
    12345
    12345
    1234
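Note that makefile() with no arguments opens the wrapper in text mode, which is why the output above shows str values. For a binary protocol, a sketch along the following lines, passing "rb", may be closer to what is needed. It uses socket.socketpair() instead of a separate server and client so it runs in one process; the byte values and the frame size of 5 are arbitrary assumptions.

```python
import socket

left, right = socket.socketpair()
with left, right, right.makefile("rb") as reader:
    # Send 7 bytes in two pieces, then signal end-of-stream to the reader.
    left.sendall(b"\x00\x01\x02")
    left.sendall(b"\x03\x04\x05\x06")
    left.shutdown(socket.SHUT_WR)

    frames = []
    while True:
        data = reader.read(5)  # blocks until 5 bytes arrive or EOF is reached
        if not data:
            break
        frames.append(data)
```

Here the first read returns a full 5-byte frame even though it was sent in two pieces, and the second returns the 2 leftover bytes because the stream ended. The socket must stay in blocking mode for this to work.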
Upvotes: 5