R.Rex

Retransmissions/out-of-order packets with large (~1 MB) TCP (SOCK_STREAM) writes/reads over the localhost loopback

I'm seeing retransmissions/out-of-order packets when using the loopback interface and have been googling for days to try to determine if I should be surprised by this (which I am) or if anyone else is seeing it, but I haven't found any answers. Apologies if I'm just missing something. Someone mentioned a loopback specification, but I can't find it.

Here's one thing that I'm doing (tried on several modern Linux kernels, including 5.5.4):

sudo tcpdump -s78 -wt.tcpdump -ilo port 7001 & tcpdump_pid=$!; \
taskset -c 1 ./tcp_loopback.py -s -b1048576 & sleep .5; taskset -c 2 ./tcp_loopback.py -c -b1048576 --count=8192; \
sudo kill $tcpdump_pid; \
tshark -r t.tcpdump | egrep -i 'retrans|out.of.order' | tail

The tcp_loopback.py script is available at home.fnal.gov/~ron/tcp_loopback.py

The heart of the server portion (-s) is:

    sock = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
    sock.setsockopt( socket.SOL_SOCKET, socket.SO_REUSEADDR, 1 )
    sock.bind( ('127.0.0.1', port) )
    sock.listen( 4 )
    sockconn,address = sock.accept()
    while 1:
        data = sockconn.recv(bs)
        if opargs['-v']=='': print('received: '+str(len(data)))
        if len(data) == 0:
            if opargs['-v']=='': print('0 data, closing')
            break

The heart of the client portion (-c) is:

    sock = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
    sock.connect( ('127.0.0.1',port) )
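    payload = b'*' * bs              # bytes literal, required by Python 3 sockets
    for xx in range(cnt):
        sock.sendall(payload)        # send() may write only part of the buffer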

The probability of retransmission seems to increase with larger (e.g. 2 MB, 4 MB) writes/reads.

I've read (e.g. Documentation/networking/scaling.rst) about out-of-order issues related to scheduling on different cores, hence the use of "taskset" above.
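
For reference, the same pinning can be done from inside a Python process instead of through "taskset". This is only a minimal sketch, assuming Linux and Python 3.3 or later (where os.sched_setaffinity exists); the helper name pin_to_cpu is hypothetical and is not part of tcp_loopback.py.

```python
import os

def pin_to_cpu(cpu):
    """Pin the calling process to a single CPU, similar to `taskset -c <cpu>`."""
    allowed = os.sched_getaffinity(0)      # CPUs this process may currently use
    if cpu not in allowed:
        raise ValueError("CPU %d is not in the allowed set %s" % (cpu, sorted(allowed)))
    os.sched_setaffinity(0, {cpu})         # pid 0 means "the calling process"
    return os.sched_getaffinity(0)         # report the affinity now in effect

# Mirroring the command line above, the server side would call pin_to_cpu(1)
# and the client side pin_to_cpu(2), provided those CPUs are available.
```

This only constrains where the user-space process is scheduled; it is an equivalent of the taskset calls, not an additional fix.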

Is there a way to prevent this from happening (while still using large writes/reads at high rate)?

With loopback, I don't really know whether I'm looking at the send path transmitting segments out of order or the receive path receiving them out of order. My ultimate goal is to establish a baseline for low-latency inter-node transmission in a 100 Gb/s, high-congestion (many-to-one) environment. I developed an application that uses the "debug socket" to get retransmission information, and I was surprised to see retransmissions on localhost.
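
One way to cross-check the capture without tcpdump is to ask the kernel for its own per-connection counter through the TCP_INFO socket option. The sketch below is not the application mentioned above, just an illustration. It assumes Linux, and it assumes the struct tcp_info layout from linux/tcp.h, where tcpi_total_retrans follows 8 one-byte fields and 23 32-bit fields; the helper name total_retransmits is made up, and the offset should be verified against the kernel headers in use.

```python
import socket
import struct

# ASSUMPTION: Linux struct tcp_info layout. tcpi_total_retrans comes after
# 8 one-byte fields and 23 __u32 fields, i.e. at byte offset 8 + 23*4 = 100.
TOTAL_RETRANS_OFFSET = 100
TCP_INFO_BYTES = TOTAL_RETRANS_OFFSET + 4   # read just far enough to cover it

def total_retransmits(sock):
    """Return the kernel's total retransmitted-segment count for a TCP socket."""
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, TCP_INFO_BYTES)
    (count,) = struct.unpack_from("=I", info, TOTAL_RETRANS_OFFSET)
    return count
```

Calling total_retransmits(sock) on the client socket before and after the send loop would show whether the sender itself believes it retransmitted anything, independent of how tshark labels the packets.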

Can anyone please help me understand what's happening and if there are any knobs (e.g. sysctl) to turn to eliminate retransmission while still maximizing data rate?

Thanks, Ron

P.S. I also eventually see the issue when I do this:

while true;do
 sudo rm -f /tmp/t.tcpdump
 sudo tcpdump -s78 -w/tmp/t.tcpdump -ilo port 7001 & tcpdump_pid=$!
 ncat -4l localhost 7001 > /dev/null & ncat_pid=$!
 sleep .5
 dd if=/dev/zero bs=1048576 count=4096 | ncat localhost 7001
 sudo kill $tcpdump_pid
 tshark -r /tmp/t.tcpdump | egrep -i 'retrans|out.of.order' && break
done

Example output:

/home/ron/notes
ron@ronlap77 :^) sudo tcpdump -s78 -wt.tcpdump -ilo port 7001 & tcpdump_pid=$!; \
> taskset -c 1 ./tcp_loopback.py -s -b1048576 & sleep .5; taskset -c 2 ./tcp_loopback.py -c -b1048576 --count=8192; \
> sudo kill $tcpdump_pid; \
> tshark -r t.tcpdump | egrep -i 'retrans|out.of.order' | tail
[1] 29066
[2] 29067
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 78 bytes
191379 packets captured
383230 packets received by filter
0 packets dropped by kernel
[2]+  Done                    taskset -c 1 ./tcp_loopback.py -s -b1048576
189947   1.092534    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203412074 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
189948   1.092535    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203477557 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
189949   1.092536    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203543040 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
189950   1.092536    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203608523 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
189951   1.092537    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203674006 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
189962   1.092594    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Spurious Retransmission] 46988 → 7001 [ACK] Seq=4203674006 Ack=1 Win=65536 Len=65483 TSval=2693245609 TSecr=2693245609
190115   1.093456    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4211997131 Ack=1 Win=65536 Len=65483 TSval=2693245610 TSecr=2693245610
190116   1.093457    127.0.0.1 → 127.0.0.1    TCP 65549 [TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4212062614 Ack=1 Win=65536 Len=65483 TSval=2693245610 TSecr=2693245610
190117   1.093457    127.0.0.1 → 127.0.0.1    TCP 1762 [TCP Out-Of-Order] 46988 → 7001 [PSH, ACK] Seq=4212128097 Ack=1 Win=65536 Len=1696 TSval=2693245610 TSecr=2693245610
190138   1.093547    127.0.0.1 → 127.0.0.1    TCP 1762 [TCP Out-Of-Order] 46988 → 7001 [PSH, ACK] Seq=4212128097 Ack=1 Win=65536 Len=1696 TSval=2693245610 TSecr=2693245610
[1]+  Done                    sudo tcpdump -s78 -wt.tcpdump -ilo port 7001
--2020-02-25_11:27:17--
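
To tally the flagged packets instead of eyeballing the last ten lines, the tshark text output can be counted per annotation. This is a rough sketch: the function name count_tcp_flags is hypothetical, and it assumes the default one-line-per-packet output of "tshark -r", in which the analysis annotations appear in square brackets beginning with "TCP ". The two sample lines are abbreviated from the output above.

```python
import re
from collections import Counter

# Matches analysis annotations such as "[TCP Out-Of-Order]"; plain flag lists
# like "[ACK]" or "[PSH, ACK]" do not start with "TCP " and are skipped.
ANNOTATION = re.compile(r"\[(TCP [^\]]+)\]")
INTERESTING = re.compile(r"retrans|out.of.order", re.IGNORECASE)

def count_tcp_flags(lines):
    """Count retransmission and out-of-order annotations in tshark text lines."""
    counts = Counter()
    for line in lines:
        for note in ANNOTATION.findall(line):
            if INTERESTING.search(note):
                counts[note] += 1
    return counts

sample = [
    "189951   1.092537    127.0.0.1 → 127.0.0.1    TCP 65549 "
    "[TCP Out-Of-Order] 46988 → 7001 [ACK] Seq=4203674006 Ack=1 Len=65483",
    "189962   1.092594    127.0.0.1 → 127.0.0.1    TCP 65549 "
    "[TCP Spurious Retransmission] 46988 → 7001 [ACK] Seq=4203674006 Ack=1 Len=65483",
]

for note, n in count_tcp_flags(sample).items():
    print(n, note)
```

Feeding it the full output of "tshark -r t.tcpdump" line by line would give a per-run total for each annotation, which makes it easier to compare runs with different write sizes.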
