Reputation: 399
I'm trying to write a program to test data transfer speeds for various-sized packets in parallel. I noticed something odd, though, that the size of the packet seemed to have no effect on transfer time according to my program, whereas the Unix ping
binary would time out on some of the packet sizes I'm using. I was sending 4 packets containing the string 'testquest' and one that was just 2000 bytes set to 0. However, when I printed the results, they all contained 'testquest' (and were far shorter than 2000 bytes). The only thing I can conclude is that these sockets are somehow all receiving the same packet, which would explain how they all had the same rtt.
I made this MCVE to illustrate the issue (you can ignore the 'checksum' function, it's included for completeness but I know from experience that it works):
#!/usr/bin/env python3
import socket
import struct
import time
from multiprocessing.pool import ThreadPool as Pool
from sys import argv, byteorder
def calculate_checksum(pkt):
"""
Implementation of the "Internet Checksum" specified in RFC 1071 (https://tools.ieft.org/html/rfc1071)
Ideally this would act on the string as a series of 16-bit ints (host
packed), but this works.
Network data is big-endian, hosts are typically little-endian,
which makes this much more tedious than it needs to be.
"""
countTo = len(pkt) // 2 * 2
total, count = 0, 0
# Handle bytes in pairs (decoding as short ints)
loByte, hiByte = 0, 0
while count < countTo:
if (byteorder == "little"):
loByte = pkt[count]
hiByte = pkt[count + 1]
else:
loByte = pkt[count + 1]
hiByte = pkt[count]
total += hiByte * 256 + loByte
count += 2
# Handle last byte if applicable (odd-number of bytes)
# Endianness should be irrelevant in this case
if countTo < len(pkt): # Check for odd length
total += pkt[len(pkt) - 1]
total &= 0xffffffff # Truncate sum to 32 bits (a variance from ping.c, which
# uses signed ints, but overflow is unlikely in ping)
total = (total >> 16) + (total & 0xffff) # Add high 16 bits to low 16 bits
total += (total >> 16) # Add carry from above (if any)
return socket.htons((~total) & 0xffff)
def ping(args):
sock, payload = args[0], args[1]
header = struct.pack("!BBH", 8, 0, 0)
checksum = calculate_checksum(header+payload)
header = struct.pack("!BBH", 8, 0, checksum)
timestamp = time.time()
sock.send(header+payload)
try:
response = sock.recv(20+len(payload))
except socket.timeout:
return 0
return (len(response), (time.time() - timestamp) * 1000)
host = argv[1] # A host that doesn't respond to ping packets > 1500B
# 1 is ICMP protocol number
sockets = [socket.socket(socket.AF_INET, socket.SOCK_RAW, proto=1) for i in range(12)]
for i, sock in enumerate(sockets):
sock.settimeout(0.1)
sock.bind(("0.0.0.0", i))
sock.connect((host, 1)) # Port number should never matter for ICMP
args = [(sockets[i], bytes(2**i)) for i in range(12)]
for arg in args:
print(ping(arg))
arg[0].close()
This actually shows me something more troubling - it seems that the rtt is actually decreasing with increasing packet size! Calling this program (as root, to get socket permissions) outputs:
0
0
(24, 15.784025192260742)
(28, 0.04601478576660156)
(28, 0.025033950805664062)
(28, 0.033855438232421875)
(28, 0.03528594970703125)
(28, 0.04887580871582031)
(28, 0.05316734313964844)
(28, 0.03790855407714844)
(28, 0.0209808349609375)
(28, 0.024080276489257812)
but now notice what happens when I try to send a packet of size 2048 using ping
:
user@mycomputer ~/src/connvitals $ time ping -c1 -s2048 $box
PING <hostname redacted> (<IP address redacted>): 2048 data bytes
--- <hostname redacted> ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
real 0m11.018s
user 0m0.005s
sys 0m0.008s
Not only is the packet dropped, but it takes 11 seconds to do so! So why - if my timeout is set to 100ms - is this packet getting a "successful" response from my python script in only ~0.04ms??
Thank you in advance for any help you can provide.
I just checked again, and it seems that it's multiple sockets that are the problem, and the threading seems to have nothing to do with it. I get the same issue when I ping with each socket - then immediately close it - sequentially.
Upvotes: 4
Views: 285
Reputation: 10985
All your sockets are identical, and all bound to the same host. There simply isn't any information in the packet for the kernel to know which socket to go to, and raw(7) seems to imply all sockets will receive them.
You're probably getting all the responses in all the threads, meaning you're getting 12 times as many responses per thread as you're expecting.
Upvotes: 3